Comparison of the Hansch and Free-Wilson Approaches to Structure

8 Comparison of the Hansch and Free-Wilson Approaches to Structure-Activity Correlation

Downloaded by UNIV OF CINCINNATI on February 18, 2015 | http://pubs.acs.org Publication Date: August 1, 1974 | doi: 10.1021/ba-1972-0114.ch008

PAUL N. CRAIG Craig Chemical Consulting Services, Inc., Ambler, Pa. 19002

The basic principles on which the Hansch multiple parameter method for structure-activity correlation depends are described. These are compared with the basic features of the Free-Wilson method for assigning additivity constants to structural features of related compounds. An example is given for which the two methods of analysis have led to similar structure-activity relationships. Factors which determine the particular method to use in a new situation are discussed. The Free-Wilson method is presented in considerable operational detail with special emphasis on the detection and avoidance of situations which lead to singularity problems in solution of the matrix. Favorable analyses, which result in additivity constants that can be correlated with known physical constants, may lead to predictions for new compounds not covered in the original matrix.


The two methods of structure-activity correlation which have received the most application in the past decade are the Hansch multiple parameter method, or the so-called extrathermodynamic approach, and the Free-Wilson, or additive, model. The basic differences and similarities of these methods are discussed in this presentation.

The Hansch Multiple Parameter Method

Since this entire book has been structured around this method, it will not be discussed in great detail, but the highlights of the method will be summarized. The reader is referred to an excellent review by Hansch (1), which both describes the method and places it in a proper historical perspective.

In Biological Correlations—The Hansch Approach; Van Valkenburg, W.; Advances in Chemistry; American Chemical Society: Washington, DC, 1974.


The multiple parameter approach tests for simple mathematical equations which can relate the biological activities of a series of closely related compounds to one or more physical parameters, which may be measured or calculated for these compounds. The parameters may be used singly or together, or in combination as linear and squared terms which can show parabolic relationships. Many possible combinations of parameters can be considered, and multiple regression analysis is employed to obtain statistical parameters for all combinations of parameters studied. The biological data are assumed to be more variable and less accurately determined than the physical parameters. Hence the biological data are assigned the role of the dependent variable, and the physical parameters are considered as independent variables in the regression. The statistical parameters which result from the regression enable one to reject those relationships which are not statistically significant and to choose from those equations, which do pass standard statistical tests for significance, the ones which best explain the observed biological data. Of course, common sense must still play a role in the evaluation of meaningful equations, but the statistical parameters can be powerful guides, especially in rejecting or questioning the validity of relationships expressed by the mathematical equations.

Typical equations which often are found to relate biological activities with physical parameters include:

$$\log(1/C) = a + b\pi \tag{1}$$

$$\log(1/C) = a + b\pi + c\sigma \tag{2}$$

$$\log(1/C) = a + b\pi - c\pi^{2} + d\sigma \tag{3}$$

$$\log(1/C) = a + b\pi - c\pi^{2} + d\sigma + eE_s \tag{4}$$

The symbols π, σ, and E_s refer to the substituent constants for partition, polar, and steric factors (1). In practice, what is sought is usually the simplest equation which is not improved by addition of further terms. By "improvement" is meant a statistically significant reduction in the overall variance. When such an equation is obtained, substitution of the physical parameters for substituents not yet studied can be made, and the equation leads to a prediction of the biological activity of the unprepared compound.
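The search for the simplest equation that is not improved by further terms can be illustrated with a partial F test between two nested regressions. The following is only a minimal sketch under stated assumptions, not the procedure used in the original work: the π values and activities are synthetic numbers invented for illustration, and the helper names `fit` and `partial_f` are hypothetical.

```python
import numpy as np

def fit(columns, y):
    """Least-squares fit of y on an intercept plus the given columns."""
    design = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid), coef  # residual sum of squares, coefficients

def partial_f(rss_small, rss_big, extra_terms, n, n_params_big):
    """F statistic for the variance reduction gained by the extra terms."""
    return ((rss_small - rss_big) / extra_terms) / (rss_big / (n - n_params_big))

# Synthetic, invented data: 12 hypothetical compounds with a parabolic
# dependence on pi plus a small fixed scatter (not measured values).
pi = np.linspace(-1.0, 2.0, 12)
scatter = 0.05 * np.array([1, -1, 0.5, -0.5] * 3)
log_inv_c = 3.0 + 1.2 * pi - 0.5 * pi**2 + scatter

rss_linear, _ = fit([pi], log_inv_c)            # log 1/C = a + b*pi
rss_parab, coef = fit([pi, pi**2], log_inv_c)   # adds a squared term in pi
f_value = partial_f(rss_linear, rss_parab, 1, len(pi), 3)

print(f"RSS linear = {rss_linear:.3f}, RSS parabolic = {rss_parab:.3f}")
print(f"partial F(1, {len(pi) - 3}) = {f_value:.1f}")
print(f"coefficient of pi^2 = {coef[2]:.3f}")
```

An F value that is large relative to the tabulated critical value for (1, n - 3) degrees of freedom indicates that the squared term gives a statistically significant reduction in variance; otherwise the simpler equation is retained.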

Additional structure-activity information can be obtained when a parabolic relationship in the partition term is observed, provided the sign of the coefficient for the squared term is negative. An optimal value for the partition factor can be obtained by differentiating the equation with respect to π or P; this results in the optimal P value (log P°). This can be a valuable guide in the design of new molecules which may differ considerably in structure from those studied in the regression analysis. If a positive coefficient is obtained for the square of partition term, the equation must be rejected as it implies that any change in partitioning will result in an increased biological activity; such a minimum has never been encountered.
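The differentiation step can be written out for a parabolic equation in π alone, taken here as Equation 3 with the σ term omitted (an assumption made for brevity; the same algebra holds with log P in place of π, or with further terms that do not depend on π):

```latex
\begin{align*}
\log(1/C) &= a + b\pi - c\pi^{2}, \qquad c > 0 \\
\frac{d\,\log(1/C)}{d\pi} &= b - 2c\pi = 0
  \quad\Longrightarrow\quad \pi^{\circ} = \frac{b}{2c} \\
\frac{d^{2}\log(1/C)}{d\pi^{2}} &= -2c < 0
\end{align*}
```

The negative second derivative shows that π° is a maximum. If the squared term carried a positive coefficient instead, the second derivative would be positive and the stationary point would be a minimum, which is the case rejected in the text.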

Of course this applies to the use of log 1/C as the method for quantitating the biological activities of the compounds under study.

The most common error in application of this method lies in a lack of appreciation of the minimum statistical requirements involved. Thus, one needs to have about five well-chosen compounds for every variable term in a Hansch analysis in order to feel confident about the results. For example, an equation such as Equation 2 above should be derived from 10 or more compounds, and one such as Equation 3, from 15 or more examples. A smaller number of examples per term may lead to useful results, but one cannot often support these results by statistics. A frequent abuse is seen when a large number of variable terms are used in a complex equation (four or more terms) which was derived from only 10 or 12 examples. The statistician would prefer to have 15 to 20 more compounds than the degrees of freedom in the resulting equation; not often is this luxury met.

The Free-Wilson Additivity Model

Unlike the Hansch approach, in this model no assumptions are made concerning physical parameters which may play a role in determining the biological activity. Instead, a series of de novo substituent constants is obtained using only the experimentally obtained biological test data and the following basic assumption: every time a particular substituent group appears at the same place in the molecule, it is assumed that it will play a constant role towards determining the over-all biological activity of the molecule (2). It may contribute to, or detract from, the over-all biological activity, but it must always play the same role.

This basic assumption is checked by means of the statistical parameters which result from solution of the matrix, which expresses the assumption stated above in the following equation for each compound:

$$\text{Biological Activity} = \mu + \sum_i G_i X_i$$

where μ is the average biological activity, and G_iX_i represents the activity contribution for the i-th group at the i-th position. In structuring the matrix, X_i becomes 0 or 1, indicating the absence or presence of a particular group at that position. The matrix thus represents a series of equations in multiple unknowns, one equation for each compound. Its solution gives the values for the de novo substituent constants for every substituent at each position (= G_iX_i). (A more complete discussion is given below.)

If the statistical parameters obtained upon solution of the matrix indicate that the additivity assumption is valid, the de novo constants can then be used to predict the activity of (a) those compounds used in derivation of the constants and (b) all possible combinations of the various groups at each position. This is not much of a saving when one has only two or three positions of a molecule which can be substituted, but in more complex situations this can be a powerful tool. An example is given below, where six different positions of the phenanthrene ring were substituted with three, three, six, three, six, and three substituents, respectively:

    HO-CH-CH2-B

    B  = 3 groups    R3 = 6 groups
    R1 = 3 groups    R6 = 6 groups
    R2 = 3 groups    R7 = 3 groups

This represents 3 × 3 × 3 × 6 × 6 × 3 = 2916 possible compounds. A good Free-Wilson analysis was obtained from only 42 of the possible analogs; the preparation of these 42 compounds enables one to predict with a fair assurance the approximate antimalarial activity to be expected for almost 2900 unprepared analogs.

The minimum number of compounds required for a Free-Wilson type of analysis will vary depending upon the number of positions substituted and the number of substituents at each position. The formula for the absolute minimum required to permit a solution of the complex set of equations in multiple unknowns is

$$N = 1 + (A - 1) + (B - 1) + (C - 1) + \dots$$

where A, B, C are the number of substituents at each position.
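As a worked check of the two counts just described, the short sketch below applies the product rule and the minimum-compound formula to the phenanthrene example. The function names are hypothetical; the group counts and the figure of 42 prepared compounds are taken from the text.

```python
from math import prod

def total_combinations(groups_per_position):
    """Number of distinct compounds obtainable from all group combinations."""
    return prod(groups_per_position)

def minimum_compounds(groups_per_position):
    """N = 1 + (A - 1) + (B - 1) + ... : one term for the overall average
    plus the independent substituent constants at each position."""
    return 1 + sum(n - 1 for n in groups_per_position)

phenanthrene = [3, 3, 3, 6, 6, 3]   # B, R1, R2, R3, R6, R7
prepared = 42

total = total_combinations(phenanthrene)
minimum = minimum_compounds(phenanthrene)

print("possible compounds:", total)                    # 2916
print("minimum for a solution:", minimum)              # 19
print("prepared in excess of minimum:", prepared - minimum)   # 23
print("unprepared analogs:", total - prepared)         # 2874
```

For this series the 42 prepared compounds exceed the absolute minimum of 19 by 23.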

This minimum should be exceeded by 10-20 compounds although useful results can sometimes result from as few as five compounds in excess of the minimum.

The fundamental point which differentiates between the two methods is the following: the Hansch method seeks for correlations between variable biological activities and variable physical parameters whereas the Free-Wilson method uses only the biological activities as variable terms, along with exact information as to the presence or absence of each substituent group. Therefore, the de novo substituent constants which result embody all factors, known or unknown, that play roles in determining the biological activity of the particular compounds under study.

It is obvious that there are cases where groups will interact with each other and hence cannot have only additive effects. Fujita and Ban (3) have reported a successful attempt to include possible interactions in a Free-Wilson type of analysis. A series of dopamine analogs was studied as substrates for dopamine-β-hydroxylase. In addition to the usual Free-Wilson matrix a term was added to allow for the interaction which might be possible from having two hydroxyl groups placed ortho to each other or from a hydroxyl group ortho to a methoxyl group. These terms were added to the regular matrix as a hypothetical new position; the statistical parameters were then compared with those for the conventional run. The addition of the term expressing the ortho relationship of the methoxyl and hydroxyl groups did give a significant improvement to the correlation whereas the term expressing the ortho relationship of two hydroxyl groups did not improve the regular correlation. The value for the interaction term had a negative coefficient, showing that this interaction resulted in a reduction of the biological activity.

Overlaps Between the Two Methods

Cammarata compared the Free-Wilson constants derived for a series of tetracycline analogs with some physical constants, and found a relationship which involved two parameters (4). In as yet unpublished work on antimalarial compounds I have found good linear relationships to exist between certain Free-Wilson substituent constants (S.C.) and Hansch's "pi" or Hammett's "sigma" constants for the same substituents. These relationships are shown in Table I.

Table I. Relationships among Parameters

Group    R3 S.C.    Pi       Sigma (para)    R6 S.C.
CF3       0.332      1.16     0.54            0.476
Br        0.223      0.86     0.23            0.388
Cl        0.0688     0.71     0.23           -0.118
H        -0.257      0        0              -0.431
F        -0.265      0.14     0.06           -0.159
OCH3      -         -0.02    -0.27           -0.510
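The strength of the relationships in Table I can be checked by computing Pearson correlation coefficients between each set of substituent constants and the pi or sigma values; the chapter reports coefficients of from 0.89 to 0.99. The sketch below is illustrative only: the numbers are transcribed from Table I, the missing R3 entry is assumed to belong to OCH3, and `pearson_r` is a hypothetical helper name.

```python
import numpy as np

# Values transcribed from Table I; no R3 constant is listed for one group,
# assumed here to be OCH3.
pi_const = {"CF3": 1.16, "Br": 0.86, "Cl": 0.71, "H": 0.0, "F": 0.14, "OCH3": -0.02}
sigma_para = {"CF3": 0.54, "Br": 0.23, "Cl": 0.23, "H": 0.0, "F": 0.06, "OCH3": -0.27}
sc_r3 = {"CF3": 0.332, "Br": 0.223, "Cl": 0.0688, "H": -0.257, "F": -0.265}
sc_r6 = {"CF3": 0.476, "Br": 0.388, "Cl": -0.118, "H": -0.431, "F": -0.159,
         "OCH3": -0.510}

def pearson_r(sub_const, parameter):
    """Correlation between de novo constants and a physical parameter,
    using only the groups for which both values are available."""
    shared = [g for g in sub_const if g in parameter]
    x = np.array([parameter[g] for g in shared])
    y = np.array([sub_const[g] for g in shared])
    return float(np.corrcoef(x, y)[0, 1])

results = {
    ("R3", "pi"): pearson_r(sc_r3, pi_const),
    ("R3", "sigma"): pearson_r(sc_r3, sigma_para),
    ("R6", "pi"): pearson_r(sc_r6, pi_const),
    ("R6", "sigma"): pearson_r(sc_r6, sigma_para),
}
for (position, name), r in results.items():
    print(f"{position} S.C. vs {name}: r = {r:.2f}")
```

Because pi and sigma are themselves nearly collinear for these six groups, high coefficients against both parameters are to be expected and do not identify which factor is responsible.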

Figure 1. Two-dimensional plot of pi vs. sigma constants for aromatic substituents. The constants for the particular groups listed in Table I lie essentially on a straight line. (Journal of Medicinal Chemistry)

The substituent constants at R3 and R6 correlate quite well with both the pi and sigma (para) constants as shown by correlation coefficients of from 0.89 to 0.99. In hindsight, one would expect that the Hansch method should give good results using either pi or sigma constants as parameters.

Actually, Hansch and I had previously obtained such correlations from the same set of data, and these results represent agreement between the two different methods. However, we were unable to decide between partition or polar factors as either one alone gave good correlations. The problem was found to reside in the particular choice of substituent groups studied; a two-dimensional plot of pi vs. sigma constants for aromatic substituents shows that the particular groups studied (see above list) lie essentially on a straight line. Therefore they cannot lead to a choice between them. This two-dimensional plot is reproduced in Figure 1; a more complete discussion of this "covariance" problem has been published (5).

In deciding which of these methods to apply to a given set of data, the choice will usually depend upon the number and type of analogs which have been prepared. One should have data for at least five more compounds than the minimum required for solution of the Free-Wilson matrix. In addition, one should have two or more examples for each group at every position, if possible, to increase the confidence with which one can apply the results. To apply the Free-Wilson method, one must have a series of closely related structures whereas the Hansch method may be applied to series of compounds with quite different structure, provided one has data for one or more physical parameters for all of the compounds in question.

When one has only 8-12 compounds, only the Hansch method may be used. Free intended the additive approach to be used as a guide to proper experimental design for the chemist, in planning his choice of analogs for preparation. By advance planning one can avoid the situation described above where one has only one example of a substituent at a particular position.
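The planning checks described above (at least five compounds beyond the minimum, and two or more examples of each group at every position) can be sketched as a small helper. This is an illustration under stated assumptions: the function name `free_wilson_checks`, its `margin` parameter, and the five-compound series are all hypothetical and are not taken from the original work.

```python
from collections import Counter

def free_wilson_checks(compounds, margin=5):
    """compounds: list of tuples, one substituent label per position."""
    n_positions = len(compounds[0])
    # Count how many compounds carry each group at each position.
    counts = [Counter(c[p] for c in compounds) for p in range(n_positions)]
    # Absolute minimum: 1 + sum over positions of (number of groups - 1).
    minimum = 1 + sum(len(c) - 1 for c in counts)
    # Groups represented by a single compound at their position.
    singles = [(p, g) for p, c in enumerate(counts)
               for g, n in c.items() if n < 2]
    return {
        "minimum": minimum,
        "enough_compounds": len(compounds) >= minimum + margin,
        "single_example_groups": singles,
    }

# Hypothetical two-position series, invented for illustration.
series = [("H", "H"), ("Cl", "H"), ("H", "CH3"), ("Cl", "CH3"), ("Br", "H")]
report = free_wilson_checks(series)
print(report)
```

Here the single bromo compound is flagged, and the series falls short of the suggested margin, so a design of this kind would point the chemist toward preparing further analogs before attempting the analysis.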

Detailed Discussion of the Free-Wilson Additivity Model

In 1964, Free and Wilson introduced the method for structure-activity correlation which is based upon the assumption that each substituent group in a molecule at a specific position makes a constant additive contribution towards the overall biological activity of the molecule. One sets up a series of equations, one per compound, which mathematically expresses this concept. Solution of these multiple equations in multiple unknowns is obtained by the method of least squares, using computer techniques, and the resulting statistical parameters allow one to judge whether or not the original assumption of additivity was valid for the particular set of biological data studied. If additivity is confirmed, the model may be used to predict approximate bioactivities for those combinations of substituent groups which have not been prepared. The ranges of substituent values at each position help identify those positions in the molecule which are most sensitive to change in substituent; these are the positions where favorable groups may be expected to increase activity appreciably. In addition, the relative activities of substituent groups at a particular position often suggest the relationships (such as with pi and sigma above) which can lead to extrapolations to suggest groups not originally studied which may be worth preparing.

Additivity: The Basic Premise. The concept of additivity of substituent group contributions is merely an expression of the medicinal chemist's intuition which has so successfully led to the development of useful therapeutic agents in the past 70 years. However, it is such a basic concept that the chemist must question it and must consider its implications. There are certainly cases where synergistic effects are seen (3), and these are not compatible with the concept of additivity. Many successful studies have confirmed the general concept, and the method has a built-in check in that the statistical parameters (F value, R², standard deviation) allow one to check the validity of the original assumption with regard to the actual set of biological data under consideration (6).

Therefore, one is not without guidelines in using this technique, and the process can be helpful even if additivity is not confirmed, as that in itself can provide useful information.

A major check of the analysis is provided when the activities of each of the compounds are calculated using the derived substituent constants. The deviations between the observed and calculated bioactivities can point out particular compounds which are poorly calculated. In an as yet unpublished case the large deviation for one compound was used to question the structure, and indeed, a rearrangement had occurred during its preparation which resulted in an incorrect assignment of its structure. The more usual cause of large deviations is in the biological test results as these are often difficult to quantitate reproducibly. Of course, a large deviation may also point towards other causes of non-additivity such as chelation effects, hydrogen bonding, or special steric considerations.

The general problem of lack of accurate reproducibility of the biological test data usually results in an unavoidable standard deviation of about 0.20 to 0.25 log 1/C units. Thus "additivity" really means that slight variations of this magnitude from strict additivity could not be detected. It should be pointed out here that the Hansch method, too, assumes that each substituent plays a constant and additive role from compound to compound, and it, too, is limited by the almost irreducible standard deviation of about 0.2 or 0.25 log 1/C units (8).

In a recently published paper, Cammarata treats the relationships and assumptions involved in the Hansch and Free-Wilson methods from a systematic point of view (8).

Procedure. The Free-Wilson method is most useful when three or more positions of a molecule are subjected to variation; although one can apply the method to cases where only two positions are involved, simple intuition can do about as well in such simple cases. As the complexity of the structural changes increases, this method becomes more and more valuable, and in very complicated systems with substituents at many positions, it can be extremely helpful. The following treatment will illustrate the method in general terms. The following generic formula represents a family of compounds, all having a common structure X which is substituted at R1, R2, and R3:

[Structural formula not reproduced: common structure X bearing substituents R1, R2, and R3.]

Let us assume that there are 15 compounds in this series; at R1 we have four different groups, at R2, five groups, and at R3, five groups. Comparative biological data are available for all fifteen compounds. The biological data may be expressed in quantal units (e.g., 0, 1, 2, 3, 4), or in activities relating them to a standard agent, or most conveniently, in terms of log 1/C values where C is the molar concentration of test compound which causes a standard effect. In the case of whole animal studies, C is usually expressed as moles/kg test animal. By use of log 1/C positive increases in value mean increased biological activity. A structural matrix is assembled in Table II.

Table II. Structural Matrix

Compound            R1               R2                 R3
Number          A  B  C  D     E  F  G  H  I     J  K  L  M  N     Log 1/C
 1              1  0  0  0     1  0  0  0  0     1  0  0  0  0      1.02
 2              1  0  0  0     0  1  0  0  0     1  0  0  0  0      3.01
 3              1  0  0  0     0  0  1  0  0     0  1  0  0  0      2.53
 4              0  1  0  0     0  0  0  1  0     0  1  0  0  0      3.21
 5              0  1  0  0     0  0  0  1  0     0  0  1  0  0      3.10
 6              0  1  0  0     0  0  0  0  1     0  0  1  0  0      2.89
 7              0  1  0  0     1  0  0  0  0     0  0  0  1  0      1.98
 8              1  0  0  0     0  1  0  0  0     0  0  0  0  1      2.45
 9              0  0  1  0     1  0  0  0  0     0  0  0  0  1      3.05
10              0  0  1  0     0  1  0  0  0     1  0  0  0  0      2.70
11              0  0  1  0     0  0  1  0  0     0  0  0  1  0      2.40
12              0  0  0  1     0  0  1  0  0     0  1  0  0  0      2.78
13              0  0  0  1     0  0  0  0  1     0  0  0  0  1      2.98
14              0  0  0  1     0  0  0  0  1     0  0  0  1  0      1.80
15              0  0  0  1     0  0  0  1  0     0  0  1  0  0      3.15

Number of
examples        4  4  3  4     3  3  3  3  3     3  3  3  3  3

Compound 1 has group A at position R1, group E at position R2, and group J at position R3. Thus the matrix defines the exact structure of each compound. The only variable numbers are the log 1/C values, which are experimentally determined and whose variability is the reason why so many compounds are required in excess of the theoretical number required to achieve a solution. No assumptions are made about the physical parameters which may be involved, and only the exact structural data and the biological test data are used to carry out this analysis. This matrix serves as input to a computer for solution of the following equation by least squares:

$$\log(1/C) = \mu + A + B + C + D + E + F + G + H + I + J + K + L + M + N$$

For a given compound, only the terms for the groups it contains (matrix entries of 1) contribute.

In this equation, μ is the over-all average log 1/C value for the 15 compounds, and A through N are numerical coefficients, positive and negative, which represent the contributions of each group (A through N) to the biological activity; these constants (de novo substituent constants, valid only for this set of data) represent the group contributions to the over-all biological activity. The basic assumption of additivity demands that the following relationships hold for this set of data:

$$4A + 4B + 3C + 4D = 0, \quad \text{or} \quad A = -B - \tfrac{3}{4}C - D \quad (\text{at } R_1)$$

$$3E + 3F + 3G + 3H + 3I = 0, \quad \text{or} \quad E = -F - G - H - I \quad (\text{at } R_2)$$

$$3J + 3K + 3L + 3M + 3N = 0, \quad \text{or} \quad J = -K - L - M - N \quad (\text{for } R_3)$$
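As a hedged illustration, the restrictive equations can be derived mechanically from the number of compounds bearing each substituent. The sketch below is not the program of Free and co-workers; the function name `restrictive_equation` and the dictionary layout are assumptions made for this example, and the counts are simply the coefficients that appear in the equations above.

```python
from fractions import Fraction

def restrictive_equation(counts):
    """Express the first substituent at a position in terms of the others.

    counts maps each substituent label to the number of compounds that
    carry it. The restriction sum(n_i * a_i) = 0 gives
    a_first = -sum((n_i / n_first) * a_i) over the remaining substituents.
    """
    labels = list(counts)
    first, rest = labels[0], labels[1:]
    return first, {lab: Fraction(-counts[lab], counts[first]) for lab in rest}

# Occurrence counts read off the restrictive equations in the text.
r1 = {"A": 4, "B": 4, "C": 3, "D": 4}
r2 = {"E": 3, "F": 3, "G": 3, "H": 3, "I": 3}
r3 = {"J": 3, "K": 3, "L": 3, "M": 3, "N": 3}

for position in (r1, r2, r3):
    dropped, coeffs = restrictive_equation(position)
    terms = " ".join(f"{float(c):+g}*{lab}" for lab, c in coeffs.items())
    print(f"{dropped} = {terms}")

# Independent substituent terms: one fewer than the groups at each position.
free_terms = sum(len(p) - 1 for p in (r1, r2, r3))
# One further unknown is needed for the over-all average, mu.
min_compounds = free_terms + 1
```

Exact fractions are used so that the 3/4 coefficient for C is reproduced without rounding.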

These are called the restrictive equations; because of these relationships between the variables there are really only three unknown terms at $R_1$ and four each at $R_2$ and $R_3$. Hence there are 3 + 4 + 4 = 11 unknown terms to be solved, plus one more term for the over-all average, μ. There must be a minimum of 12 compounds to permit a solution. However, since the biological test results are the dependent variables, and each of the biological test results has a degree of uncertainty, or variability, to its value, a number of additional compounds is required to give a good degree of assurance for the results, i.e., to increase the statistical significance of the derived values for each term (these are the so-called de novo substituent constants). It is desirable to have at least five and, preferably, 10 or more compounds in excess of the minimum required for solution. Our hypothetical sample case has only three more compounds than the 12 required and so could not be expected to give significant results.

The use of a special multiple regression analysis computer program which can include the restrictive equations properly leads directly to the proper solution. To use a standard multiple regression analysis program one must incorporate the restrictive conditions as follows. Recalling that $A = -B - \tfrac{3}{4}C - D$, the entire column for the A term is removed from the matrix; similarly, the columns for E and J are removed. Now we are left with a matrix which contains 3, 4, and 4 variables at $R_1$, $R_2$, and $R_3$, respectively. When introducing the expression for the compounds which contained the terms A, E, or J, one introduces the equivalent values in the same way as illustrated in Table III.

Table III. Contracted Matrix

Compound Number   Log 1/C     B      C      D     F     G     H     I     K     L     M     N
       1           1.02      -1    -0.75   -1    -1    -1    -1    -1    -1    -1    -1    -1
       2           3.01      -1    -0.75   -1     0     1     0     0    -1    -1    -1    -1
       3           2.53      -1    -0.75   -1     0     0     1     0     0     1     0     0
       4           3.21       0     1       0     0     0     1     0     0     1     0     0
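The substitution that produces the rows of Table III can be sketched in a few lines. This is a minimal illustration, not the author's program: the name `contracted_row` and the three-letter compound codes are assumptions, and the identities of the four compounds (AEJ, AGJ, AHL, CHL) are inferred from the rows of the table rather than stated in the text.

```python
# Substituents at each position with their occurrence counts, taken from
# the restrictive equations; the first label at each position is the one
# eliminated from the matrix.
POSITIONS = [
    {"A": 4, "B": 4, "C": 3, "D": 4},
    {"E": 3, "F": 3, "G": 3, "H": 3, "I": 3},
    {"J": 3, "K": 3, "L": 3, "M": 3, "N": 3},
]

def contracted_row(compound):
    """Return the contracted-matrix row for a compound code such as 'AEJ'."""
    row = []
    for counts, group in zip(POSITIONS, compound):
        labels = list(counts)
        dropped, kept = labels[0], labels[1:]
        for lab in kept:
            if group == dropped:
                # Replace the dropped term by its restrictive equation.
                row.append(-counts[lab] / counts[dropped])
            else:
                row.append(1.0 if lab == group else 0.0)
    return row

# Column order is B, C, D, F, G, H, I, K, L, M, N, as in Table III.
for code in ("AEJ", "AGJ", "AHL", "CHL"):
    print(code, contracted_row(code))
```

A compound that carries none of the eliminated groups keeps its ordinary 0/1 coding, which is why only the rows containing A, E, or J show negative entries.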

Solution of the original matrix by a standard program would give incorrect statistical parameters if the restrictive equations were entered directly into the matrix. Solution of the contracted matrix in Table III by a standard program gives exactly the same results as are obtained from the original matrix by the special program of Free and co-workers. However, the substituent constants for A, E, and J must be calculated by use of the restrictive equations when the contracted matrix is used. The least squares solution gives the values of μ and the substituent constants A through N. In addition, the F value for the over-all regression is calculated, as is the correlation coefficient, R.

The term $R^2$ (the fraction of the variance accounted for by the model) is expressed by (model sum of squares)/(total sum of squares), and when this value is 80% or somewhat greater, the original assumption of additivity of substituent group effects is considered to be supported. The remaining variability of 10-20% is the residual variation due to biological test variability and is almost inescapable.
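The ratio just described, together with the F value for the over-all regression, can be computed directly from observed and calculated activities. The following is a minimal sketch under stated assumptions: the function names are invented for this example, the four-compound data set is hypothetical, and the calculated values are assumed to come from a least-squares fit that includes μ (otherwise the model sum of squares cannot be obtained by difference).

```python
def degrees_of_freedom(n_compounds, n_terms):
    """Return (regression, error, total) degrees of freedom."""
    total = n_compounds - 1          # one degree of freedom goes to the mean
    error = total - n_terms
    return n_terms, error, total

def regression_summary(observed, calculated, n_terms):
    """R^2 and over-all F ratio for a least-squares fit that includes mu.

    Assumes `calculated` comes from a least-squares fit with an intercept,
    so that model SS = total SS - residual SS.
    """
    mean = sum(observed) / len(observed)
    total_ss = sum((y - mean) ** 2 for y in observed)
    residual_ss = sum((y - c) ** 2 for y, c in zip(observed, calculated))
    model_ss = total_ss - residual_ss
    df_reg, df_err, _ = degrees_of_freedom(len(observed), n_terms)
    r_squared = model_ss / total_ss
    f_ratio = (model_ss / df_reg) / (residual_ss / df_err)
    return r_squared, f_ratio

# Hypothetical toy data (not from the text): four compounds, one fitted term.
observed = [1.0, 2.0, 3.0, 4.0]
calculated = [1.5, 1.5, 3.5, 3.5]
r_squared, f_ratio = regression_summary(observed, calculated, n_terms=1)
print(f"R^2 = {r_squared:.2f}, F = {f_ratio:.2f}")

# The 15-compound example with 11 independent substituent terms.
print(degrees_of_freedom(15, 11))
```

The observed F ratio would then be compared with a tabulated critical value for the regression and error degrees of freedom; the table lookup itself is left out of the sketch.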

The F value offers an additional check of the significance of the results; this value can be compared with the decision statistic "F" value from tables to test the validity of using the derived substituent constants to calculate the biological activities of the compounds used in the Free-Wilson analysis. If the F test fails, it means that one cannot explain the different biological activities of the compounds by use of the additivity equation; in such a case one is as well off using μ as the calculated activity, a sad state of affairs indeed. In our hypothetical example, the F value is the one for $F_{3,11}$. These subscripts derive from the analysis of variance for the regression, where 3 + 4 + 4 = 11 degrees of freedom are attributable to the regression, and 15 − 1 = 14 represents the total degrees of freedom in the model. Then 14 − 11 = 3 degrees of freedom are attributable to the error term. If the observed F value exceeds the tabular value of 8.76 ($F_{3,11}$ at the 5% level) or 27.13 ($F_{3,11}$ at the 1% level), the derived solution is significant at or above those levels.

In preparing the matrix for a Free-Wilson analysis, one must be careful to avoid situations which lead to singularities and hence cannot give a unique solution. The problem to be avoided is illustrated in Table IV in its simplest form.

Table IV. Situations Leading to Singularities

                              R1              R2
Compound Number   Log 1/C   A   B   C       D   E   F
       1            1.2     1   0   0       1   0   0
       2            1.4     0   1   0       0   1   0
       3            1.6     0   0   1       0   1   0
       4            0.9     0   1   0       0   0   1
       5            0.6     0   0   1       0   1   0
       6            2.1     0   1   0       0   0   1
     Total                  1   3   2       1   3   2

Compound 1 in the matrix represents the unique occurrence of two substituents in one compound. This is readily seen as a violation of the medicinal chemist's cardinal rule of making only one change at a time. It is no more possible for the computer to assign values to A and D than it would be for a chemist who has changed two groups at once to ascribe the resulting change in biological activity to either one of the groups when he has no other examples of the effects of either group. To detect such problems in advance of the computer run one should always prepare the full structure matrix, and each group which appears but once should be checked to be sure that the particular compound bearing that substituent has no other substituent which occurs only in that compound.
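The check described above lends itself to a short script. The sketch below is an illustration only: the function name and the dictionary layout are assumptions, Compound 1 carries A and D as in Table IV, and the assignments for the remaining compounds should be treated as illustrative.

```python
def unresolvable_single_occurrences(compounds):
    """Find compounds carrying two or more groups that occur nowhere else.

    compounds maps a compound number to the tuple of groups it carries,
    one group per position.
    """
    counts = {}
    for groups in compounds.values():
        for g in groups:
            counts[g] = counts.get(g, 0) + 1
    problems = {}
    for number, groups in compounds.items():
        singles = [g for g in groups if counts[g] == 1]
        if len(singles) >= 2:
            problems[number] = singles
    return problems

# Illustrative structure matrix in the spirit of Table IV.
table_iv = {
    1: ("A", "D"), 2: ("B", "E"), 3: ("C", "E"),
    4: ("B", "F"), 5: ("C", "E"), 6: ("B", "F"),
}
print(unresolvable_single_occurrences(table_iv))
```

A group that appears once is harmless on its own; the script reports a compound only when it carries at least two such groups, which is the situation the text warns against.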

More complex singularities, which have the same mathematical basis, are shown in Tables V and VI; these are encountered less often, but should be checked for by careful study of the original matrix. In the latter case, although there are three examples of A and D, they occur in exactly the same set of compounds, and this is equivalent to the other cases which lead to singularities.

Table V. Complex Singularities

                              R1              R2
Compound Number   Log 1/C   A   B   C       D   E   F
       1a           1.2     1   0   0       1   0   0
       2a           0.7     1   0   0       1   0   0
       3            0.9     0   1   0       0   0   1
       4            0.8     0   1   0       0   1   0
       5            1.1     0   0   1       0   0   1
       6            0.3     0   0   1       0   1   0
     Total                  2   2   2       2   2   2

a Cases leading to singularities.

This explains why Hudson, Bass, and Purcell (7) obtained two different solutions to a matrix by making different substitutions using the restrictive equations. Their presentation of the matrix used is not in its most expanded form but is contracted by application of the restrictive equations; this makes it difficult to visualize the singularity problems. When their matrix is rewritten in the complete form, it becomes readily apparent that their Compounds 3 and 4 are involved in a singularity of the type illustrated in Table IV; their Compounds 5 and 6 have singularity problems of the type illustrated in Table I. Removal of these four compounds from the matrix leads to a simpler matrix with no such problems, and identical substituent constants result regardless of how the restrictive equations are used; i.e., a unique solution is obtained. Use of the special program developed by Free, which will accept the entire matrix, avoids this problem as the existence of a singularity prevents any solution from being obtained.
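The study of the full matrix can be partly automated. The sketch below is a hedged illustration rather than Free's special program: it lists groups that occur in exactly the same set of compounds, and it compares the rank of the full indicator matrix with the number of independent terms (μ plus one fewer than the number of groups at each position). All function names are assumptions, and the compound assignments are read from Table V.

```python
from fractions import Fraction

def confounded_groups(compounds):
    """Group substituents that occur in exactly the same set of compounds."""
    occurrences = {}
    for number, groups in compounds.items():
        for g in groups:
            occurrences.setdefault(g, set()).add(number)
    by_pattern = {}
    for g, where in occurrences.items():
        by_pattern.setdefault(frozenset(where), []).append(g)
    return [sorted(gs) for gs in by_pattern.values() if len(gs) > 1]

def matrix_rank(rows):
    """Rank by exact Gaussian elimination on rational numbers."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def is_singular(compounds, positions):
    """True when the full matrix cannot determine every independent term."""
    labels = [g for pos in positions for g in pos]
    rows = [[1] + [1 if g in groups else 0 for g in labels]
            for groups in compounds.values()]
    expected = 1 + sum(len(pos) - 1 for pos in positions)
    return matrix_rank(rows) < expected

positions = [("A", "B", "C"), ("D", "E", "F")]
table_v = {1: ("A", "D"), 2: ("A", "D"), 3: ("B", "F"),
           4: ("B", "E"), 5: ("C", "F"), 6: ("C", "E")}
print(confounded_groups(table_v), is_singular(table_v, positions))
```

The rank test is the more general of the two checks; the identical-pattern listing is included because it names the offending groups, which makes it easier to decide which compounds to remove.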

Unfortunately, use of a standard regression program with application of the restrictive conditions, as already discussed, can force a solution which is one of a family of possible solutions; these are not unique, and cannot be relied upon. The technique of studying the original matrix in its entirety, to avoid singularities, will pinpoint these problems.

Elimination of one or more compounds from the matrix will correct the problem.

Table VI. Complex Singularities

                              R1              R2
Compound Number   Log 1/C   A   B   C       D   E   F
       1a           0.9     1   0   0       1   0   0
       2a           0.7     1   0   0       1   0   0
       3            1.3     0   1   0       0   1   0
       4a           1.2     1   0   0       1   0   0
       5            1.1     0   1   0       0   0   1
       6            0.3     0   0   1       0   1   0
     Total                  3   2   1       3   2   1

a Cases leading to singularities.

If hydrogen is considered as one of the substituents at $R_1$, the resulting value of μ (the average biological activity) must be considered to be the biological activity of a hypothetical compound where $R_1$ is a nonentity. In this case, the substituent constant for hydrogen must be added for $R_1$ to obtain the value of the "unsubstituted molecule" where $R_1$ = H.


To avoid this, and to place the resulting substituent constants on a scale relative to hydrogen equal to zero, Cammarata (4) used the technique of setting the substituent constant for hydrogen as zero by not including hydrogen as one of the groups in the matrix; e.g., in Table V, neither A, B, C, nor D, E, or F is hydrogen. In this case, the compound where $R_1$ and $R_2$ are hydrogen would be entered as follows: log 1/C = 000000. An additional advantage of this approach is that now one need not add the restrictive equation since one variable (H) has been removed from the matrix for each position. Thus a standard multiple regression program can now be directly employed, taking care to avoid the singularity problem, of course.
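A minimal sketch of this coding scheme follows. The function name `encode` and its arguments are hypothetical, chosen for this example; the group labels A through F follow Table V, and hydrogen is represented by the label "H".

```python
def encode(compound, groups_by_position, reference="H"):
    """Indicator row with the reference group (hydrogen) left out.

    compound gives the group at each position; a position bearing the
    reference group contributes only zeros to the row.
    """
    row = []
    for chosen, groups in zip(compound, groups_by_position):
        if chosen != reference and chosen not in groups:
            raise ValueError(f"unknown group {chosen!r}")
        row.extend(1 if g == chosen else 0 for g in groups)
    return row

groups = [("A", "B", "C"), ("D", "E", "F")]

# The compound with hydrogen at both positions is entered as all zeros.
print("".join(str(v) for v in encode(("H", "H"), groups)))
# A compound with A at the first position and hydrogen at the second.
print("".join(str(v) for v in encode(("A", "H"), groups)))
```

With this coding the intercept of the regression is the calculated activity of the all-hydrogen compound, and each substituent constant is expressed relative to hydrogen.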

Should hydrogen not occur as a substituent at one of the positions, one may arbitrarily set another substituent group at zero as was illustrated above for hydrogen.

Predictive Use of Free-Wilson Substituent Constants. The newly found substituent constants are lined up in decreasing order of activity for each position. A prediction may be made for all possible compounds arising from chemically allowable combinations of one group at each position. In these predictions it must be remembered that constants which were obtained for groups which occurred only once or twice in the matrix usually have a low degree of significance. Here one should be guided by reference to a t test value to establish the significance of the difference between the substituent constants at each position (6). Of course, it is possible to build the calculations for all possible combinations of the substituents at each position into a computer program, and this results in a listing of predicted biological activities for all possible compounds. Such predictions must be used with caution but can be good guides when the statistical parameters are favorable.

One must avoid the prediction of extreme activity for unusually highly substituted analogs, which might be almost impossible to prepare. For the model illustrated in Table II, 4 × 5 × 5, or 100, analogs are exemplified; from a study of from 20 to 25 compounds we should have a good idea of the activities to be expected for the other 75 to 80 analogs. The higher the standard deviation for the analysis, the lower the accuracy of the predicted values will be. For typical biological tests, variation of from one-half to twice the observed activity is about average.
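The listing of predicted activities for all combinations can be sketched as follows. The substituent constants and the value of μ below are hypothetical numbers invented for illustration (chosen so that they satisfy the restrictive equations of the 4 × 5 × 5 example); the function name `predict_all` is likewise an assumption.

```python
from itertools import product

def predict_all(mu, constants_by_position):
    """Predicted log 1/C for every combination of one group per position,
    sorted from most to least active."""
    predictions = []
    for combo in product(*(c.items() for c in constants_by_position)):
        labels = tuple(label for label, _ in combo)
        value = mu + sum(const for _, const in combo)
        predictions.append((labels, value))
    predictions.sort(key=lambda item: item[1], reverse=True)
    return predictions

# Hypothetical de novo constants, for illustration only.
mu = 2.0
constants = [
    {"A": 0.30, "B": -0.10, "C": 0.00, "D": -0.20},
    {"E": 0.40, "F": 0.10, "G": 0.00, "H": -0.20, "I": -0.30},
    {"J": 0.20, "K": 0.10, "L": 0.00, "M": -0.10, "N": -0.20},
]

predictions = predict_all(mu, constants)
print(len(predictions))
for labels, value in predictions[:3]:
    print("".join(labels), round(value, 2))
```

In practice one would also filter out combinations that are chemically unreasonable or that rest on constants derived from only one or two compounds, as cautioned in the text.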

Extrapolations beyond the confines of the exact substituent groups studied may be made if, after a study of the list of substituent group values at each position, correlations with physical constants are noticed. Such correlations may be observed by graphical procedures, or by the use of regression analysis. The use of new groups not yet studied can now be suggested by reference to tables of physical constants. Extrapolation beyond the limits of the range of values for those groups already studied should be carefully considered, and the resulting predictions should not be expected to be very accurate. However, by this method a Free-Wilson analysis can lead to suggestions for new compounds in a manner similar to use of the Hansch method.
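As a hedged illustration of the regression route, the sketch below fits a straight line relating de novo substituent constants at one position to a tabulated physical constant, then estimates a constant for a new group while flagging whether the estimate is an extrapolation. All numbers are hypothetical and invented for this example; no particular physical constant is implied.

```python
def fit_line(x, y):
    """Least-squares slope, intercept, and correlation coefficient r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / (sxx * syy) ** 0.5

# Hypothetical values, invented for illustration only.
physical_constant = [0.0, 0.5, 0.7, 1.0]      # tabulated constant per group
de_novo_constant = [-0.20, 0.05, 0.10, 0.30]  # constants from the analysis
slope, intercept, r = fit_line(physical_constant, de_novo_constant)

def estimate(x_new, known_x=physical_constant):
    """Estimate a constant for a new group; flag extrapolation."""
    within_range = min(known_x) <= x_new <= max(known_x)
    return slope * x_new + intercept, within_range

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r = {r:.3f}")
print(estimate(0.6))   # inside the range of the groups studied
print(estimate(1.5))   # outside the range: treat with extra caution
```

The `within_range` flag reflects the caution in the text: an estimate for a group whose physical constant lies outside the range already studied should not be expected to be very accurate.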


Literature Cited

1. Hansch, C., Accounts Chem. Res. (1969) 2, 232.
2. Free, S. M., Jr., Wilson, J. W., J. Med. Chem. (1964) 7, 395.
3. Fujita, T., Ban, T., J. Med. Chem. (1971) 14, 148.
4. Cammarata, A., Yau, S. J., J. Med. Chem. (1970) 13, 93.
5. Craig, P. N., J. Med. Chem. (1971) 14, 680.
6. Snedecor, G. W., "Statistical Methods," Iowa State University Press, Ames, Iowa, 1966.
7. Hudson, D. R., Bass, G. E., Purcell, W. P., J. Med. Chem. (1970) 13, 1184.
8. Cammarata, A., J. Med. Chem. (1972) 15, 573.

RECEIVED June 17, 1971. Work done at Smith Kline and French Laboratories, Philadelphia, Pa.
