Quantitative structure-activity relationships. 2. A mixed approach

A rational explanation is given for the use of dummy variables in Hansch .... Dummy Variables (Indicator Variables) Used in Hansch and Free-Wilson Ana...
0 downloads 0 Views 2MB Size
Journal of Medicinal Chemistry, 1976, Vol. 19, No.5 587

Quantitative S A Relationships. 2

Quantitative Structure-Activity Relationships. 2. A Mixed Approach, Based on Hansch and Free-Wilson Analysis Hugo Kubinyi Research Znstitute of Knoll AG, Chemische Fabriken, 0-6700 LudwigshafenlRhein, Germany. Received June 27,1975 Based on the theoretical and numerical equivalence of Hansch’s linear multiple regression model and the modified Free-Wilson model a mixed approach is developed. The mixed approach is a combination of both models which makes use of the advantages of each model and widens the applicability of Hansch and Free-Wilson analysis. The Free-Wilson approach now is applicable also in the case of parabolic dependence of biological activity on a particular physical property, e.g., log P or r. A rational explanation is given for the use of dummy variables in Hansch equations and the derivation of Hansch correlations for de novo group contributions obtained from FreeWilson analysis. Some examples illustrate the mixed approach and demonstrate its usefulness to establish biologically meaningful structure-activity relationships.

The nonparabolic form of Hansch’s linear multiple regression modell-3 (e.g., eq 1or, more generally expressed, log 1 / C = k i n

+ k z +~ kBE, + k4

(1)

eq 2) and a modified form415of Free-Wilson’s additive

+

log 1/C = C kj#~j k I

$j

(2)

= structural parameters like II, u, or E ,

model6 (eq 3) log

log

1/c = &ai I + /A

(3)

where ai = group contribution of substituent Xi, based on UH = 0.00 and p = log 1/C calcd of the unsubstituted compound5 have been shown to be theoretically interrelated738 and numerically equivalents (a detailed description of this modified Free-Wilson model is given by Fujita and Ban;5 note that Cammarata and Yau4 defined p = log 1/C obsd of the unsubstituted compound, which gives different results; compare ref 8). This equivalence is due to the fact that the individual group contributions can be interpreted as a weighted s u m of several physical properties 4j of the substituents (eq 4).

(4) Substitution of eq 4 into the Free-Wilson type eq 3 gives, after appropriate transformation,a eq 5, which is a Hansch type equation. log 1 / C = b l C G 1

+ b2Z$2 +

* *

+ bnZ$,, + 1-1

(5)

Biological activity, expressed as log 1/C, often is parabolically dependent on lipophilic character, expressed as log P or ~ . ~ - 3 1In9 these cases the linear multiple regression model takes the form of eq 6. The additive Free-Wilson log 1 / C = k i n Z

+ k 2 n + h,a + k 4 E , + Iz5

(6)

model is not appropriate in the case of significant parabolic dependence of log 1/C values on the lipophilic character of substancesgJ0 (note that Singer and Purcelilo demonstrated that the interaction model of Bocek and KopeckyllJ2 corresponds to the parabolic form of Hansch’s linear multiple regression model, eq 6). Based on the equivalence of eq 1and 3 the Hansch eq 1, 2, and 6 can be transformed in three different ways. Firstly, the nonparabolic part of eq 6 can be substituted by Cui + p (eq 3) to give eq 7. Equation 7 can be inlog

terpreted as a Free-Wilson equation with an additional term kia2 to account for parabolic Gependence qf log 1/12 on lipophilic character or as a Bocek-KopeckyW2 like equation, kia2 being the interaction term, or as a “mixed approach” with a Hansch part kia2 and a FreeWilson part Cui. Secondly, a linear combination of a Free-Wilson part Cui for substituents Xi (eq 3) and a Hansch part %,$j for substituents Yi (eq 2) gives a “mixed equation”, eq 8.

1/c= k * n 2 + ?ai + /A 1

(7)

1/c= c u i + F”$lj + k ’ 1

(8)

Thirdly, eq 3 and 6 (or eq 7 and 8) can be combined to eq 9 which is a mixed approach with a Free-Wilson part log

1/c=k * n 2 + ?ai + pl$lj + k‘ I 1

(9)

x a i for substituents Xi, a Hansch part Zkj4j for substituents Yi, and a term k i d accounting for parabolic dependence of log 1/C values on lipophilic character (note that a in k i 4 must be a% + ay). Equations 7,8, and 9 look rather complex at first sight, but in reality they are simple variations of FreeWilson’s additive model which make this approach more generally applicable; the Hansch part in eq 8 and 9 gives a reduction of the number of variables needed, if for one definite group of substituents a Hansch equation can be derived. The term k i r 2 in eq 7 and 9 makes the FreeWilson approach now applicable to all problems where nonadditivity of group contributions is caused by parabolic dependence of log 1/C on log P or a. Since values of log PI3 and 7r14J5 are known for a great number of compounds and substituents and can be estimated3J3 for new compounds and substituents, the use of k i d in eq 7 and 9 gives no restrictions for the general applicability of the mixed approach (some attention should be paid to possible nonadditivities of ir ~ a l u e s ; ~ in ~ Jcases ~ J ~of greater deviations from additivity experimental log P values should be used). Craig and Hansch18Jg discussed the different requirements for application of the linear multiple regression model and the additive model. In most cases the Hansch approach is the more general and useful model but there are also limitations for this approach; for certain groups of compounds today only the Free-Wilson approach can give correlationsbetween chemical structure and biological activity (for examples see ref 5 and 20). A further limitation of the Hansch approach comes from little structural variation in a definite position of the molecule; no meaningful Hansch correlations can be derived if only two or three substituents are in a position where they contribute to biological activity in a different manner related to substituents in other positions of the molecule. In these instances it might be useful to combine both models in the

Kubinyi

588 Journal of Medicinal Chemistry, 1976, Vol. 19, No. 5

Table I. Hansch Correlations for de Novo Group Contributions Cammarata and Y ~ utetracyclines, , ~ ring D substituents n = 1 ; r = 0.93 a i = 0.81 u Z - 0.51 rv + 0.84 Fujita and Ban,s substrates of phenylethanolamine N-methyltransferase, aromatic substituents a i = 0.743 n + 0.881 a , t 0.323 n = 5; r = 0.906 a i = 0.903 n - 1.030 a 2 + 0.214 n=5;r=0.954 Craig,” 2-phenylquinoline-4-carbinols, substituents R,: a i = 0.220 n t 0.626 U , - 0.232 n = 7 ; r = 0.895 R,: a i = 1.811u p - 0,010 n=5;r=0.981 ai= 0.998 n - 0.198 n = 5 ; r = 0.942 n = 4; r = 0.966 R,: a i = 0.959 n - 0.395 Craig and Hansch,19 phenanthreneaminoalkylcarbinols, substituents R,: R, VS. n n = 6 ; r = 0.971 R, vs. u n = 6 ; r = 0.910 R,: R, vs. n n = 6 ; r = 0.910 R, vs. u n = 6 ; r = 0.885

form of eq 8 or 9, e.g., if one considers a group of compounds with aromatic substituents and substituents in a side chain; the aromatic substituents may be correlated with the Hansch parts of eq 8 or 9 and the substituents of the side chain may be correlated with the Free-Wilson parts of eq 8 or 9, if no significant correlations with structural parameters can be obtained for these substituents. It is surprising that, to my knowledge, no use has been made of eq 7 until today. On an empirical basis two different applications of eq 8 and 9 have been made in the past; these applications are the derivation of Hansch equations for de novo group contributions and the use of certain dummy variables (indicator variables) in Hansch analysis. Cammarata and Yau4 and Fujita and Ban5 were the f i t ones to demonstrate that analysis of Free-Wilson group contributions by the linear multiple regression model may lead to Hansch correlations which allow predictions of log 1/C values for compounds with substituents outside the original Free-Wilson matrix. The Hansch equations derived by these groups and some other successful correlations of Free-Wilson group contributions with structural parameters are given in Table I. A small number of different substituents in a definite position often makes it impossible to derive a meaningful

Hansch correlation. On the other hand, a Hansch equation obtained from de novo group contributions which include a number of single point determinations is not very reliable because group contributionsderived from a greater number of compounds and group contributions including the experimental error of a single compound have equal weights; according to eq 8 the mixed approach should be used in such cases to derive the correct Hansch correlation from the original log 1/C values. The second application of eq 8 and 9 is the use of FreeWilson like dummy variables (indicator variables22) in Hansch analysis. Different kinds of dummy variables have been used in Hansch and FreeWilson analyses: (a) dummy variables to account for specific structural properties (e.g., steric hindrance or conformational changes), (b) dummy variables to account for specific interactions between Substituents (e.g., hydrogen bonding), and (c) dummy variables to account for all changes in polar, electronic, and steric contributions if a substituent X is replaced by a substituent Y. A list of different dummy variables used in the literature is given in Table 11. While most authors used dummy variables without discussing a rational basis for their use, Hansch and Yoshimot.022 tried to give an explanation; they interpreted the function of dummy variables (indicator variables) in Hansch analysis as “the extrathermodynamic approach assisted by what is sometimes termed the Free-Wilson method and defined the Free-Wilson approach as a “quantitative structure-activity relationship formulated in terms of dummy variables” (it must be noted that a short discussion of the usefulness of dummy variables to combine different sets of compounds in a single Hansch equation was given by Franke and Oehme35). Equations 8 and 9 support the view of Hansch and Yoshimoto; a dummy variable which accounts for all changes in polar, electronic, and steric contributions if a substituent X is replaced by a substituent Y (type c) is unequivocally a Free-Wilson group contribution of the substituent Y, based on ax = 0.00; the use of such dummy variables in a Hansch equation is an implied application of the mixed approach. The situation is more difficult if dummy variables of type a are considered; these dummy variables may be interpreted as Free-Wilson group contributions from

Table 11. Dummy Variables (Indicator Variables) Used in Hansch and Free-Wilson Analyses Dummy variable

2 X 6

X* Ac T Et D D(4”) all)

D(B) NMe I X D-1 0-2 0-3 D X

D A a

See text.

Used to account for Number of hydrogens on protonated amine Differences between phenyl and benzoyl derivatives Stereoelectronic differences between meta and para substituents Differences between phenyl and 2-thienyl derivatives Hydrogen bonding between OH and OCH, Differences between OH and OCOCH, Differences between cis and trans isomers Steric and/or electronic differences between NMe and NEt Deviations found for m-OCH, Nonpolar differences between OH and OCOR Nonpolar differences between OH and OCOR Nonpolar differences between H and OH Differences between NH and NCH, Nonpolar differences between indan and tetralin system Different position of substituent X Bridge units between two aromatic rings, including the second ring Nonpolar differences between benzene and pyridine derivatives Special activating effect of -NHCOXC,H, in meta position Activating effect of an oxygen substituent in para position Differences in conformation between anilides and benzylamides Nonpolar differences between mono- and disubstituted amides Specific effects of a piperidine ring in substituted hydrazides

Type of dummy Ref variable’ a a a C

b C

a a a

a

23 24 25 26 5 27 27 27 28 29

a a

29 29

C

30 31

a

a C

a a a

a a a

31 22 22 22 32 33 34 34

Journal of Medicinal Chemistry, 1976, Vol. 19, No. 5 589

Quantitative S A Relationships. 2

Table 111, Activity of Halogenated Phenylmethacrylates against Microorganisms

Compd no.

1 2 3 4 5

6 7 8 9

10 See ref 31. anomala. 37

R

H 2-c1 4-C1 3-C1 2,441 2,4,6-C1, 2,4,5-CI3 2,3,4,6-C1, CIS Br, vs. S. aureus.=

Log l / C values Log pa IIIab IIIbC IIICd IIIde IIIef 1.99 2.89 2.89 2.89 2.89 2.89 3.08 3.08 3.08 2.58 3.08 3.08 3.25 3.25 3.25 2.69 3.12 3.25 2.75 2.71 2.91 2.91 3.12 2.91 3.28 3.49 3.49 3.80 3.49 3.49 3.87 3.76 3.16 3.76 3.76 3.24 3.84 4.04 3.84 3.84 3.84 3.54 4.63 4.97 4.61 4.45 3.97 3.32 5.39 5.19 4.71 5.00 5.00 3.71 6.39 5.32 3.64 5.32 3.85 3.64 vs. S. faecalis.= vs. B. substilk.= e vs. B. cereus. 36 vs. Sarcina lutea.s

Table IV. Matrices Used for Free-Wilson and Mixed Analyses Compd no. R [Cl] [o-Cl] [m-Cl]

1 2 3 4 5

6 7 8 9 a

H 2421 4-C1 3-C1 2,4-C1, 2,4,6-C1, 2,4,5-C13 2,3,4,6-C1,

c4

0 1 1 1 2

3 3 4 5

0

0

1 0 0 1 2 1 2 2

0 0 1 0 0 1 1 2

Ip-Cl]

[O‘-Cl]

0 0 1 0 1 1 1 1 1

0 0 0 0 0 1 0 1 1

IIIfB 2.89 3.28 3.25 3.12 3.49 3.46 3.54 3.45 3.35 3.35 g vs.

H.

71 ‘5

ortho' a

0.00 0.71 0.71 0.71 1.42 2.13 2.13 2.84 3.55

1.24 1.24 1.24 1.24 1.24 0.27 1.24 0.27 0.27

Reference 38.

which one or more structural parameters have been “extracted”. The mixed approach (eq 7 , 8 , or 9) combines the linear free energy related Hansch model and the modified Free-Wilson mode1518 and makes use of the specific advantages of each model; the following examples are given to illustrate the application of the mixed approach. Example 1 is an application of the Free-Wilson approach and the mixed approach (in form of eq 7) to describe the activity of chlorinated phenyl methacrylates against several microorganisms. Hansch et al.36937 derived for compounds 1-10 (for structures, log P values, and log 1/12 values, see Table 111) the following equations (eq 10-15). vs. S. aureus (IIIa):

+

log 1/C = 0.668 log P 1.342 n = 10; r = 0.966; s = 0.262 vs. S. faecalis (IIIb):

+

log 1/C = -0.125 (log P ) 2 1.359 log P 0.415 n = 10; r = 0.861; s = 0.334 vs. B. subtilis (IIIc):

+

vs. Sarcina lutea (IIIe): log 1/C = 0.161 log P f 2.721 n = 10; r = 0.849; s = 0.148

(14)

vs. H. anomala (IIIf):

log 1 / C = - 0.102 (log P ) 2 + 1.234 log P - 0.880 u + 0.878 (15)

n = 10; r = 0.958; s = 0.069

While eq 10, 12, and 15 give good correlations of observed and calculated log 1/C values, the correlations found with eq 11,13, and 14 are not so good. In order to explain the reasons responsible for these lower correlation coefficients some Free-Wilson analyses and mixed analyses were run (the matrices used for both approaches are given in Table W, the pentabromo compound was excluded from all analyses to avoid single point determinations). Based on the assumption that all chlorine atoms give comparable group contributions the following FreeWilson correlations using [Cl] as single parameter were derived. vs. S. aureus (IIIa): log 1/C = 0.503 (k0.130) [Cl] + 2.578

(16)

n = 9; I^ = 0.960; s = 0.256; F = 8 3 . 0 6 ; ~< 0.001 vs. S. faecalis (IIIb):

+

log 1/C = 0.617 log P 1.530 n = 10; r = 0.976; s = 0.204 vs. B. cereus (IIId):

log 1/C = 0.359

(k

0.093) [Cl] + 2.732

(17)

n = 9; r = 0.961; s = 0.182; F = 8 4 . 0 3 ; ~< 0.001 vs. B. subtilis (IIIc):

+

log 1/C = 0.400 log P 2.144 n = 10; r = 0.815; s = 0.420

log 1/C = 0.443 (k0.107) [Cl] + 2.705 (18) n = 9; r = 0.966; s = 0.209; F = 9 6 . 6 5 ; ~< 0.001

590 Journal of Medicinal Chemistry, 1976, Vol. 19, No. 5

vs. B. cereus (IIId): log 1 / C = 0.428 (k0.086) [Cl] + 2.680 (19) n = 9; r = 0.976; s = 0.169; F = 1 3 8 . 6 5 ; ~< 0.001 vs. Sarcina lutea (IIIe): log l / C = 0.127 (k0.079) [Cl] + 3.012 (20) n = 9; r = 0.819; s = 0.116; F = 1 4 . 2 5 ; ~< 0.01 vs. H . anomala (IIIf): log l / C = 0.088 (k0.082) [Cl] + 3.120 (21) n = 9; r = 0.693; s = 0.160; F = 6 . 4 6 ; ~< 0.05 Equations 16-19 show good correlations between observed and calculated log 1 / C values; there is no evidence for a parabolic dependence of log l / C on lipophilic character. The good correlations found with eq 17 and 19 are rather surprising because eq 11and 13 (including the pentabromo compound) gave smaller correlations. However, these smaller correlations are caused only by the pentabromo compound, as can be seen from eq 22 and 23, where the pentabromo compound was excluded. vs. S. faecalis (IIIb): log l / C = 0.535 (k0.142) l o g P + 1.674 (22) s = 0.186; F = 79.52; p < 0.001

n = 9; r = 0.959;

Kubinyi

chemical structure and biological activity for log 1/ C values IIIf. Equation 26 corresponds to the parabolic eq 15 (no rmax can be calculated from eq 26 because the first-order x term is “hidden” in the Free-Wilson parameter). For log 1/C values IIIe (activity vs. Sarcina lutea) no satisfactory correlation can be obtained either by Hansch analysis (eq 14) or FreeWilson analysis (eq 20) or mixed analysis (eq 25). Therefore a second Free-Wilson analysis was run with [o-Cl], [m-Cl],and [p-Cl] as parameters (see Table IV) to prove the assumption that all chlorine atoms have equal group contributions; additional 9 2 term did not log l / C = -0.001 (kO.185) [o-cl] + 0.168 (k0.185) [m-Cl] + 0.340 (k0.303) [p-Cl] + 2.974 (27) n = 9; r = 0.906; s = 0.136; F = 7 . 6 4 ; ~< 0.05 improve the correlation. Although eq 27 gives a somewhat better correlation than eq 14,20, or 25, the low coefficient found for the [o-Cl] term indicates that something may be “wrong” with the ortho position. Next a Free-Wilson analysis was run to prove the equivalence of 0- and 0’-chlorineatoms; the total number of chlorine atoms [Cl] and a dummy variable [o’-Cl] (see Table IV), accounting for ortho,ortho’-disubstitution, were used as parameters. log l / C = 0.231 (i0.082) [Cl] - 0.421 (k0.270) [o’-Cl]

vs. B. cereus (IIId):

+ 2.921

n = 9; r = 0.951; s = 0.091; F = 28.17; p

log l / C = 0.640 (k0.126) log P + 1.412 (23) 9; r = 0.977; s = 0.165; F = 1 4 4 . 9 6 ; ~< 0.001

IZ =

Whether the deviations found for the pentabromo compound are due to experimental error or to parabolic dependence of log 1/C values on lipophilic character or to specific changes in activity going from chlorine to bromine substitution cannot be decided with the available data. The significanceof eq 22 and 23 is supported by the fact that eq 10, 12, 22, and 23 have similar coefficients; indeed log 1/C values IIIa, IIIb, IIIc, and IIId (excluding the pentabromo compound) can be described by a single Hansch equation.

(28)

< 0.001

The [o’-Cl]term is a dummy variable of type a; its high negative value indicates that ortho,ortho’-disubstitution causes a significant reduction of biological activity. Since this reduction of activity may be due to a negative steric contribution of the second o-chlorine atom, log P and &ortho (see Table IV) were used to calculate eq 29. Of log 1 / C = 0.328 (k0.123) l o g P

+ 0.388 (20.278)

Esortho’ + 1.800 n = 9; r = 0.946; s = 0.095; F

(29) 2 5 . 6 3 ; ~< 0.005

=

(26) 4 1 . 1 6 ; ~< 0.001

course eq 28 and 29 may be chance correlations;39although the E ~ t h oterm ’ is statistically significant (t~,ofiho’= 3.42; p < 0.05) there is no further evidence that these compounds exert their action vs. Sarcina lutea by a specific mechanism. Some evidence for the biological significance of eq 29 may be derived from the fact that the biological activity of the pentabromo compound (E@ho = 0.08) is also well predicted by eq 29 (log 1/C obsd = 3.64; log 1 / C calcd = 3.93). The purpose of this example was to demonstrate the step-by-step analysis of biological activity data by FreeWilson analysis, mixed analysis, and finally Hansch analysis with a limited number of compounds. Example 2 demonstrates the reliability of the mixed approach (eq 7) in detecting a parabolic dependence of log 1/C values on lipophilic character also in cases where a nonparabolic Free-Wilson analysis gives a very high correlation of observed and calculated log l / C values. Hansch and Lien37 derived for the inhibitory activity of substituted phenols (for structures see Table V) against Aspergillus niger eq 30. Although the (log P)2 term is

Log I / C values IIIf (activity vs. H . anomala) show a significant parabolic dependence on lipophilic character ( t r z = 6.32; p < 0.001); a comparison of the correlation coefficients and the F values of eq 21 and 26 illustrates the superiority of eq 26 to describe the relationship between

log 1 / C = -- 0.190 (k0.085) (log P ) 2 + 1.859 (k 0.558) l o g P 0.627 (k0.395) u 0.092 (30) n = 18; r = 0.975;s = 0.160; F = 9 0 . 3 5 ; ~< 0.001

log 1/C = 0.646 (k0.071) l o g P + 1.394 (24) n = 36; r = 0.954; s = 0.217; F = 3 4 1 . 8 0 ; ~< 0.001 Equations 20 and 21 have low correlation coefficients; therefore a mixed approach (eq 7 ) was run for log 1/C values IIIe and IIIf, using r 2 and [Cl] as parameters. vs. Sarcinu lutea (IIIe): log 1 / C = -0.031 (20.120)

n

=

7 ~ ‘

[Cl] + 2.954 9; r = 0.831; s = 0.163; F

+ 0.204 (k0.314) =

6 . 7 0 ; ~< 0.05

(25)

vs. H. anomala (IIIf): log 1 / C = -0.119 (k0.046) n 2

n

=

[Cl] + 2.895 9; r = 0.965; s = 0.063; F

+ 0.387 (k0.120) =

+

Journal of Medicinal Chemistry, 1976, Vol. 19, No. 5 591

Quantitatiue SA Relationships. 2

D

+

H

R

3 4

5 6 7 8 9

10

11 12 13 14 15 16 17 18

A mixed approach with an additional 1r2 term (matrix see Table VI) gave eq 33. The 8 2 term in eq 33 is sta-

R

Log Pa

2

ua

Log 1/c obsd'

H 4 -C1 2-Me 2-Me, 4-C1 3-Me 3-Me, 4 4 1 2,6-Me2 2,6-Me2, 4-C1 3,5-Me2 3,5-Mez, 4-C1 2-i-Pr 2-i-Pr, 4-C1 2-t-B~, 5-Me 2-t-B~, 4431, 5-Me 2-Cyclohexyl 2-Cyclohexyl, 4-C1 2-Phenyl 2-Phenyl, 4-C1

1.46 2.39 1.96 2.89 2.02 2.95 2.46 3.39

0.00 0.93c 0.56 1.49 0.56 1.49 1.12 2.05

0.00 0.23 -0.14 0.09 - 0.07 0.16 - 0.28 - 0.05

2.35 3.35 2.70 3.70 2.68 3.70 3.35 4.40

log 1 / C = -0.124 (kO.091)T' 1.235 [p-Cl] 0.674 [o-Me] 1.541 [o-i-Pr] 1.726 [o-t-Bu] 2.522 [o-C6Hll] + 2.263 [o-Phe]

2.58 3.51

1.12 2.05

-0.14 0.09

3.26 4.22

2.76 3.69 3.70

1.53 2.46 2.54

-0.23 0.00 -0.59

3.35 4.30 3.70

4.63

3.47

- 0.36

4.30

3.97

2.51

-0.23

4.00

4.90

3.44

0.00

4.40

3.59 4.52

1.96 2.89

0.00 0.23

4.10 4.52

tistically significant (t,z = 3.11; p < 0.05); although there is no significant difference in the correlation coefficients obtained with eq 32 and 33, the higher F value of eq 33 supports the parabolic dependence of log 1/C values on lipophilic character. A mixed approach with a different 1r value for the p-C1 substituent (T = 0.7115 instead of 1r = 0.93, see Table V) gave a comparable result ( r = 0.988; s = 0.138; F = 47.14; p < 0.001;t$ = 2.93; p < 0.05),indicating that the selection of T values is not very important in this case. Example 3 is taken from a recent paper on structure-activity correlations among rifamycin B amides and hydrazides.34 Only the substituted hydrazides will be considered in this example to demonstrate the usefulness of the mixed approach and to show some possible pitfalls in Hansch analysis of an inhomogeneousset of compounds. Quinn et al.34 derived for the antibacterial activity of substituted rifamycin B hydrazides vs. five different bacterial systems (structures and log 1/C values, see Table VII) structure-activity correlations but they stated that the large amount of collinearity between their structural parameters log P, Q*, and Esmakes it impossible to draw any firm conclusions about the real structure-activity relationships. Compounds 52-66 and 68-71 were selected for FreeWilson analysis and mixed analysis (for matrix see Table VIII); compounds 50, 51, and 67 were excluded to avoid single point determinations and compounds 72-75 were excluded because of nonexplicable deviations of log 1/C values for compounds 73 and 75 (compare ref 34). The group contributions found by FreeWilson analysis are given in the following equations (all group contrib-

Compd no.

1 2

+

+

log 1/C = 0.822 [p-Cl] 0.526 [o-Me] 1.051 [o-i-Pr] 0.757 [o-t-Bu] + 1.426 [O-C6Hll] + 1.536 [o-Phe] + 0.470 [m-Me] + 2.363 (32) n = 18;r = 0.977;s = 0.183;F = 29.93;~ < 0.001

Table V. Growth Inhibitory Activity of Substituted Phenols vs. Aspergillus niger. Structures, Structural Parameters and Log 1 / C Values

+

+

+

0.618 [m-Me] + 2.187 n = 18;r = 0.989;s = 0.134;F

From substituted benzenes, ref 15, a Reference 37. unless otherwise indicated. From the phenol system, ref 14.

statistically significant (t(iog P)* = 4.78; p < 0.001) and its numerical contribution to log 1/C values is high, the biological activity of this group of compounds can also be described by a nonparabolic relationship (eq 31). log 1/C= 0.619 (k0.146)l o g P + 1.715 (31) n = 18;r = 0.914;s = 0.275;F = 80.79;~ < 0.001 Therefore a high correlation coefficient is obtained, due to the greater number of parameters, from a nonparabolic Free-Wilson analysis (matrix see Table VI).

=

+

+

(33) 50.10;~ < 0.001

Table VI. Matrix Used for Free-Wilson and Mixed Analysis Compd no. 1

[p-C1]

2 3 4 5 6 7 8 9 10 11 12

[o-Me1

[o-i-Pr]

TT

[o-C6Hll]

[o-Phe]

[m-Me]

TZU

0.00 1 1

1 1 1 1

1

1

2 2 2 2

1 1

1 1

13

14 15 16 17 18 values of Table V

[o-t-Bu]

1 1

1 1

1

1 1 1 1

1

were used t o calculate T * .

1

+

0.86 0.31 2.22 0.31 2.22 1.25 4.20 1.25 4.20 2.34 6.05 6.45 12.04 6.30 11.83 3.84 8.35

592 Journal of Medicinal Chemistry, 1976, Vol. 19, No. 5

Kubinyi

Table VII. Structures and Antibacterial Activities of Substituted Rifamycin B Hydrazides vs. Several Bacterial Systems (All Values Taken from Ref 34) Log l/Cvalues Compdno.

R, R' H H Me Me Me Me Et Et Et Et Pr Pr Pr Pr Bu Bu Bu Pentyl Me Et Pr Bu Me Et Pr Bu a vs. M. aureus. vs. S. faecalis.

R,

50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75

VIIP

VIIbb

VIIcc

VIIdd

VIIee

Log P

U*

6.98 6.93 8.13 8.62 8.94 8.87 8.62 8.93 8.87 8.80 8.43 8.94 8.48 8.49 8.93 8.64 8.66 8.64 8.63 8.64 8.64 8.65 8.23 7.64 8.10 7.65

6.04 5.93 7.21 7.62 7.76 7.95 7.44 7.63 7.17 7.96 7.60 1.63 7.65 7.66 1.68 7.94 7.66 I .94 7.63 7.64 1.94 7.96 7.68 6.68 1.17 6.70

7.11 7.23 8.61 8.62 8.64 8.65 8.62 8.63 8.87 8.96 8.60 8.64 8.48 8.61 8.33 7.94 8.66 8.09 7.93 7.94 7.94 8.11 7.76 8.86 8.25 7.65

5.40 5.44 6.65 6.96 6.98 7.30 6.62 7.63 7.29 7.31 6.65 7.64 7.17 7.49 1.28 7.64 7.31 7.28 6.98 6.98 7.29 7.30 6.36 6.06 6.38 5.78

6.71 7.36 6.65 6.35 6.37 7.00 6.62 6.76 6.69 7.00 6.73 6.76 7.00 7.01 6.76 6.69 7.00 6.68 6.98 7.28 7.29 7.00 6.68 6.37 6.38 5.78

- 1.93 0.21 - 0.43 0.57 1.57 2.57 0.07 1.07 2.07 3.01 0.57 1.57 2.57 3.57 1.01 2.07 3.07 1.57 0.67 1.17 1.67 2.17 - 1.26 - 0.76 -0.26 0.24

1.47 1.58 0.00 -0.20 -0.23 - 0.26 -0.10 - 0.30 -0.33 - 0.36 -0.12 - 0.32 - 0.34 - 0.38 -0.13 - 0.33 - 0.36 - 0.13 - 0.22 - 0.32 -0.34 -0.35 0.18 0.08 0.06 0.05

vs. S. hemolyticus.

vs. B. subtilis. e vs. M. tuberculosis.

Table VIII. Matrix Used for Free-Wilson and Mixed Analysis (All Group Contributions Based on a~~ = 0.00) Compd no. 52 53 54b 55 56 57 58 59 60 61 62 63 64 65c 66 68 69 70 71

[Et]

[Pr]

[Bu]

2 2 2

1

2 1 1 3 1

2

2 2

2 1 1 1 1

1 1 1

1 1 1

0.18 0.32 2.46 6.60 0.00 1.14 4.28 9.42 0.32 2.46 6.60 12.74 1.14 4.28 9.42 0.45 1.37 2.79 4.71

Log P values of Table VI1 were used to calculate (log Compound not included in calculation of eq 38. Compound not included in calculation of eq 36.

P)'.

utions based on

UMe

log 1/C= 0.072 [Et] + 0.041 [Pr] + 0.094 [Bu] -

[Pip] ( L o g P ) 2 a

2 1 3 1

vs. S. hemolyticus (VIIc):

= 0.00).

vs. M . aureus (VIIa):

log 1 / C = 0.129 [Et] + 0.070 [Pr] + 0.100 [Bu] + 0.095 [Pip] 8.471 (34) n = 19; r = 0.493; s = 0.209; F = 1 . 1 2 ; ~> 0.1

+

vs. S. faecalis (VIIb):

log 1/C = 0.098 [Et] + 0.101 [Pr] + 0.210 [Bu] + 0.296 [Pip] + 7.392 (35) n = 19; r = 0.768; s = 0.140; F = 5 . 0 2 ; ~= 0.01

n

=

0.573 [Pip] + 8.501 (36) 18; r = 0.930; s = 0.132; F = 2 0 . 7 7 ; ~< 0.001

vs. B. subtilis (VIId): log 1 / C = 0.300 [Et] + 0.209 [Pr] + 0.349 [Bu] + 0.347 [Pip]

+ 6.576

n = 19; r = 0.847; s = 0.194; F

=

(37) 8 . 8 6 ; ~< 0.001

vs. M . tuberculosis (VIIe):

log 1/C = 0.007 [Et] + 0.119 [Pr] + 0.160 [Bu] + 0.472 [Pip]

+ 6.594

n = 18; r = 0.861; s = 0.139; F

=

(38) 9 . 2 7 ; ~< 0.001

A mixed analysis of log 1/C values VIIa and VIIb (for matrix see Table VIII) gave eq 39 and 40. vs. M. aureus (VIIa):

log 1 / C = -0.137 (k0.072) (log P)' + 0.255 [Et] + 0.444 [Pr] + 0.770 [Bu] + 0.362 [Pip] + 8.231 (39) n = 19; r = 0.821; s = 0.142; F = 5 . 3 7 ; ~< 0.01; t ( l o g p ) 2 = 4 . 1 5 ; ~< 0.005 vs. S. faecalis (VIIb):

log 1 / C = -0.099 (k0.042) (log P)' + 0.189 [Et] + 0.371 [Pr] + 0.694 [Bu] + 0.489 [Pip] + 7.219 (40) n = 19; r = 0.928; s = 0.084; F = 1 6 . 1 6 ; ~< 0.001; t ( l o g p ) z = 5.05; p < 0.001

Journal of Medicinal Chemistry, 1976, Vol. 19, No. 5 593

Quantitatiue S A Relationships. 2

The statistically significant (log P)2terms and the higher

F values of eq 39 and 40, compared with eq 34 and 35, indicate a parabolic dependence of log 1/C values on lipophilic character for the activity of the trisubstituted hydrazides vs. M. aureus and S. faecalis. No parabolic dependence could be detected for log 1/C values VIIc, VIId, and VIIe (activity vs. S. hemolyticus, B. subtilis, and M . tuberculosis); all (log P)2 terms were statistically not significant (p > 0.1). A matrix using different FreeWilson parameters for Ri and R2/R3 gave only somewhat higher correlation coefficients than eq 34-40 but lower F values, indicating that the group contributions of the alkyl substituents are not significantly different in position Ri and positions Rz/R3. The fact that the biological activity of the trisubstituted hydrazides vs. M. aureus cannot be described by a nonparabolic equation (eq 34; p > 0.1) was surprising because Quinn et al.34 derived eq 41 for compounds 50-72 and 74,

+

log 1/C= -0.96 (k0.16) U * 8.42 n = 24; r = 0.937; s = 0.189

(41)

No equation with a statistically significant u* term could be derived; a correlation using (log P)2, log P, and Ea (n = 19; r = 0.797; s = 0.140; F = 8.73; p < 0.005) is not presented here because it may be a chance correlation due to the collinearity of log P and Ea (see ref 34). The low correlation coefficient of eq 45 is not surprising if the small variance of log 1/C values (0.8 log units) and the usual experimental error (ca. 0.2 log units) are taken into account. Of course eq 45 and 39 are valid only for trisubstituted hydrazides. Because of the Collinearity problem34 no attempts were made to derive Hansch correlations for the biological activity of the trisubstituted hydrazides vs. S. faecalis, S. hemolyticus, B. subtilis, and M. tuberculosis (it should be noted that the use of a dummy variable A for the piperidine substituent34 is confirmed by the group contributions found with eq 36 and 38). Example 4 illustrates the application of the mixed approach in the form of eq 8 and 9.

+ ?kjQj + k‘ I log 1/C = k l n 2 + ?ai + ?kjq5j $. h’ log 1/C = ?ai I

which is a nonparabolic equation with a significantly better correlation coefficient than eq 34 (r = 0.493). A Hansch analysis of compounds 52-66 and 68-71 (set of compounds used for Free-Wilson and mixed analysis) with ut as single parameter gave eq 42. log 1/C= -0.72 (k0.93) U * + 8.49 (42) n = 19; r = 0.370; s = 0.202; F = 2 . 7 0 ; > ~ 0.1

I

I

log 1/C -0.96 (20.16) U * + 8.43 (43) n = 21; r = 0.941; s = 0.193; F = 1 4 7 . 3 2 ; < ~ 0.001 The great differences found for the correlation coefficients and the F values of eq 42 and 43 are caused only by compounds 50 and 51; that means that eq 43 only can describe the differences in the biological activities between compounds 50 and 51 on one hand and compounds 52-66 and 68-71 on the other hand. According to this a correlation can be established for compounds 50-66 and 68-71 using an indicator variable D as unique parameter (D = 1for all trisubstituted hydrazides; D = 0 for compounds 50 and 51). log 1/C = 1.72 (50.32) D + 6.96 (44) TI = 21; r = 0.933; s = 0.206; F = 1 2 7 . 0 2 ; < ~ 0.001 From eq 44 it is evident that eq 41, 42, and 43 are without any significance in predicting differences in biological activity among the trisubstituted hydrazides 52-66 and 68-71; eq 41 and 43 are chance correlations.39 This example demonstrates that care must be taken in the selection of compounds for Hansch analysis; compounds with different structures and very different activities (such as compounds 50 and 51) should be excluded if they have such a great influence on the overall result (note that Tutem came to a similar conclusion discussing a theoretical example, ref 40, pp 41-42). Including only trisubstituted hydrazides the following equation can be derived for compounds 52-66 and 68-71.

+

log 1 / C = -0.111 (50.062) ( 1 0 g P ) ~ 0.411 (20.213) log P + 8.427 (45) n = 19; r = 0.715;s = 0.157; F = 8 . 3 8 ; < ~ 0.005

(9)

Hansch and Clayton9 derived for the bactericidal activity of substituted benzylammonium compounds against S. aureus (for structures and log l / C values see Table IX; for log P values see ref 9) eq 46. log 1 / C = -0.17 (k0.03) (l0gP)’

If compounds 50 and 51 are included, eq 43 is obtained for compounds 50-66 and 68-71.

(8)

+ 0.87 (k0.14)

l o g P + 2.93

n = 45; r = 0.884; s = 0.306

(46)

In order to find a better correlation between structure and biological activity two mixed analyses were run (correspondingto eq 8 and 9) with a Free-Wilson part Cui for the aromatic substituents Xi (2-C1,3-C1,4-C1,4-N02, -0CH3 and 3,4-OCH20-), a Hansch part k 2 a ~for the aliphatic side chains R (Cs, Cio, Ci2, Ci4, Cis, Cia), and an additional term kiT2 to account for parabolic dependence of log 1/C values on lipophilic character (for matrix see Table X; compound 45 was excluded to avoid single point determinations). The aliphatic side chains are described by TR because there is a high degree of probability that the change in lipophilic character is the only physical parameter influencing the biological activity in going from a Cs side chain to longer side chains; the aromatic substituents Xi are described by FreeWilson parameters in order to check the possibility whether electronic and/or steric effects of these substituents may contribute to biological activity. A nonparabolic mixed approach (eq 8) gave no statistically significant correlation (for matrix see Table X). log 1/C= 0.150 (k0.394) T R - 0.025 [2-C1] + 0.095 [4-C1] + 0.008 [4-NOz]- 0.004 [3-C1] + 0.079 [OCHB] - 0.047 [OCHZO]

3.136

n = 44; r = 0.407; s = 0.632; F = 1 . 0 2 ; > ~ 0.1

+ (47)

This is a further confirmation for the nonvalidity of the Free-Wilson approach in the case of significant parabolic dependence of log 1/C values on lipophilic characteraJ0 (eq 47 is indeed a FreeWilson type equation because the TR term may also be interpreted as a Free-Wilson parameter of a -CH2CH2 group, based on the ca compound).

Kubinyi

594 Journal of Medicinal Chemistry, 1976, Vol. 19, No. 5

Table IX. Bactericidal Activity of Substituted Benzylammonium Compounds vs. S. aureus. Structures and Log 1/C Values9

Compd no.

X

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

H H H H H 2421 241 2-c1 2421 2-c1 2-c1 4-C1 441 4-C1 4-C1 4-Cl 4-C1 4-NOz 4-NOz 4-NOz 4-NOz 4-NOz 4-NO-

R

Log l / C o b s d 2.79 3.74 4.17 3.92 3.34 2.02 3.41 4.14 4.14 3.74 3.43 2.60 3.60 4.04 4.34 3.85 3.17 2.11 3.11 4.20 4.23 4.00 3.46

The parabolic mixed approach (eq 9) gives a statistically significant correlation (eq 48); the values found for the

+

log l / C = -0.180 (k0.030) 71’ 1.293 (k0.198) n R + 0.760 [2-C1] + 1.004 [4-C1] + 0.449 [4-N02] + 0.995 [3-C1]- 0.010 [OCH3] + 0.123 [OCHZO] + 1.709 (48) n = 44; r = 0.916; s = 0.281; F = 2 2 . 9 3 ; ~< 0.001 individual group contributions are very different from those of eq 47. The 4 term of eq 48 is statistically significant ( t ~ = z 12.14; p < 0,001). A parabolic Hansch analysis using the same a values (from the phenoxyacetic acid system,lA compare Tables X and XI) gives eq 49. log 1/C = -0.174 (k0.028) 71’ + 1.253 (k0.186) 71 + 1.836 (49) n = 44; r = 0.907; s = 0.273; F = 9 5 . 5 2 ; ~< 0.001 The identity of the correlation coefficients found with eq 48 and 49 ( r = 0.916 vs. r = 0.907) indicates that electronic or steric effects of the aromatic substituents have no influence on the biological activity. An interesting effect is observed if different a values are used for the Hansch analyses; a values from the benzene system15 (a values I1 of Table XI) give a somewhat smaller correlation (eq 50), arbitrarily chosen R values (7values I11 of Table XI) give a better correlation (eq 51) than the a values used for the calculation of eq 49. a values I1 (see Table XI)

log 1 / C = - 0 . 1 5 3 ( i 0 . 0 3 1 ) 71’ + 1.099 (20.202) 71 + 2.081 (50) n = 44; r = 0.868; s = 0.323; F = 62.33; p < 0.001

Compd no.

X

R

24 25 26 27 28 29

LogliCobsd 2.63 3.65 4.25 4.28 3.41 3.30 2.92 3.79 4.36 4.04 3.39 3.11 3.71 4.11 3.41 2.04 3.08 4.00 4.23 4.20 3.23 2.70

C8

c,, ClZ c,, ‘16

C18 C8 CIO CIZ Cl,

30

31 32 33 34 35 36 37 38 39 40 41 42 43 44 45

16

Cl8 Cl, Cl, ‘16

C8 ClO CiZ 14

c,, Cl8 CI,

a values 111 (see Table XI)

log l / C = - 0.209 ( i 0 . 0 2 3 )

+ 1.340 (k0.137)

71 + 2.011 (51) n = 44; r = 0.952; s = 0.198; F = 1 9 9 . 7 1 ; ~< 0.001

From the different correlations found with eq 49,50, and 51 one can see that the selection of T values may be critical in such cases where the influence of the a2 term is significant. In the given example the variance of the a values due to the chain length of the aliphatic residue is 5 log units; therefore the decision to use a = 0.24 or 7 = -0.28 for the p-NO2 group (see Table XI) has great influence on the a2 values of the higher members. The better fitting of the arbitrarily chosen a values (T values 111, Table XI) indicates that nonadditivity due to chain foldingl3Jr may be the reason for the smaller correlations found with eq 49 and 50 (note that the arbitrarily chosen a values are significantly smaller than the other a values). Example 5 is a further reinvestigation8 of the thyroxine-like activity of thyronine derivatives I with the mixed approach (eq 8). It is a fascinating problem to Ho +-

0 x2

:b \ /

CH2CI H c 0 0 H ”2

y2 1

establish structure-activity relationships for this group of compounds for three reasons. Firstly, a specific receptor with a definite structure and definite properties can be assumed for the site of action of a hormone and its analogues; therefore not only lipophilic character but also electronic and steric properties might have an influence on the biological activities of these compounds. Secondly, a large number of structurally similar compounds have been synthesized and tested,some

Quantitative S A Relationships. 2

Journal of Medicinal Chemistry, 1976, Vol. 19, No. 5 595

Table X. Matrix Used for the Mixed Analyses (Eq 47 and 48) Compdno.

R R ~

[2-C1]

[4-C1]

[4-N02]

[3-C1]

[OCH,]

[OCH,O]

nzb

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27

1 1.00 4.00 2 9.00 3 4 16.00 5 25.00 0 1 43.35 1 1 2.53 2 1 6.71 3 1 12.89 4 1 21.07 5 1 31.25 1 0 0.49 1 1 2.89 1 2 7.29 1 3 13.69 1 4 22.09 1 5 32.49 1 0 0.06 1 1 1.54 1 2 5.02 1 3 10.50 1 4 17.98 1 5 27.46 1 0 1 1.66 1 1 1 5.24 1 2 1 10.82 1 3 1 18.40 1 28 4 1 27.98 1 29 5 1 39.56 1 30 0 1 2.13 1 31 1 1 6.05 1 32 2 1 11.97 1 33 3 1 19.89 1 34 4 1 29.81 1 35 5 1 41.73 36 2 2 4.33 37 3 2 9.49 38 4 2 16.65 39 0 1 0.00 40 1 1 0.90 41 2 1 3.80 42 3 1 8.70 43 4 1 15.60 44 5 1 24.50 The C, compound was taken as basis compound ( n = 0.00) and 0.50 was added for each additional -CH, group. n = R X + n ~ n values ; from the phenoxyacetic acid system14 ( n values I of Table XI,see below) were used to calculate the n z values. Table XI. n Values Used for Calculation of Eq 48-51 (in All Cases ?r = 0.50 was Used for the Aliphatic -CH, Group) ?r

Substituent 2-c1 341 4-C1 4-NO, 3,4-(OCH,), 3.4-OCH-O-

values

Ia

IIb

0.59 0.76 0.70 0.24 0.08 -0.05b

0.71 0.71 0.71 -0.28 -0.04 - 0.05

IIIC 0.2 0.4 0.4 0.0 0.0 0.0

a n values from the phenoxyacetic acid system14 unless n values otherwise indicated (used for eq 48 and 49). from the benzene systemlS (used for eq 50). Arbitrarily chosen n values to get a high correlation between observed and calculated log 1/C values (used for eq 51).

of them being as active as the natural hormones, so that enough data are available for the derivation of quantitative structure-activity correlations. Thirdly, many qualitative structure-activity relationships have been discussed for this group of compounds41-4s and also some attempts were made to establish quantitative structure-activity correlationsll*~47-49but until today no quantitative correlation has been given which is valid for halogen- and alkyl-substituted thyronines.

In the preceding papers a Hansch equation with an additional dummy variable D was derived for 13 halogen-substituted thyronine derivatives (eq 52, for logA = 2.54 (k0.71) n, + 2.91 (k0.83) E S z l0.35 (k0.30) D - 3.72 (52) n = 13; r = 0.942; s = 0.408; F = 2 3 . 4 5 ; ~< 0.001 structures and structural parameters see ref 8). The original log A values used by Bruice et a1.47 and Hansch and Fujital were taken to illustrate the usefulness of the modified Free-Wilson approach to derive significant Hansch equations. No parabolic dependence of log A on lipophilic character could be found for this group of compounds. Since the paper of Bruice et al.47 and Hansch and Fujita's reinvestigation1 of these data, a number of additional thyronine derivatives have been synthesized and tested;42143146150-52 therefore the attempt was made to establish quantitative structure-activity correlations for the complete set of compounds. Structures and log A values are given in Table XI1 (note that some log A values are different from those of ref 8 because newer values43 are used); the parameters used for Hansch analysis and

596 Journal of Medicinal Chemistry, 1976, Vol. 19, No. 5

Kubinyi

Table XII. Thyroxine-Like Activity of L-Thyronine Derivatives (Structure I) in the Rat (L-Thyroxine = loo%, assuming L = 2 X DL43) Compd no. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28

X," H H H H H H H H H H H H F c1 Br I F H H H Br I H H H H

x2a

y 2

OH H

Br I I Br I

I I I

CH(CH3)2

Br 1

I I

I I I I I I I I I I I I I I I I I Br Br Br Br Br I

Mean biol act. in the rat 1.5 10 8

21 138 545 126 506 868 120 60 32 1.a 0.9 10

1OOb 28 44 130 450 0.51 12.5 40 5 3 18 2 0.1

Log A 0.18 1.00 0.90 1.43 2.14 2.14 2.10 2.10 2.94 2.08 1.78 1.51

0.26 -0.05 1.oo

2.00 1.45 1.64 2.11 2.65 -0.29 1.10

1.60 0.70 0.48 1.26 0.30 - 1.00

Ref 43 43 43 43 43 43 43 43 43 43 43 43 41 47 47 b

47 47 47 50 47 41 52 46 46 46 46 41

In case of different substituents XI and X2 the smaller substituent was taken as X, and the larger substituent was taken as X2. Reference compound. Table XIII. Parameters Used for Hansch Analysis and Mixed Analysis of Thyronine Derivatives (Structures see Table XII)a b Compd no. ?fX OXC EGld EsXzd E,X 'One E,' f [Ilg t CH3 1g - 0.37 0.69 1 - 0.67 1.24 0 1.24 2 0 0.00 1.24 1.24 0 1.24 2 0 2 0.00 3 0.14 0.06 1.24 0.78 0 1.24 2 0 4 0.71 0.23 1.24 0.27 0 1.24 2 0 5 0.86 0.23 1.24 0.08 0 1.24 2 0 6 1.12 0.18 -0.16 0 1.24 1.24 2 0 7 0.56 - 0.17 1.24 o.oq 0 1.24 2 0 8 1.02 -0.15 - 0.07: 0 1.24 1.24 2 0 9 1.53 -0.15 1.24 - 0.47' -0.31 0.93 2 0 -0.14 2 0 10 1.98 - 0.20 - 1-54'. -1.38 1.24 11 2.00h 0.47 2 0 -0.12' 1.24 - 0.93' --0.77 12 1.96 - 0.01 1.24 - 2.58 - 2.42 -- 1.18 2 0 13 0.28 0.12 0.78 2 0 0.78 0 0.78 14 1.42 0.46 0.27 2 0 0.27 0.27 0 15 1.72 0.08 2 0 0.46 0.08 0.08 0 16 2.24 0.36 - 0.16 2 0 - 0.16 -0.16 0 17 1.26 0.24 0.78 0.78 2 0 - 0.16 0 18 0.86 0.23 1.24 0 1.24 0 0 0.08 19 1.12 0.18 1.24 0 1.24 0 0 - 0.16 20 1.53 - 0.15 0 0 1.24 - 0.47j -0.31 0.93 21 1.72 0.46 0.08 0 0 0.08 0.08 0 22 2.24 0.36 -0.16 0 0 - 0.16 - 0.16 0 23 1.12 1 1 0.18 1.24 0 1.24 - 0.16 24 1.12 0.18 1.24 -0.16 0 1.24 0 2 25 0.56 -0.17 1.24 0.0ol 0 1.24 0 2 26 1.53 - 0.15 - 0.47: 1.24 --0.31 0.93 0 2 27 1.12 - 0.34 0,ooj 0 0.00 0 2 0.00' a Compound 28 was excluded to avoid single point determinations in the mixed analyses. Reference 15. u p values, ref 15. Reference 38 unless otherwise indicated. e See eq 54. f Sum of.E,X, and Es~,'oP, see eq 57. g Free-Wilson parameters, group contributions based on ugr = 0.00. Estimated value. Reference 53. Reference 54.

mixed analysis are given in Table XIII. For the choice of substituent parameters some assumptions and approximations have to be made; since Tortho values are known only for a limited number of substituents, T values from the benzene system15 were used for the calculations (use of the Tortho values from the phenol system14 for substituents F, C1, Br and I gave very

similar results in all correlations); as u values, upara values were used for the ortho substituents (note that up = 0.18 was used for 115755). Hansch analysis of compounds 1-8 (only trisubstituted compounds; compounds 1-9 or 1-10 or 1-12 gave no equations with statistically significant ux terms) gives eq 53. The correlation of observed and calculated log A

Journal of Medicinal Chemistry, 1976, Vol. 19, No. 5 597

Quantitatiue SA Relationships. 2 Table XIV. Comparison of Observed and Calculated Log A Values Compd no.

LogA obsd

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27

0.18 1 .oo 0.90 1.43 2.14 2.74 2.10 2.70 2.94 2.08 1.78 1.51 0.26 -0.05 1.00 2.00 1.45 1.64 2.11 2.65 -0.29 1.10 1.60 0.70 0.48 1.26 0.30

Log A , calculated from Eq 53

A

Eq 55

A

Eq 59

A

Eq 61

A

0.19 0.83 0.99 1.75 2.01 2.54 2.05 2.83 3.72a

0.01 -0.17 0.09 0.32 -0.13 -0.20 -0.05 0.13 0.78

0.19 0.86 1.02 1.76 2.01 2.51 2.00 2.75 3.07 2.05 3.03b

0.01 -0.14 0.12 0.33 -0.13 -0.23 -0.10 0.05 0.13 -0.03 1.25 -1.51

0.18 0.80 0.96 1.69 1.94 2.46 1.99 2.75 3.09 2.10

0.00 -0.20 0.06 0.26 -0.20 -0.28 -0.11 0.05 0.15 0.02

0.39 0.86 0.98 1.61 1.84 2.33 2.00 2.69 3.03 2.22

0.21 -0.14 0.08 0.18 - 0.30 - 0.41 - 0.10 - 0.01 0.09 0.14

O.OOb

0.32 0.92c 1.11 1.72 1.83

a Not included in calculation of eq 53. Not included in calculation of eq 61.

log A = 1.746 (k0.421) 0.834

71,

Not included in calculation of eq 55.

- 1.410 (k1.176)

U,

+ (53)

n = 8; r = 0.982; s = 0.207; F = 6 6 . 4 2 ; ~< 0.001

values is very good (see Table XIV); eq 53 is the quantitative expression of the statements made by Jorgensen43 that biological activity increases with lipophilic character and that for equal lipophilic character the electron-releasing alkyl substituents lead to more activity than the electron-attracting halogen substituents (the anomalous position of the fluorine is caused by its smaller Q value). If eq 53 is used to calculate the log A value of compound 9, the value obtained is much too high (log A calcd = 3.72; see Table XIV); for compounds 10-12 even higher log A values are calculated from eq 53. Therefore, an additional term has to be included in eq 53 to account for the lower activity of compounds with large, lipophilic Xz substituents. Two different factors may be responsible for this lower activity: a parabolic dependence of log A values on lipophilic character and/or a steric hindrance for substituents larger than iodine. Although the use of an additional ~2 term gave significant correlations for compounds 1-12 (n = 12; r = 0.914; s = 0.394; F = 13.61; p < 0.005) as well as for compounds 1-17 ( n = 17; r = 0.753; s = 0.658; F = 5.69; p < 0.05), a ~2 term was without any significance when compounds 18-27 were included in the calculations (compare ref 8). The use of Es,x:!as an additional parameter gave no correlations with significant Es,x2terms; this is not surprising because eq 53 indicated no dependence of biological activity on the size of the Xz substituent for groups similar in size or smaller than iodine. Only substituents larger than iodine cause lower than predicted activities. Therefore the hypothesis was made that the thyroxine receptor has a “pocket” for the X2 substituent, great enough for the iodine atom, which is the Xz substituent of the natural hormone, and for substituents with equal

d d

C

c

0.06 0.97 0.11 -0.28 0.38

0.42 0.91d 1.09 1.71 1.77 1.49 1.98 2.68 0.74d 1.36 1.59 0.85 0.52 1.55 -.0.18

0.16 0.96 0.09 - 0.29 0.32 -0.15 -0.13 0.03 1.03 0.26 -0.01 0.15 0.04 0.29 - 0.48

Not included in calculation of eq 59.

or smaller size. To get a quantitative measure for this hypothesis that only Substituents larger than iodine cause steric hindrance, modified Es values were defined.

(54) (Only negative values of E s , x 2 c oare ~ considered; positive values indicate that there is no steric hindrance; therefore all positive values were put to zero.) With these modified Es values (accounting only for the steric hindrance of X2 substituents larger than iodine) eq 55 is obtained for compounds 1-10.

+

logA = 1.673 (k0.324) n, - 1.242 (k0.969) U, 1.714 (k0.600) E S X 2C o r n 0.856 (55) n = 10; r= 0.984; s = 0.201; F = 5 9 . 5 3 ; ~< 0.001

+

Again the correlation of observed and calculated log A values is very good (see Table XIV). Greater deviations are obtained if log A values for compounds 11 and 12 are calculated from eq 55, but these deviations may be caused by the uncertainties of Es values for the isobutyl group (compound 11) and the phenyl group (compound 12). The isobutyl group: if a “pocket” for the X2 substituent is assumed, the steric hindrance for an isobutyl group may be of the same order or even greater than for a tert-butyl group; for normal chemical reactions the steric hindrance of an isobutyl group is smaller than that of a tert-butyl group due to the greater flexibility of the isobutyl group. The phenyl group: Kutter and Hanschsa calculated two different Es values for the phenyl group: Es = -2.58 (function coplanar to reaction center) and EB = 0.23 (function perpendicular to reaction center). Although the actual ES value will be nearer the value of -2.58 there remains a high degree of uncertainty about the real ES value as long as the structure of the receptor is unknown. For these reasons compounds 11 and 12 were excluded from the following calculations (it must be noted that

598 Journal of Medicinal Chemistry, 1976,Vol. 19, No. 5

Greenberg et al.42 discussed the relationships between the bulk of the X2 substituent and the biological activity for several alkyl-substituted thyronines). Tetrasubstituted thyronines (e.g., compounds 13-17,21, 22) are less active than the corresponding trisubstituted thyronines. While Hansch and Fujital tried to correlate this fact with a parabolic dependence of log A on lipophilic character, Jorgensen41343 assumed a steric hindrance for the second X substituent (Xiin structures I and 11) and gave an excellent evidence for this assumption; he investigated 2'-methylthyronines (structure 111, which are fixed in the presented conformation due to the steric

Kubinyi Table XV. Dependence of Biological Activity on Substituents Y

Y , , Y,

I Br C*,

c1 a

28.

From eq 61.

Group contri- Re1 biol act. in % butions, based (based on Y, = on u g r = 0.OW Y, = I, 100%)

0.176 0.00 - 0.563 ca. - 1 . 0 ~

100 44 3 0.4

* Estimated from compounds 22 and

12 were excluded for the reasons discussed above; compound 28 was excluded to avoid a single point determination.

log A = 1.699 (k0.336) 71, - 2.059 (k0.702) uX + ( t = 10.59) ( t = 6.14) 1.713 (k0.388) E,' + 0.234 (k0.201) [I] ( t = 9.25) ( t = 2.44) 0.532 (k0.261) [CH,] -1.792 (60) ( t = 4.26) n = 25; r = 0.943; s = 0.349; F = 3 0 . 3 7 ; ~< 0.001

I1

interactions between the 2'-methyl group and the aromatic ring bearing the Y substituents. While a 2',3'-dimethyl3,5-diiodo-~~-thyronine (Xi = H X2 = CH3) with a distally oriented 3'-methyl group shows 50% the activity of Lthyroxine, the isomeric 2',5'-dimethyl-3,5-diiodo-~~- If compounds 14 and 21 are excluded, eq 61 with similar thyronine (Xi = CH3; X2 = H) with a proximally oriented logA = 1.569 (k0.251) n, - 1.582 (20.555) uX + 5'-methyl group shows only 1% the activity of L-thyroxine. ( t = 6.02) ( t = 13.21) This steric hindrance for the second X substituent is completely confirmed by eq 52 (see ref 8). 1.493 (k0.299) E,' + 0.176 (20.159) [I] Therefore i r x , ox, and Es,xlwere used to derive eq 56 for ( t = 2.34) ( t = 10.54) 0.563 (20.195) [CH,] - 1.348 (61) logA = 1.908 (k0.517) n , - 2.151 (k1.517) U, + ( t = 6.10) 1.871 (k0.700) E , S I - 1.598 (56) n = 23; r = 0.965; s = 0.250; F = 4 6 . 5 3 ; ~< 0.001 n = 13; r = 0.946; s = 0.347; F = 2 5 . 5 9 ; ~< 0.001 compounds 1-8 and 13-17. Because the coefficients of the Es terms are nearly identical in eq 55 and 56 both E, values were combined according to eq 57 (the numerical equiv-

E,'

= E,,

,

+ E,S2Corr

(57)

alence of the coefficients may be taken as evidence for the biological significance of the E s , x P ~term ~ in eq 55). E,' can be interpreted as a quantitative measure for the steric hindrance of Xi substituents and of X2 substituents larger than iodine. Using x x , ox, and Es' eq 58 was derived for

+

log A = 1.787 (k0.388) n , - 1.787 (50.863) U, 1.858 (k0.552) E,' - 1.551 (58) n = 15; r = 0.953; s = 0.329; F = 3 6 . 1 9 ; ~< 0,001 compounds 1-10 and 13-17. If compound 14 is excluded, a similar equation with a somewhat higher correlation coefficient is obtained for compounds 1-10,13, and 15-17. log A = 1.704 (k0.271) n , - 1.398 (k0.636) U, + 1.709 (20.389) E,' - 1.319 (59) n = 14; r = 0.975; s = 0.223; F = 6 5 . 5 5 ; ~< 0.001 A comparison of observed log A values and log A values calculated from eq 59 is given in Table XIV. To establish quantitative structureactivity correlations for all compounds of Table XI1 the mixed approach (eq 8) with Hansch parameters m,ox, and Es' for substituents Xi and X2 and Free-Wilson parameters [I] and [CH3] (group contributions of I and CH3, based on aBr = 0.00) for substituents Yi and Y2 was used; compounds 11 and

coefficients but a somewhat higher correlation coefficient is obtained for compounds 1-10, 13, 15-20, and 22-27. All parameters of eq 60 and 61 are statistically significant (the t values are given in parentheses under each term); a certain degree of collinearity exists between x x and Es' (r2 = 0.471) but no way could be found to circumvent this problem. The correlation between observed and calculated log A values (see Table XIV) is good if one considers that most biological activities are mean values from different biological test models and different investigators, s u m m e d up over a period of more than 30 years of research in this field. From eq 60 and 61 the following conclusions can be drawn. (a) Biological activity increases with lipophilic character; no parabolic dependence of log A on lipophilic character can be seen. (b) Electron-releasing properties of a substituent increase the biological activity, electron-attracting properties decrease the biological activity. (c) No steric hindrance can be seen for 3'substituents (X2 substituents) as far as bromine, iodine, methyl, ethyl, or smaller substituents are concerned; 3'substituents larger than iodine and all 5'-substituents (Xi substituents) lead to a decrease of biological activity due to steric hindrance; this corresponds to a high specificity of the thyroid hormone receptor for the natural hormone 3',3,5triiodothyronine. Only for X2 = isopropyl the greater +T and -(r effects surpass the deactivating effect due to steric hindrance; therefore the 3'4sopropyl compound is even more active than the natural hormone; for X2 = isobutyl, tert-butyl, or larger substituents the steric hindrance surpasses all other effects (d) Compounds with Y i = Yz = I are the most active analogues; activity decreases in the order I > Br > CH3 > C1 (see Table XV).

Quantitatiue S A Relationships. 2

Compounds with Y = H, F, i-Pr, s-Bu, CN, COOH, NO2, “2, SCzHs, or SCsH5 are inactive;431% that means that large lipophilic substituents like bromine or iodine lead to biologically active compounds, while smaller substituents or substituents larger than iodine or polar substituents cause a significant reduction or loss of activity. While the importance of bulky substituents like bromine or iodine for a favorable conformation is well understood from MO calculations,4f~~57 the most reasonable explanation for the inactivity of compounds with Y = i-Pr or s-Bu seems to be a steric hindrance which prevents binding to a receptor molecule.45 This specific structure-activity relationship would be best described by a + . ~ ~ / - E , , y l +E,,y2 correlation but not enough appropriate data are available to derive such a correlation. Discussion The two most frequently used models in quantitative structure-activity relationships, Hansch’s linear multiple regression model and Free-Wilson’s additive model, have been combined in eq 7,8, and 9 to a mixed approach, based on the equivalences of both models. The preceding examples have been discussed in detail to demonstrate the use and the advantages of the mixed approach (a) to check the fitting of a particular Hansch equation, (b) to derive significant Hansch equations, (c) to detect parabolic dependence of log 1/C values on lipophilic character also in Free-Wilson analysis, (d) to apply Free-Wilson analysis in such cases, (e) to describe structure-activity relationships for compounds with great structural variation in one position and little structural variation in another position of the molecule with the smallest number of variables possible, and (f) to describe structure-activity relationships in cases where strong collinearity of structural parameters renders the derivation of meaningful Hansch correlations impossible. In appropriate cases the mixed approach can be used as a tool to derive statistically significant and biologically meaningful correlations; some care should be taken in the right selection of Free-Wilson parameters in order not to generate meaningless mixtures of structural parameters and dummy variables (e.g., for different substituents in only one position of a molecule); a minimum number of compounds per variable should be present to avoid chance correlations.39 In every case the final equation should be checked with the criteria given by Unger and Hansch58 (no t test or sequential F test should be applied to the Free-Wilson parameters of a mixed equation because also a group contribution of zero is meaningful despite its t value). If these points are taken into account, the mixed approach widens the scope and diminishes the limitations of the linear free energy related Hansch model and the mathematical Free-Wilson model; the mixed approach “incorporates the best aspects of the two quantitative approaches” as Hansch22 has said in his definition of an equation including structural parameters and FreeWilson-like dummy variables. Acknowledgment. I am grateful to 0. H. Kehrhahn for helpful discussions and for computational assistance. References and Notes (1) C. Hansch and T. Fujita, J. Am. Chem. Soc., 86,1616 (1964). (2) C. Hansch, Acc. Chem. Res., 2, 232 (1969). (3) C. Hansch in “Drug Design”, Vol. I, E. J. Ariens, Ed., Academic Press, New York, N.Y., 1971, p 271. (4) A. Cammarata and S. J. Yau, J. Med. Chem., 13,93 (1970). (5) T. Fujita and T. Ban, J . Med. Chem., 14, 148 (1971). (6) S. M. Free, Jr., and J. W. Wilson, J . Med. Chem., 7, 395 (1964).

Journal of Medicinal Chemistry, 1976, Vol. 19, No. 5 599 (7) A. Cammarata, J. Med. Chem., 15, 573 (1972). (8) Paper 1: H. Kubinyi and 0. H. Kehrhahn, J. Med. Chem., preceding paper in this issue. (9) C. Hansch and J. M. Clayton, J. Pharm. Sci., 6 2 , l (1973). (10) J. A. Singer and W. P. Purcell, J . Med. Chem., 10, loo0 (1967) (11) K. Bocek, J. Kopeck?, M. Krivucovi, and D. Vlachova, Exnerientia. 20. 667 (1964). (12) J. Kopeckjr, K. Bocek,‘and D. Vlachova, Nature (London), 207, 981 (1965). (13) A. &a, C.’H&h, and D. Ellrins,Chem. Reu., 71,525 (1971). (14) T. Fujita, J. Isawa, and C. Hansch, J. Am. Chem. SOC.,86, 5175 (1964). (15) C. Hansch, A. Leo, S. H. Unger, K. H, Kim, D. Nikaitani, and E. J. Lien, J. Med. Chem., 16, 1207 (1973). (16) C. Hansch, A. Leo, and D. Nikaitani, J. Org. Chem., 37,3090 (1972). (17) A. Canas-Rodriguez and M. S. Tute, Adu. Chem. Ser., No. 114, 41 (1972). (18) P. N. Craig, Adu. Chem. Ser., No. 114, 115 (1972). (19) P. N. Craig and C. Hansch, J. Med. Chem., 16,661 (1973). (20) T. Ban and T. Fujita, J. Med. Chem., 12, 353 (1969). (21) P. N. Craig, J. Med. Chem., 15, 144 (1972). (22) C. Hansch and M. Yoshimoto, J . Med. Chem., 17, 1160 (1974). (23) C. Hansch and E. J. Lien, Biochem. Pharrnacol., 17, 709 (1968). (24) Y. C. Martin, J. Med. Chem., 13, 145 (1970). (25) C. Hansch, J. Org. Chem., 35, 620 (1970). (26) J. W. Mc Farland, Prog. Drug Res., 15, 123 (1971). (27) Y. C. Martin and K. R. Lynn, J . Med. Chem., 14, 1162 (1971). (28) C. Hansch and W. R. Glave, J. Med. Chem., 15,112 (1972). (29) Y. C. Martin, P. H. Jones, T. J. Perun, W. E. Grundy, S. Bell, R. R. Bower, and N. L. Shipkowitz, J . Med. Chem., 15, 635 (1972). (30) T. Fujita, J. Med. Chem., 16, 923 (1973). (31) Y. C. Martin, J. B. Holland, C. H. Jarboe, and N. Plotnikoff, J . Med. Chem., 17,409 (1974). (32) C. Silipo and C. Hansch, Mol. Pharrnacol., 10,954 (1974). (33) J. K. Seydel, H. Ahrens, and W. Losert, J. Med. Chem., 18, 234 (1975). (34) F. R. Quinn, J. S. Driscoll, and C. Hansch, J. Med. Chem., 18, 332 (1975). (35) R. Franke and P. Oehme, Pharmazie, 28, 489 (1973). (36) E. J. Lien, C. Hansch, and S. M. Anderson, J. Med. Chem., 11, 430 (1968). (37) C. Hansch and E. J. Lien, J . Med. Chem., 14, 653 (1971). (38) E. Kutter and C. Hansch, J . Med. Chem., 12,647 (1969). (39) J. G. Topliss and R. J. Costello, J . Med. Chem., 15, 1066 (1972). (40) M. S. Tute, Adu. Drug. Res., 6, 1 (1971). (41) E. C. Jorgensen, P. A. Lehman, C. Greenberg, and N. Zenker, J . Biol. Chem., 237, 3832 (1962). (42) C. M. Greenberg, B. Blank, F. R. Pfeiffer, and J. F. Pads, Am. J. Physiol., 205, 821 (1963). (43) E. C. Jorgensen in “Medicinal Chemistry”, Vol. II,A. Burger, Ed., Wiley-Interscience, New York, N.Y., 1970, p 838. (44) P. A. Lehmann, J . Med. Chem., 15,404 (1972). (45) P. A. Kollman, W. J. Murray, M. E. Nuss, E. C. Jorgensen, and S. Rothenberg, J . Am. Chem. SOC.,95, 8518 (1973). (46) E. C. Jorgensen, W. J. Murray, and P. Block, Jr., J . Med. Chem., 17, 434 (1974). (47) T. C. Bruice, N. Kharasch, and R. J. Winzler, Arch. Biochem. Biophys., 62, 305 (1956). (48) G. Cilento and M. Berenholc, Biochem. Biophys. Acta, 94, 271 (1965). (49) C. Hansch, A. R. Steward, J. Isawa, and E. W. Deutsch, Mol. Pharrnacol., 1, 205 (1965). (50) E. C. Jorgensen and J. R. Nulu, J. Pharm. Sci., 58, 1139 (1969). (51) E. C. Jorgensen and P. Block, Jr., J . Med. Chem., 16,306 (1973). (52) E. C. Jorgensen and R. A. Wiley, J . Med. Pharm. Chem., 5, 1307 (1962).

600 Journal of Medicinal Chemistry, 1976, Vol. 19, No. 5

(53) H. H. Jaffi, Chem. Rev., 53, 191 (1953). (54) R. W. Taft, Jr., in “Steric Effects in Organic Chemistry”, M. S. Newman, Ed., Wiley, New York, N.Y., 1956, p 556. (55) D. H. McDaniel and H. C. Brown, J. Org. Chem., 23,420 (1958).

Johnson

(56) E. C. Jorgensen, R. 0. Muhlhauser, and R. A. Wiley, J. Med. Chem., 12, 689 (1969). (57) L. B. Kier and J. R. Hoyland, J. Med. Chem., 13, 1182 (1970). (58) S. H. Unger and C. Hansch, J . Med. Chem., 16,745 (1973).

Quantitative Structure-Activity Studies on Monoamine Oxidase Inhibitors Carl Lynn Johnson Department of Pharmacology, Mount Sinai School of Medicine, City University of New York, New York, New York 10029. Receiued June 5, 1975 Quantitative structure-activity studies were carried out on a series of N-isopropylaryl hydrazides which inhibits monoamine oxidase (MAO). The inhibitory potencies of these compounds on MA0 were found to correlate with the electron-withdrawing capacity of the aryl ring substituents as estimated by both empirical Hammett u constants and electronic indices from molecular orbital calculations. Based on these correlations and previously published data on other classes of MA0 inhibitors, a general model for the inhibitor pharmacophore is proposed: potent MA0 inhibitors should contain an electron-rich functional group in the plane of and approximately 5.3 A from the center of an aromatic ring; electron-withdrawing groups on the aromatic ring or replacing the phenyl ring with certain types of heterocyclic rings will tend to increase the potency.

The importance of monoamine oxidase (MAO) in controlling the levels of neurotransmitters has prompted interest in learning the mechanism of substrate oxidation and the nature of the active site as well as in understanding the molecular mechanisms of MA0 inhibition by a wide variety of chemical compounds, all of which exhibit a pharmacological profile including potent antidepressant activity. Some of the known MA0 inhibitorslv2 include arylalkylhydrazines,aryl hydrazides, arylpropargylamines, arylcyclopropylamines, aryloxycyclopropylamines, Ncyclopropylaryloxyethylamines, 0-carbolines, and amethylated arylalkylamines. A common structural feature of both substrates and these various classes of inhibitors is an amino or imino group, which is assumed to play an essential role in complex formation at the active site. The aryl portion of the MA0 inhibitors is not an absolute requirement for inhibition of the enzyme, but substituents on the aryl ring or modification of the type of ring markedly affect potency. One approach to understanding the role of the aryl portion of these compounds in MA0 inhibition and to gain insight into the relationships between chemical structure and pharmacological activity is to carry out quantitative structure-activity relationships (QSAR)3 on a series of related molecules. This approach requires, in addition to the estimate of the biological activity of the molecules, a set of parameters-either empirically or theoretically derived-which describes in numerical terms the structural characteristics of the compounds. Several QSAR studies have dealt with both substrates4 and inhibitors57 of MAO. The study of Fujita7 is especially extensive, comparing the QSAR results for many classes of inhibitors. One group of MA0 inhibitors not previously considered in a QSAR study is the aryl hydrazides. In the present communication we report on the QSAR for a series of hydrazides and compare the results with those for other classes of MA0 inhibitors in an attempt to arrive at a general picture of the inhibitory pharmacophore. A preliminary report of some of these results has appeared.8 Methods. Molecular orbital (MO) calculations were carried out by the CND0/2 method9 with the CNINDO computer program.10 Coordinates were generated with the MBLD programll from crystallographic data and/or standard bond angles and bond lengths.12 Calculations were performed on a CDC 6600 computer. Multiple re-

gression analysis was carried out with a stepwise regression program adapted for the PROPHET13 time-sharing system. Hammett u constants were taken from McDaniel and Brown14 and odanol-H20 partition coefficients were taken from the compilations of Hansch and co-workers.15J6 Results and Discussion Empirical Structure-Activity Relationships. The in vivo potencies of a series of 23 aryl hydrazides of the general structure ArCONHNHCH(CH3)2 have been published17 and are listed in Table I. The potencies are reported as the Marsilid Index (M.I.) taking iproniazid (Cpyridyl derivative) as the standard (M.I. = 100). The Marsilid Index, designed to measure the relative degree of inhibition of brain MAO, is defined as the ratio of the increase in serotonin in rat brain produced by a substance (in amount equimolar to 100 mg of iproniazid/kg) to the increase in serotonin produced by 100 mg of iproniazid/kg. The electronic substituent constant, u, and the hydrophobic or lipid solubility constant, log P, are listed in Table I. Since Hammett u constants are available only for the phenyl derivatives, the empirical QSAR studies were restricted to these ten compounds. Multiple regression analysis led to eq 1-7. The numbers in parentheses in the regression equations represent the standard errors of the regression coefficients; n is the number of compounds; r is the multiple correlation coefficient; s is the standard deviation of the regression; and F is the F test for the significance of the regression. A significant correlation (p < 0.01) of the Marsilid Index and u values was obtained (eq 1). The correlation with log P was much poorer (eq M.I. = 88.25 (k23.09) u + 83.37 n = 10; r = 0.804; s = 31.4; F = 14.6 2).

Including a log P term with

u

(1)

did not improve the

M.I. = 32.80 (t13.74) log P + 0.64 rz = 10; r = 0.646; s = 30.3; F = 5.7

(2)

correlation (eq 3). The major deviations from the above M.I. = 70.51 (524.41) u + 17.13 (511.30) l o g P + 44.33 n = 10; r = 0.856; s = 29.1; F = 9.6

(3)