A Nonlinear Map of Substituent Constants for Selecting Test Series

Apr 1, 1994 - A Nonlinear Map of Substituent Constants for Selecting Test Series and Deriving Structure-Activity Relationships. II. Aliphatic Series...
J. Med. Chem. 1994, 37, 981-987


A Nonlinear Map of Substituent Constants for Selecting Test Series and Deriving Structure-Activity Relationships. 2. Aliphatic Series

Daniel Domine,†,‡ James Devillers,*,† and Maurice Chastrette‡

CTIS, 21 rue de la Bannière, 69003 Lyon, and Laboratoire de Chimie Organique Physique, U.R.A. CNRS 463, Université Lyon-I, 43 Bd du 11 Novembre 1918, 69622 Villeurbanne CEDEX, France

Received August 23, 1993

A nonlinear mapping (NLM) analysis was performed on a set of 103 aliphatic substituents described by five variables encoding hydrophobic (Fr), steric (MR), and electronic effects (HBA, HBD, and F). NLM made it possible to summarize easily the main information contained in the original data table. By means of collections of graphs, it was possible to relate the structure of the aliphatic substituents to their Fr, MR, HBA, HBD, and F values. The proposed approach provides a useful and easy tool for the selection of test series and for deriving structure-activity relationships.

* Author to whom all correspondence should be addressed.
† CTIS.
‡ Université Lyon-I.
Abstract published in Advance ACS Abstracts, March 1, 1994.

Introduction

Quantitative structure-activity relationship (QSAR) studies are based on the modification of the structure of known series of active chemicals and the modeling of the effects of these changes on their biological activities.1,2 It is well known that the physicochemical properties of molecules are the basic parameters to be taken into account in these kinds of studies, and as a result, numerous data compilations of substituent constants have been elaborated for use in QSAR studies.2-5 Among them, the most widely used are the π contribution of Hansch, which depicts the lipophilic character of the substituents; the Hammett σ constants, which are used to account for electronic processes; the Swain and Lupton F and R parameters, derived from the σ constants and separating the inductive and resonance effects of the substituents; and the molar refractivity (MR), used to describe the steric bulk of substituents.6-11

In drug design, it is essential to select test series with high information content in order to reduce research costs by maximizing the information obtained from each molecular probe in a set of congeners. Much work has been directed toward this aim, and numerous methods have been proposed.12-28 Significant advances dealt with the use of a multivariate method14 and later of a graphical representation of the data allowing selection of test series by simple visual inspection of a 2-D map summarizing the information content of a matrix of physicochemical properties.20-22,28 For a comprehensive review, one should refer to the paper of Pleiss and Unger.29 Even if these methods have been successfully used, it must be pointed out that, from a practical point of view, none of these approaches is completely satisfactory.30 In this context, we recently proposed the use of an original graphical approach based on the nonlinear mapping (NLM) method.30 We showed that it was possible to obtain an easily interpretable nonlinear map of aromatic substituent constants for the selection of test series and the derivation of structure-activity relationships (SAR). The good results obtained prompted us to perform the same kind of analysis on a set of aliphatic substituents.

Figure 1. NLM algorithm flow diagram. Box labels recoverable from the diagram: individuals/variables; initialization of coordinates and distance matrix (dimension d); coordinates modification; nonlinear map (dimension d).

Nonlinear Mapping

The nonlinear mapping (NLM) method was designed by Sammon31 and introduced in chemistry by Kowalski and Bender.32,33 It is based on a concept similar to multidimensional scaling (MDS) and is aimed at representing a set of points defined in an n-dimensional space by a human-perceivable configuration of the data in a lower d-dimensional space (d = 2 or 3). NLM tries to keep the distances between points in the display space as similar as possible to the actual distances in the original space. The procedure for performing this transformation is summarized in Figure 1. Briefly, it consists in calculating a mapping error (E) between the distances in the original space and the distances in the display space (i.e., the nonlinear map). This error is used to modify the coordinates of points in the display space. This process is carried out iteratively by means of a minimization algorithm called the "steepest descent procedure" until termination conditions are satisfied. The most widely used termination condition is a sufficiently low difference between the error calculated at step n and step n-1 in the iteration process.
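The iterative procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the mapping error E is Sammon's stress in its usual form (squared differences between original and mapped distances, each divided by the original distance and normalized by the sum of original distances), that distances are Euclidean, and that the steepest descent step is adapted by a simple grow-or-halve rule. The function names, the random initialization, and the default values of `step`, `tol`, and `max_iter` are all hypothetical choices made for illustration.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sammon_error(D_orig, D_map, eps=1e-12):
    """Mapping error E, assumed here to be Sammon's stress:
    E = [sum_{i<j} (d*_ij - d_ij)^2 / d*_ij] / [sum_{i<j} d*_ij]."""
    iu = np.triu_indices_from(D_orig, k=1)
    d_star = np.maximum(D_orig[iu], eps)  # guard against duplicate points
    d = D_map[iu]
    return float(((d_star - d) ** 2 / d_star).sum() / d_star.sum())


def nonlinear_map(X, d=2, step=1.0, tol=1e-9, max_iter=5000, seed=0):
    """Map the rows of X (n-dimensional) to d dimensions by steepest descent on E.

    Returns the mapped coordinates and the list of accepted error values.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    D_orig = np.maximum(pairwise_distances(X), 1e-12)
    c = D_orig[np.triu_indices(n, k=1)].sum()

    # Initialization of coordinates (random here; this is an assumption).
    Y = rng.standard_normal((n, d))
    errors = [sammon_error(D_orig, pairwise_distances(Y))]

    for _ in range(max_iter):
        D_map = np.maximum(pairwise_distances(Y), 1e-12)
        ratio = (D_orig - D_map) / (D_orig * D_map)
        np.fill_diagonal(ratio, 0.0)
        diff = Y[:, None, :] - Y[None, :, :]
        # Gradient of E with respect to each mapped point.
        grad = (-2.0 / c) * (ratio[:, :, None] * diff).sum(axis=1)

        # Coordinates modification: one steepest descent step.
        Y_new = Y - step * grad
        E_new = sammon_error(D_orig, pairwise_distances(Y_new))
        if E_new > errors[-1]:
            step *= 0.5  # step too large: reject it and retry with a smaller one
            if step < 1e-12:
                break
            continue

        improvement = errors[-1] - E_new
        Y = Y_new
        errors.append(E_new)
        step *= 1.1  # cautiously enlarge the step after a successful move

        # Termination: error change between step n-1 and step n is small enough.
        if improvement < tol:
            break

    return Y, errors
```

In this sketch, a data table with one row per substituent and one column per descriptor would be passed as `X`, and the first returned array would give the 2-D coordinates to plot. Any scaling of the descriptors before mapping is left to the user, since the text above does not specify one.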
0022-2623/94/1837-0981$04.50/0 © 1994 American Chemical Society

Journal of Medicinal Chemistry, 1994, Vol. 37, No. 7


Figure 2. (2.1) Nonlinear map of the 103 aliphatic substituents described by five substituent constants (Fr, HBA, HBD, MR, and F). (2.2) Plot of the individual mapping errors on each substituent of the nonlinear map. Squares are proportional in size to the magnitude of the errors. 1, Br; 2, Cl; 3, F; 4, I; 5, NO2; 6, H; 7, OH; 8, SH; 9, NH2; 10, CBr3; 11, CCl3; 12, CF3; 13, CN; 14, SCN; 15, CO2-; 16, CO2H; 17, CH2Br; 18, CH2Cl; 19, CH2I; 20, CONH2; 21, CH=NOH; 22, CH3; 23, NHCONH2; 24, OCH3; 25, CH2OH; 26, SOCH3; 27, OSO2CH3; 28, SCH3; 29, NHCH3; 30, CF2CF3; 31, C≡CH; 32, CH2CN; 33, CH=CHNO2 (trans); 34, CH=CH2; 35, COCH3; 36, OCOCH3; 37, CO2CH3; 38, NHCOCH3; 39, C=O(NHCH3); 40, CH2CH3; 41, OCH2CH3; 42, CH2OCH3; 43, SOC2H5; 44, SC2H5; 45, CH2Si(CH3)3; 46, NHC2H5; 47, N(CH3)2; 48, CH=CHCN; 49, cyclopropyl; 50, COC2H5; 51, CO2C2H5; 52, OCOC2H5; 53, EtCO2H; 54, NHCO2C2H5; 55, CONHC2H5; 56, NHCOC2H5; 57, CH(CH3)2; 58, C3H7; 59, OCH(CH3)2; 60, OC3H7; 61, CH2OC2H5; 62, SOC3H7; 63, SC3H7; 64, NHC3H7; 65, Si(CH3)3; 66, 2-thienyl; 67, 3-thienyl; 68, CH=CHCOCH3; 69, CH=CHCO2CH3; 70, COC3H7; 71, OCOC3H7; 72, CO2C3H7; 73, (CH2)3CO2H; 74, NHCOC3H7; 75, CONHC3H7; 76, C4H9; 77, C(CH3)3; 78, OC4H9; 79, CH2OC3H7; 80, NHC4H9; 81, N(C2H5)2; 82, CH=CHCOC2H5; 83, CH=CHCO2C2H5; 84, C5H11; 85, CH2OC4H9; 86, C6H5; 87, OC6H5; 88, SO2C6H5; 89, NHC6H5; 90, 2-benzthiazolyl; 91, CH=CHCOC3H7; 92, CH=CHCO2C3H7; 93, COC6H5; 94, CO2C6H5; 95, OCOC6H5; 96, NHCOC6H5; 97, CH2C6H5; 98, CH2OC6H5; 99, CH2Si(C2H5)3; 100, CH=CHC6H5 (trans); 101, CH