Discrimination and Molecular Design of New Theoretical

interpretation of some pharmacological properties of hypolipaemic drugs using multivariable regression equations with their statistical parameters. Re...
5 downloads 11 Views 163KB Size
178

J. Chem. Inf. Comput. Sci. 2000, 40, 178-184

Discrimination and Molecular Design of New Theoretical Hypolipaemic Agents Using the Molecular Connectivity Functions Rosa Ana Cerco´s-del-Pozo, Facundo Pe´rez-Gime´nez,* M. Teresa Salabert-Salvador, and Francisco J. Garcia-March Department of Physical Chemistry, Faculty of Pharmacy, Universitat de Vale`ncia, Av. Vicent Andre´s Estelle´s, s/n. 46100 Burjassot, Valencia, Spain Received May 24, 1999

The molecular topology model and discriminant analysis have been applied to the prediction and QSAR interpretation of some pharmacological properties of hypolipaemic drugs using multivariable regression equations with their statistical parameters. Regression analysis showed that the molecular topology model predicts these properties. The corresponding stability (cross-validation) studies done on the selected prediction models confirmed the goodness of the fits. The method used for hypolipaemic activity selection was a linear discriminant analysis (LDA). We make use of the pharmacological distribution diagrams (PDDs) as a visualizing technique for the identification and design of new hypolipaemic agents. The QSAR (quantitative structure-activity relationship) methods examine the relationship between a determined property and the chemical structure of a series of molecules,1 thus making it possible to observe the variation in this property in the analyzed series. This information can then be used to predict the values of this property in other molecules of the same therapeutic group. Molecular connectivity is a topological method able to describe the structure of a molecule by means of numbers called indices (mχt), calculated from the hydrogen-suppressed graphs of the molecule being studied. These indices subsequently regress in relation to the experimental values of the physical, chemical,2 and/or biological3,4 properties to obtain the so-called connectivity functions.5 We chose a group of hypolipaemic drugs (drugs that decrease the lipid concentration in plasma) to study QSAR relationships and the corresponding connectivity functions, demonstrate the predictive capacities of both, and obtain a discriminant function that facilitates the design and molecular selection of new structures with theoretical hypolipaemic activity.6 MATERIALS AND METHODS

In the connectivity method, molecular structure is expressed topologically by the hydrogen-suppressed graph. It should be noted that information concerning the contributions of the hydrogen atoms is implicit in this graphical formulation. For alkanes there is a direct relation between the vertex valence δi and the number of hydrogen atoms (H) implied at vertex i, δi ) 4 - H. The number 4 may represent the valence or the number of valence electrons for the carbon atom. The mχt are terms defined by a subgraph of type t containing m edges, connected in the graph. Disconnected subgraphs are not considered. The order of the subgraph is

Figure 1. Subgraphs and connectivity indices of isopentane.

defined as m. Subgraphs may be classified into four types: (1) path (p), subgraphs whose subgraph valences are no greater than 2; (2) cluster (c), subgraphs whose subgraph valences include at least one of 3 or 4 but do not include 2; (3) path-cluster (pc), subgraphs whose subgraph valences must include 2 in addition to 3 and/or 4; (4) chain (ch), edge sequences containing at least one cycle. The connectivity indices (mχt) proposed by Kier and Hall7 and based on the Randic index8 are evaluated as a sum of terms over all the distinct connected subgraphs and are defined by the general equation nm

χt ) ∑mSj

m

(1)

j)1

where m ) order of a subgraph, i.e., number of edges of a subgraph, nm ) number of type t subgraphs of order m, and

10.1021/ci9900480 CCC: $19.00 © 2000 American Chemical Society Published on Web 11/19/1999

NEW THEORETICAL HYPOLIPAEMIC AGENTS

J. Chem. Inf. Comput. Sci., Vol. 40, No. 1, 2000 179

Table 1. Experimental Values of Several Pharmacological Properties of a Group of Hypolipaemic Drugs Used in the Connectivity Functions compd acifran benfluorex bezafibrate ciprofibrate clinofibrate clofibrate doxazosin fenofibrate gemfibrozil lovastatin niceritrol nicotinic acid pantethine pirifibrate plafibride pravastatin probucol simfibrate simvastatin tiadenol mS

j

EGC (%)

RTC (%)

PPB (%)

Tmax (h)

88.0 14.00 18.00 21.00 11.00 15.00 8.00 20.00

22.00 73.00 83.00 50.00

27.00 16.00 10.00 21.00

10.00

1.50 1.50 1.75 2.00 5.00 4.00 3.00 5.00 1.50 4.00

95.0 69.5 96.4 98.5 99.0 96.0 95.0

16.00 24.00

55.0 5.0

25.00 25.10

95.0

1.25 24.00 6.00 1.85

LD-50m (g/kg)

A multiple regression analysis was used to find the relationship between the pharmacological properties of hypolipaemics drugs and the connectivity indices. It can be expressed as7 n

C(χ) ) P ) A0 + ∑Ai‚χi

(4)

i)1

2.46 5.00 3.16

10.00 1.25 3.78 9.76 5.00 3.40

) quantity calculated for each subgraph and defined by m+1

Sj ) [ ∏ (δi)]-1/2

m

(2)

i)1

where j denotes the particular set of edges that constitute the subgraph. All the distinct subgraphs and connectivity indices for isopentane are shown in Figure 1. The vertex valences, δv, of the unsaturated carbon atoms and the heteroatoms (N, S, O) can be calculated using

δ v ) Zv - H

(3)

where Zv is the number of valence electrons of the atom and H is the number of hydrogen atoms attached to it.4,8 The empirically derived values for the halogens were also used.5 We used 16 mχt valence and nonvalence indices in our study, ranging from zero to fourth order, whose types are path, cluster, and path-cluster.

where P is a property and Ao and Ai represent the regression coefficients of the obtained equation. Once the connectivity function (eq 4) is established, its value for a specific molecule may be predicted. The connectivity indices that were used in this study were calculated using eqs 1-3 and computer software developed in our department.9 The connectivity function (eq 4) was obtained by multilinear regression with the BMDP 9R program of the biostatistics package BMDP (Biomedical Computer Programs).10 To test the quality of the regression equations, the following statistical parameters were used: multiple correlation coefficient (r), standard deviation (sd), F-Snedecor function values (F), Mallow’s CP, and Student’s t-test (statistical significance), as well as the corresponding cross validation of the selected functions. Cross-validation for the selected functions was carried out using the jackknife method.11 With this method, n observations are eliminated by means of a random process, and a regression program is applied. The process was repeated as many times as necessary until all the observations were eliminated at least once and at most four times. Finally, the coefficients of the independent variables calculated, the correlation coefficients, standard deviations, and the residuals are compared with those obtained in the selected equation. The following pharmacological properties were investigated: (1) elimination as conjugated glucuronide (%), (2) reduction in total cholesterol (%), (3) percentage of plasmatic protein binding (PPB), (4) maximum plasmatic concentration time (Tmax) (h), (5) lethal dose 50 p.o. in mouse (LD-50m) (g‚kg-1). The experimental values for these properties were obtained from different bibliographic sources12-20 (Table 1). Linear discriminant analysis (LDA) was used to select the parameters that identify the active or inactive character of the molecules, with the help of the BMDP 7M program,

Table 2. Connectivity Index Values of a Group of Hypolipaemic Agents Used in the Correlation Equations compd



0χv

1χv



2χv

3χ v p



acifran benfluorex bezafibrate ciprofibrate clinofibrate clofibrate doxazosin fenofibrate gemfibrozil lovastatin niceritrol nicotinic acid pantethine pirifibrate plafibride pravastatin probucol simfibrate simvastatin tiadenol

10.085 16.516 16.361 12.508 22.897 11.138 20.922 16.524 12.767 19.922 24.532 5.594 26.297 15.154 16.165 21.215 26.224 20.983 20.844 13.314

8.636 13.845 14.909 11.765 20.596 10.445 18.484 15.533 11.617 18.303 21.621 4.612 22.658 13.779 14.629 18.107 25.714 19.597 19.096 12.804

4.778 7.977 8.242 6.450 12.143 5.485 10.686 8.406 6.262 11.479 12.243 2.438 13.38 7.493 8.119 10.747 13.767 10.557 11.353 8.975

4.850 8.475 7.980 7.328 11.779 5.400 10.719 8.304 6.301 11.162 11.378 2.146 13.272 7.323 8.619 10.675 16.150 10.860 11.307 5.950

3.679 5.759 6.551 6.718 9.675 4.263 7.810 6.904 5.410 9.636 8.771 1.547 11.314 5.898 6.392 8.460 16.941 8.712 9.831 6.291

2.582 3.634 3.892 2.518 7.461 2.229 5.877 3.706 3.126 7.320 5.627 0.908 6.757 3.331 3.803 5.932 7.834 4.662 6.846 4.331

1.031 1.948 1.744 2.500 2.640 1.367 1.355 2.030 1.547 1.941 1.540 0.260 3.032 1.602 1.747 1.932 6.981 2.733 2.508 0.000

c

4χ v c

0.005 0.013 0.102 0.102 0.207 0.102 0.000 0.102 0.177 0.000 0.125 0.000 0.408 0.102 0.102 0.000 1.505 0.204 0.177 0.000

CERCOÄ S-DEL-POZO

180 J. Chem. Inf. Comput. Sci., Vol. 40, No. 1, 2000

ET AL.

Table 3. Connectivity Function Relationship Obtained by Multilinear Regresion for Several Pharmacological Propertiesa property

equation

n

r

sd

F

CP

p

elimination as glucuronide conjugated (%) reduction in total cholesterol (%)

EGC ) -(35.36 ( χ + (161.45 ( 2.95) RTC ) (0.27 ( 0.16)0χv - (28.20 ( 6.80)2χ/2χv + (48.11 ( 9.26) PPB ) -(26.75 ( 4.48)0χ + (32.00 ( 5.27)0χv (32.45 ( 3.16)3χc + (111.10 ( 7.47) Tmax ) (14.67 ( 1.16)4χcv + (1.58 ( 0.48) LD-50m ) -(2.57 ( 0.58)0χv + (5.07 ( 0.99)1χv + (0.80 ( 1.58)

5 15

0.999 0.829

1.58 3.56

1581 13.16

1.17 4.00