Extended Adjacency Matrix Indices and Their Applications

J. Chem. Inf. Comput. Sci. 1994, 34, 1140-1145

Yi-Qiu Yang, Lu Xu, and Chang-Yu Hu
Applied Spectroscopy Laboratory, Changchun Institute of Applied Chemistry, Academia Sinica, Changchun 130022, Jilin, People's Republic of China
Received March 30, 1994

In this paper, new topological indices, EAΣ and EAmax, are introduced. They are based on the extended adjacency matrices of molecules, in which the influences of heteroatoms and multiple bonds are considered. The results show that EAΣ and EAmax possess high discriminating power and correlate well with a number of physicochemical properties and biological activities of organic compounds.

INTRODUCTION

The structure of a molecule, geometric and electronic, must contain the features responsible for its physical and chemical properties. The simplest way to represent a molecule's structure is to assign to it a number or a set of numbers, termed indices; the indices are then used to study correlations with properties ranging from physical to biological. In recent years, one type of these methods, the topological index, has shown good promise for quantitative structure-property relationship (QSPR) and quantitative structure-activity relationship (QSAR) studies.

Topological indices are graph invariants obtained from the chemical graph (hydrogen-suppressed graph). Uniqueness and correlation are two of the most important requirements for topological indices. Defining an index that takes different, unique, but structurally significant values for different structures seems to be very difficult. As a result, more than 100 topological indices are in existence, such as the Wiener index, W,¹ the Hosoya index, Z,² the Randić index, χ,³ the Balaban index, J,⁴ and the extended Randić index by Kier and Hall.⁵ In this paper, the new EA indices proposed by us show lower degeneracy and good performance in correlation.

METHODS

Recently, Randić discussed strategies for searching for optimal molecular descriptors.⁶,⁷ Most indices to date have been based on two particular matrices: the distance and adjacency matrices. In this paper, the adjacency matrix and its extended form are used to derive the new topological indices. The entries of the adjacency matrix are denoted a_ij and are equal to either one or zero, depending on whether or not vertices i and j are connected:

$$a_{ij} = \begin{cases} 1 & \text{for adjacent vertices} \\ 0 & \text{otherwise} \end{cases}$$

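For example, the hydrogen-suppressed graph of 2-methylbutane is a five-vertex tree in which vertex 2 is bonded to vertices 1, 3, and 4 and vertex 3 is bonded to vertex 5,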

Table 1. Electronegativities of Common Atoms

atom   ELC
H      2.1
C      2.5
N      3.0
O      3.5
F      4.0
P      2.1
S      2.5
Cl     3.0
Br     2.8
I      2.5

Table 2. Values of ID, EAΣ, and EAmax of 20 Structures

no.   ID           EAΣ        EAmax
141   18.0379058   14.79827   3.337117
142   24.3159362   19.65951   3.337117
143   15.0076026   10.46111   2.936060
144   19.2052889   13.37589   2.653472
145   19.2127979   13.11984   2.672798
146   19.3869551   14.10544   2.829541
147   26.8681152   17.44643   2.829542
148   7.7777778    6.000000   3.000000
149   17.9917696   12.00000   3.000000
150   20.2633      13.29879   2.951932
151   22.4172175   14.28575   2.914090
152   22.5248849   14.71333   2.906148
153   22.4573998   15.51754   3.000000
154   17.6762689   12.29253   3.000000
155   22.1002019   15.30758   2.914090
156   21.4170255   14.49962   2.653614
157   31.2107297   21.65612   2.777381
158   21.3924287   15.12158   2.620189
159   21.1787608   16.19804   3.049510
160   19.4773775   12.13982   3.005782

Table 3. EA Indices for 18 Graphs

no.   EAΣ        EAmax
1     11.25963   2.878785
2     11.49126   2.864037
3     12.21042   2.822746
4     11.55154   2.865975
5     11.55315   2.824196
6     11.57474   2.822746
7     11.42910   2.822746
8     11.64761   2.884355
9     10.68483   2.884355
10    11.65685   3.000000
11    11.73486   2.885716
12    11.73486   2.885716
13    11.32856   2.894902
14    11.32856   2.894902
15    12.47214   3.000000
16    12.47214   3.000000
17    12.00000   3.000000
18    12.00000   3.000000
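The criterion applied to Table 3 in this study (two graphs count as degenerate only when EAΣ and EAmax coincide simultaneously) can be illustrated with a short Python sketch. The dictionary layout, the function name `degenerate_groups`, and the rounding to five decimals are assumptions made for illustration; the numbers are transcribed from Table 3.

```python
from collections import defaultdict

# (EA_sum, EA_max) for graphs 1-18, transcribed from Table 3.
TABLE_3 = {
    1: (11.25963, 2.878785), 2: (11.49126, 2.864037), 3: (12.21042, 2.822746),
    4: (11.55154, 2.865975), 5: (11.55315, 2.824196), 6: (11.57474, 2.822746),
    7: (11.42910, 2.822746), 8: (11.64761, 2.884355), 9: (10.68483, 2.884355),
    10: (11.65685, 3.000000), 11: (11.73486, 2.885716), 12: (11.73486, 2.885716),
    13: (11.32856, 2.894902), 14: (11.32856, 2.894902), 15: (12.47214, 3.000000),
    16: (12.47214, 3.000000), 17: (12.00000, 3.000000), 18: (12.00000, 3.000000),
}


def degenerate_groups(indices, ndigits=5):
    """Group graph numbers whose EA_sum AND EA_max both coincide."""
    groups = defaultdict(list)
    for graph_no, (ea_sum, ea_max) in indices.items():
        key = (round(ea_sum, ndigits), round(ea_max, ndigits))
        groups[key].append(graph_no)
    # Only groups with more than one member are degenerate.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


if __name__ == "__main__":
    print(degenerate_groups(TABLE_3))
```

Graphs that share only one of the two values (for example 3, 6, and 7, which share EAmax) are not grouped, which is the point of using both indices together.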

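and its adjacency matrix takes the form

$$A = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{pmatrix}$$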

Abstract published in Advance ACS Abstracts, July 15, 1994.
© 1994 American Chemical Society

We also introduce the degree vector V = (v_i), where v_i is the degree of the ith vertex.


Table 4. Topological Indices and Boiling Points for 149 Alkanes

compounds
ethane propane butane 2-methylpropane pentane 2-methylbutane 2,2-dimethylpropane hexane 2-methylpentane 3-methylpentane 2,3-dimethylbutane 2,2-dimethylbutane heptane 2-methylhexane 3-methylhexane 3-ethylpentane 2,4-dimethylpentane 2,3-dimethylpentane 2,2-dimethylpentane 3,3-dimethylpentane 2,2,3-trimethylbutane octane 2-methylheptane 3-methylheptane 4-methylheptane 3-ethylhexane 2,5-dimethylhexane 2,4-dimethylhexane 2,3-dimethylhexane 3-ethyl-2-methylpentane 3,4-dimethylhexane 2,2-dimethylhexane 3,3-dimethylhexane 3-ethyl-3-methylpentane 2,3,4-trimethylpentane 2,2,4-trimethylpentane 2,2,3-trimethylpentane 2,3,3-trimethylpentane 2,2,3,3-tetramethylbutane nonane 2-methyloctane 3-methyloctane 4-methyloctane 4-ethylheptane 3-ethylheptane 2,6-dimethylheptane 2,5-dimethylheptane 2,3-dimethylheptane 2,4-dimethylheptane 3-ethyl-2-methylhexane 4-ethyl-2-methylhexane 3,5-dimethylheptane 3,4-dimethylheptane 3-ethyl-4-methylhexane 2,2-dimethylheptane 3,3-dimethylheptane 4,4-dimethylheptane 3,3-diethylpentane 3-ethyl-3-methylhexane 2,3,5-trimethylhexane 2,4-dimethyl-3-ethylpentane 2,3,4-trimethylhexane 2,2,5-trimethylhexane 2,2,4-trimethylhexane 2,2,3-trimethylhexane 2,2-dimethyl-3-ethylpentane 2,4,4-trimethylhexane 2,3,3-trimethylhexane 3,3,4-trimethylhexane 2,3-dimethyl-3-ethylpentane 2,2,3,4-tetramethylpentane 2,3,3,4-tetramethylpentane 2,2,4,4-tetramethylpentane 2,2,3,3-tetramethylpentane decane 2-methylnonane 3-methylnonane

W
1 4 10 9 20 18 16 35 32 31 29 28 56 52 50 48 48 46 46 44 42 84 79 76 75 72 74 71 70 67 68 71 67 64 65 66 63 62 58 120 114 110 108 102 104 108 104 102 102 96 98 100 98 94 104 98 96 88 92 96 90 92 98 94 92 88 92 90 88 86 86 84 88 82 165 158 153

Z
2 3 5 4 8 7 5 13 11 12 10 9 21 18 19 20 15 17 14 16 13 34 29 31 30 32 25 26 27 28 29 23 25 28 24 19 22 23 17 55 47 50 49 51 52 40 43 44 41 45 44 45 46 48 37 41 39 48 44 37 39 41 32 33 35 36 34 36 39 40 31 33 24 30 89 76 81

χ
1.000000 1.414214 1.914214 1.732051 2.414214 2.270056 2.000000 2.914214 2.770056 2.808061 2.642735 2.560660 3.414214 3.270056 3.308061 3.346066 3.125898 3.180739 3.060660 3.121320 2.943376 3.914214 3.770056 3.808061 3.808061 3.846066 3.625898 3.663903 3.680739 3.718744 3.718744 3.560660 3.621320 3.681981 3.553418 3.416502 3.481381 3.504036 3.250000 4.414214 4.270056 4.308061 4.308061 4.346066 4.346066 4.125898 4.163903 4.180739 4.163903 4.218744 4.201908 4.201908 4.218744 4.256749 4.060660 4.121321 4.121321 4.242640 4.181981 4.036582 4.091423 4.091423 3.916502 3.954507 3.981381 4.019385 3.977163 4.004036 4.042041 4.064696 3.854059 3.886752 3.707107 3.810660 4.914214 4.770056 4.808061

J
1.000000 1.632993 1.974745 2.323790 2.190610 2.539539 3.023716 2.339092 2.627215 2.754185 2.993498 3.168490 2.447473 2.678258 2.831820 2.992303 2.953223 3.144208 3.154490 3.360435 3.541197 2.530061 2.715843 2.872066 2.919613 3.074373 2.927820 3.098828 3.180819 3.354877 3.292478 3.111766 3.373382 3.583212 3.464227 3.388924 3.623281 3.708324 4.020391 2.595083 2.746691 2.876623 2.954823 3.175341 3.092246 2.914659 3.060821 3.155280 3.151251 3.410085 3.307394 3.223047 3.324760 3.499480 3.072990 3.330074 3.431051 3.824684 3.617390 3.376601 3.677616 3.575834 3.280711 3.467262 3.588734 3.792908 3.576752 3.702086 3.802396 3.919211 3.877605 4.013737 3.746418 4.144726 2.647605 2.773189 2.886163

EAΣ
2.000000 3.535534 5.385165 5.773502 6.274918 7.532390 8.500001 7.875304 8.320846 9.089133 9.637889 10.17318 8.892971 9.992315 9.929288 9.509251 10.33642 11.16928 10.92939 11.71559 12.25366 10.39622 10.96848 11.56435 10.75293 11.04575 12.11662 11.96089 12.01414 11.47013 12.70751 12.62192 12.49121 13.01951 13.24401 12.93407 13.77670 13.77761 14.85766 11.47715 12.50477 12.56153 12.40064 12.04832 12.10302 13.03563 13.68503 13.64704 12.77965 13.07226 13.15579 13.57683 13.55102 13.04509 13.58660 14.17124 13.26269 13.09017 13.84058 14.04727 13.39814 14.78369 14.74863 14.56433 14.62288 14.04611 14.50235 14.55498 15.30375 15.05023 15.84947 15.83585 15.52746 16.37559 12.92800 13.56313 14.08101

EAmax
1.000000 1.767767 1.846291 2.886751 1.887459 2.657551 4.250000 1.912646 2.644617 2.441140 2.909472 3.909893 1.929549 2.641847 2.428892 2.254625 2.811188 2.786288 3.902909 3.549463 3.905577 1.941614 2.641261 2.425542 2.416806 2.246364 2.722244 2.724521 2.780567 2.693850 2.631516 3.902374 3.540540 3.172473 2.919351 3.914430 3.877758 3.568858 4.214415 1.950614 2.641138 2.424473 2.413346 2.238559 2.243288 2.679947 2.664994 2.779527 2.721230 2.689473 2.680721 2.580399 2.623622 2.512713 3.902333 3.539675 3.531518 2.795085 3.161112 2.848745 2.828666 2.842428 3.903268 3.910517 3.877049 3.855443 3.561424 3.560822 3.527114 3.239333 3.882490 3.585167 4.083121 4.054641 1.957555 2.641112 2.424202

bp (°C)
-88.6 -42.1 -0.50 -11.73 36.07 27.85 9.50 68.74 60.27 63.28 57.99 49.74 98.42 90.05 91.85 93.48 80.50 89.78 79.20 86.03 80.88 125.67 117.65 118.93 117.71 118.54 109.10 109.43 115.61 115.65 117.72 106.84 111.97 118.25 113.46 99.23 109.84 114.76 106.47 150.8 142.8 143.5 142.4 141.2 143.0 135.2 136.0 140.5 133.5 138.0 133.8 136.0 140.1 140.4 132.7 137.3 135.2 146.2 140.6 131.3 136.73 139.0 124.0 126.5 131.7 133.83 126.5 137.7 140.5 141.6 133.0 141.5 122.7 140.3 174.2 167 167.8


Table 4 (Continued)

compounds
4-methylnonane 5-methylnonane 4-propylheptane 4-ethyloctane 3-ethyloctane 2,7-dimethyloctane 2,6-dimethyloctane 2,3-dimethyloctane 2,5-dimethyloctane 2,4-dimethyloctane 4-isopropylheptane 4-ethyl-2-methylheptane 3-ethyl-2-methylheptane 5-ethyl-2-methylheptane 3,6-dimethyloctane 3,4-dimethyloctane 3,5-dimethyloctane 4,5-dimethyloctane 4-ethyl-3-methylheptane 5-ethyl-3-methylheptane 3-ethyl-4-methylheptane 3,4-diethylhexane 2,2-dimethyloctane 3,3-dimethyloctane 4,4-dimethyloctane 3,3-diethylhexane 4-ethyl-4-methylheptane 3-ethyl-3-methylheptane 2,3,6-trimethylheptane 2,4,6-trimethylheptane 3-isopropyl-2-methylhexane 2,5-dimethyl-3-ethylhexane 2,3,5-trimethylheptane 2,4,5-trimethylheptane 2,3,4-trimethylheptane 2,4-dimethyl-3-ethylhexane 2,3-dimethyl-4-ethylhexane 3,4,5-trimethylheptane 2,2,6-trimethylheptane 2,2,5-trimethylheptane 2,2,3-trimethylheptane 2,2,4-trimethylheptane 2,2-dimethyl-3-ethylhexane 2,2-dimethyl-4-ethylhexane 2,5,5-trimethylheptane 2,3,3-trimethylheptane 2,4,4-trimethylheptane 3,3,5-trimethylheptane 3,3,4-trimethylheptane 3,4,4-trimethylheptane 3,3-dimethyl-4-ethylhexane 3,3-diethyl-2-methylpentane 2,3-dimethyl-3-ethylhexane 2,4-dimethyl-4-ethylhexane 3,4-dimethyl-3-ethylhexane 2,4-dimethyl-3-isopropylpentane 2,3,4,5-tetramethylhexane 2,2,4,5-tetramethylhexane 2,2,3,5-tetramethylhexane 2,2,4-trimethyl-3-ethylpentane 2,2,3,4-tetramethylhexane 2,2,3,5-tetramethylhexane 2,3,4,4-tetramethylhexane 2,3,3,4-tetramethylhexane 2,3,4-trimethyl-3-ethylpentane 2,2,5,5-tetramethylhexane 2,2,4,4-tetramethylhexane 2,2,3,3-tetramethylhexane 2,2,3-trimethyl-3-ethylpentane 3,3,4,4-tetramethylhexane 2,2,3,4,4-pentamethylpentane 2,2,3,3,4-pentamethylpentane

W
150 149 138 141 145 151 146 143 143 142 131 134 134 138 141 137 138 135 129 133 130 125 146 138 134 121 126 129 136 135 124 127 131 130 128 122 123 125 139 134 130 131 122 126 131 127 127 126 123 122 118 114 119 122 117 117 121 124 123 115 118 120 116 115 112 127 119 115 110 111 111 108

Z
79 80 81 83 84 65 69 71 68 67 72 70 73 72 74 75 71 73 77 76 76 80 60 66 64 76 69 72 61 56 63 62 64 63 65 67 68 70 51 55 57 52 58 56 57 59 53 59 62 61 64 68 63 60 68 54 58 47 48 50 53 49 55 56 57 41 43 47 52 53 40 43

χ
4.808061 4.808061 4.846066 4.846066 4.846066 4.625898 4.663903 4.680739 4.663903 4.663903 4.718744 4.701908 4.718744 4.701908 4.701908 4.718744 4.701908 4.718744 4.756750 4.739913 4.756749 4.794754 4.560660 4.621321 4.621321 4.742640 4.681981 4.681981 4.536582 4.519745 4.591423 4.574586 4.574586 4.574586 4.591423 4.629428 4.629428 4.629427 4.416502 4.454507 4.481381 4.454507 4.519386 4.492512 4.477163 4.504036 4.477163 4.515168 4.542041 4.542041 4.580046 4.625356 4.564696 4.537823 4.602701 4.464102 4.464102 4.327186 4.337223 4.392064 4.392064 4.359879 4.414720 4.424757 4.447412 4.207107 4.267767 4.310660 4.371320 4.371320 4.154701 4.193376

J
2.968015 2.998419 3.295082 3.205535 3.086901 2.909472 3.033297 3.129602 3.124402 3.160036 3.499857 3.390786 3.397790 3.255530 3.168174 3.308840 3.268555 3.375929 3.563688 3.412256 3.529931 3.698220 3.043758 3.276962 3.417512 3.874775 3.690295 3.575505 3.301403 3.337430 3.728003 3.603336 3.461674 3.502718 3.583327 3.797908 3.756108 3.685406 3.205456 3.355508 3.518426 3.469465 3.808926 3.630845 3.464727 3.633392 3.625600 3.641863 3.778372 3.823180 3.971137 4.153513 3.943570 3.802578 4.020494 3.983500 3.813995 3.684242 3.734823 4.072918 3.941813 3.865559 4.034119 4.089289 4.228989 3.563001 3.887595 4.101784 4.328342 4.281757 4.231133 4.403818

EAΣ
13.39022 14.03790 12.96797 13.56017 13.57478 14.61588 14.63300 14.64522 14.52239 14.43102 14.09177 14.11981 14.08972 14.18572 15.25524 15.18455 14.39818 14.39487 14.62580 14.73124 13.87744 13.99208 15.13177 15.14273 14.94576 14.58404 14.94670 15.48928 15.76726 14.80487 15.08490 15.18979 15.66241 15.58371 15.62695 14.98369 15.12535 16.32297 15.65122 16.31585 16.25524 15.38143 15.66329 15.78309 16.29656 16.23434 15.27246 16.12913 16.14945 16.08077 15.58507 14.99200 15.87525 15.86617 16.58203 15.27807 16.86002 16.65136 16.65644 15.96457 17.38966 16.56671 17.37724 17.36261 17.07437 17.38150 17.09826 17.15343 17.63852 17.89499 18.45419 18.43252

EAmax
2.412358 2.409867 2.231204 2.235614 2.242122 2.659508 2.646812 2.779339 2.663637 2.720603 2.685158 2.679002 2.688596 2.650472 2.508182 2.621908 2.573861 2.615580 2.506718 2.496523 2.502991 2.371022 3.902330 3.539593 3.530638 2.781714 3.149601 3.159644 2.797082 2.787556 2.824479 2.772741 2.808731 2.768814 2.839443 2.748271 2.797341 2.729738 3.902401 3.902960 3.876995 3.910422 3.854864 3.907570 3.541770 3.560056 3.552846 3.553431 3.525855 3.518440 3.494951 2.951106 3.230296 3.206202 3.172749 2.924988 2.924877 3.911134 3.878260 3.859398 3.878834 3.579268 3.539185 3.547729 3.287522 3.954520 3.976226 4.052231 3.949285 3.821292 4.020850 4.050615

bp (°C)
165.7 165.1 162.0 163.64 166.0 159.87 158.54 164.31 156.8 153.0 160.0 160.0 166.0 159.7 160.0 166.0 160.0 162.1 167.0 158.3 167.0 162.0 155.0 161.2 157.5 166.3 167.0 163.8 155.7 144.8 163.0 157.0 157.0 157.0 163.0 164.0 164.0 164.0 148.2 148.0 158.0 147.7 159.0 147.0 152.8 160.1 153.0 155.68 164.0 164.0 165.0 174.0 169.0 158.0 170.0 157.04 161.0 148.2 148.4 155.3 154.9 153.0 162.2 164.59 169.44 137.46 153.3 158.0 168.0 170.5 159.29 166.05

With v_i the degree of the ith vertex, the degree vector of 2-methylbutane is V = {1, 3, 2, 1, 1}.
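The adjacency matrix and the degree vector V = {1, 3, 2, 1, 1} quoted for 2-methylbutane can be checked with a few lines of Python. This is only a sketch: the vertex numbering in `EDGES` (vertex 2 as the branched carbon, bonded to vertices 1, 3, and 4, with vertex 3 bonded to vertex 5) is an assumption chosen to be consistent with the quoted degree vector, and the helper names are hypothetical.

```python
# Assumed numbering of the hydrogen-suppressed graph of 2-methylbutane.
EDGES = [(1, 2), (2, 3), (2, 4), (3, 5)]
N_VERTICES = 5


def adjacency_matrix(edges, n):
    """a_ij = 1 for adjacent vertices, 0 otherwise (vertices numbered from 1)."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i - 1][j - 1] = 1
        a[j - 1][i - 1] = 1
    return a


def degree_vector(a):
    """v_i is the degree of vertex i, i.e. the row sum of the adjacency matrix."""
    return [sum(row) for row in a]


A = adjacency_matrix(EDGES, N_VERTICES)
V = degree_vector(A)

if __name__ == "__main__":
    for row in A:
        print(row)
    print("V =", V)
```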

Table 5. Results of Relationships of 149 Structures

indices   eq                                        R        F         S
W         bp/°C = 149.99  0.14W  1.19W^(1/2)        0.9844   2281.71   12.08
Z         bp/°C = 138.23  0.55Z  2.42Z^(1/2)        0.9710   1202.20   16.81
χ         bp/°C = -5.13 - 3.14χ  8.65χ^2            0.9879   2951.82   10.00
J         bp/°C = 78.59 - 36.91J  15.51J^2          0.9101   352.29    27.81
EA        bp/°C = 134.19 + 3.45EAΣ  10.43EAmax      0.9914   4178.34   9.46

Figure 1. Skeletons of organic compounds showing different ring structures including unusual ring systems.

Figure 2. Skeletons of partial cyclic graphs with eight vertices and the degree of each vertex being 4.

The extended adjacency matrix (EA matrix) is represented as follows:

$$g_{ij} = a_{ij}\,\frac{v_i/v_j + v_j/v_i}{2} \qquad (1)$$

where g_ij is an element of the EA matrix and a_ij denotes an element of matrix A. So, the extended adjacency matrix of 2-methylbutane is

$$\begin{pmatrix} 0 & 1.67 & 0 & 0 & 0 \\ 1.67 & 0 & 1.08 & 1.67 & 0 \\ 0 & 1.08 & 0 & 0 & 1.25 \\ 0 & 1.67 & 0 & 0 & 0 \\ 0 & 0 & 1.25 & 0 & 0 \end{pmatrix}$$

Then, the eigenvalues of the EA matrix can be calculated, and the new topological indices, EAΣ and EAmax, can be constructed. EAΣ is the sum of the absolute eigenvalues of the EA matrix, and EAmax is the maximum of the absolute eigenvalues of the EA matrix. EAΣ and EAmax are together called the EA indices. We again take 2-methylbutane as an example, for which the eigenvalues of the EA matrix are

$$x_1 = -2.657551,\quad x_2 = 2.657550,\quad x_3 = 3.318116 \times 10^{\ldots} \approx 0,\quad x_4 = 1.108644,\quad x_5 = -1.108644$$

EAΣ and EAmax are calculated on the basis of the definitions:

$$EA\Sigma = |x_1| + |x_2| + |x_3| + |x_4| + |x_5| = 7.532390$$

$$EA_{max} = \max_i |x_i| = 2.657551$$

According to the preceding discussion, the hydrogen-suppressed graph is a simple graph, and neither heteroatoms nor multiple bonds are considered. Thus, indices derived from these graphs cannot be expected to possess high selectivity; for example, 2-methylbutane, 2-butanol, and 3-methyl-1-butene have the same chemical graph, so their EA indices are the same. To increase the discriminating power of the new indices, two factors, heteroatoms and multiple bonds, should be taken into account.

1. Heteroatoms. Electronegativity is a quantitative property of an atom: the power of an atom in a molecule to attract electrons to itself. Therefore, we introduced electronegativities into our topological indices so as to reflect the influence of heteroatoms. Table 1 shows the electronegativities of common atoms. If a compound contains heteroatoms, the degree vector and the elements of the EA matrix are changed as follows:

$$v_i' = v_i\,ELC_i \qquad (2)$$

whereas

$$g_{ii} = ELC_i \qquad (3)$$

where ELC_i is the electronegativity of atom i, v_i' is used instead of v_i, and g_ii is the new diagonal element.

2. Multiple Bonds. A further extension enables improved differentiation of structures containing double and triple bonds:

$$\begin{aligned} v_i'' &= v_i, & v_j'' &= v_j, & b_{ij} &= 1 \\ v_i'' &= v_i + 1/2, & v_j'' &= v_j + 1/2, & b_{ij} &= 2 \\ v_i'' &= v_i + 1/3, & v_j'' &= v_j + 1/3, & b_{ij} &= 3 \end{aligned}$$

where b_ij is the type of bond connecting atoms i and j, and v_i'' is the new v_i.
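Equations 1-3 can be checked numerically. The following is a minimal Python sketch, not the authors' program: it builds the EA matrix from an edge list, applies the heteroatom rule of eqs 2 and 3 when electronegativities are supplied, and obtains the eigenvalues with a plain cyclic Jacobi iteration so that no external library is needed. The function names, the vertex numbering, and the choice of eigenvalue solver are all assumptions made for illustration; multiple bonds are not handled.

```python
import math


def symmetric_eigenvalues(m, tol=1e-12, max_sweeps=100):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in m]
    n = len(a)
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n)))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.hypot(1.0, theta))
                c = 1.0 / math.hypot(1.0, t)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def ea_matrix(edges, n, elc=None):
    """EA matrix from eq 1; eqs 2 and 3 are applied when elc is given."""
    deg = [0] * n
    for i, j in edges:
        deg[i - 1] += 1
        deg[j - 1] += 1
    # eq 2: v_i' = v_i * ELC_i (only when electronegativities are supplied)
    v = [d * e for d, e in zip(deg, elc)] if elc else [float(d) for d in deg]
    g = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        w = (v[i - 1] / v[j - 1] + v[j - 1] / v[i - 1]) / 2.0  # eq 1
        g[i - 1][j - 1] = g[j - 1][i - 1] = w
    if elc:
        for i in range(n):
            g[i][i] = elc[i]  # eq 3: g_ii = ELC_i
    return g


def ea_indices(g):
    """Return (EA_sum, EA_max) from the absolute eigenvalues of the EA matrix."""
    abs_eig = [abs(x) for x in symmetric_eigenvalues(g)]
    return sum(abs_eig), max(abs_eig)


# 2-methylbutane, assumed numbering: vertex 2 bonded to 1, 3, 4; vertex 3 to 5.
alkane = ea_indices(ea_matrix([(1, 2), (2, 3), (2, 4), (3, 5)], 5))
# Ethanol C-C-O with the Table 1 electronegativities (C 2.5, O 3.5).
ethanol = ea_indices(ea_matrix([(1, 2), (2, 3)], 3, elc=[2.5, 2.5, 3.5]))

if __name__ == "__main__":
    print("2-methylbutane:", alkane)
    print("ethanol:", ethanol)
```

Under these assumptions the sketch should give values close to those tabulated for 2-methylbutane (7.532390 and 2.657551) and ethanol (8.500000 and 4.466509); small differences in the last digits are to be expected from rounding.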

With v_i'' as the new v_i, and similar to eq 1, we have

$$g_{ij} = a_{ij}\,\frac{v_i''/v_j'' + v_j''/v_i''}{2} \qquad (4)$$

and the indices that reflect the influences of heteroatoms and/or multiple bonds are generated from the modified EA matrix. For instance, the indices EAΣ and EAmax of 2-butanol are 13.50000 and 5.149825, respectively.

Table 6. Two Topological Indices and Boiling Points of 37 Alcohols

no.  compound                   χ          ¹χ^v       EAΣ        EAmax      bp (°C)
1    methanol                   1.000000   0.447214   6.000000   4.169124   64.7
2    ethanol                    1.414214   1.023335   8.500000   4.466509   78.3
3    1-propanol                 1.914214   1.523335   11.00000   4.484931   97.2
4    2-propanol                 1.732051   1.412899   11.20988   5.358974   82.3
5    1-butanol                  2.414213   2.023335   13.50000   4.494513   117.7
6    2-butanol                  2.270056   1.950904   13.50000   5.149825   99.6
7    2-methyl-1-propanol        2.270056   1.879177   13.74893   5.169988   107.9
8    2-methyl-2-propanol        2.000000   1.723607   16.39906   6.616247   82.4
9    1-pentanol                 2.914213   2.523335   16.00000   4.500160   137.8
10   2-pentanol                 2.770055   2.450904   16.00000   5.138161   119.0
11   3-pentanol                 2.808060   2.488909   16.00000   4.957134   115.3
12   2-methyl-1-butanol         2.808060   2.417181   16.00000   4.963617   128.7
13   3-methyl-1-butanol         2.770056   2.379177   16.27547   5.147513   131.2
14   2-methyl-2-butanol         2.560660   2.284267   18.16119   6.279202   102.0
15   3-methyl-2-butanol         2.642734   2.323583   16.53668   5.391488   111.5
16   2,2-dimethyl-1-propanol    2.560660   2.169781   18.79945   6.409247   113.1
17   1-hexanol                  3.414213   3.023335   18.50000   4.503749   157.7
18   2-hexanol                  3.270055   2.950904   18.50000   5.135648   139.9
19   3-hexanol                  3.808060   2.988909   18.50000   4.946299   135.4
20   2-methyl-1-pentanol        3.808060   2.917181   18.50000   4.952079   148.0
21   3-methyl-1-pentanol        3.808060   2.917181   18.50000   4.935991   152.4
22   4-methyl-1-pentanol        3.270055   2.879176   18.78081   5.142475   151.8
23   2-methyl-2-pentanol        3.060660   2.784266   20.64311   6.271849   121.4
24   3-methyl-2-pentanol        3.180739   2.861588   18.69999   5.271415   134.2
25   4-methyl-2-pentanol        3.125897   2.806746   18.92743   5.299961   131.7
26   2-methyl-3-pentanol        3.180739   2.861588   18.85679   5.27677    126.5
27   3-methyl-3-pentanol        3.121320   2.844927   19.87670   5.927895   122.4
28   2-ethyl-1-butanol          3.346065   2.955186   18.50000   4.790382   146.5
29   2,2-dimethyl-1-butanol     3.121320   2.730441   20.57058   6.049779   136.8
30   2,3-dimethyl-1-butanol     3.180739   2.789860   19.04725   5.290365   149.0
31   3,3-dimethyl-1-butanol     3.060660   2.669781   21.30426   6.402860   143.0
32   2,3-dimethyl-2-butanol     2.943376   2.666983   20.69565   6.281723   118.6
33   3,3-dimethyl-2-butanol     2.943376   2.624224   21.25700   6.395639   120.0
34   1-heptanol                 3.914213   3.523335   21.00000   4.506150   176.3
35   1-octanol                  4.414214   4.023335   23.50000   4.507813   195.2
36   1-nonanol                  4.914214   4.523335   26.00000   4.508997   213.1
37   1-decanol                  5.414214   5.023335   28.50000   4.509859   230.2

EXAMINATION OF UNIQUENESS

The purpose of topological indices is to classify structures and to serve structure-property correlations. Nonuniqueness is not necessarily a disadvantage when correlation is the prime target, because compounds with similar or identical properties call for similar or identical index values. However, interest continues in trying to devise an index that would be unique. In this section we examine the selectivity of the new indices introduced in this study. To examine the selectivity of the EA indices, over 20 000 structures and graphs were tested. These structures and graphs include the following families.

(1) Acyclic alkane molecular graphs up to n = 16 (graphs with n > 16 were not tested): all 18 030 alkane isomers can be differentiated without degeneracy by using the EA indices.

(2) The structures and graphs selected by Randić⁸ for examining the uniqueness of ID numbers, containing monocyclic graphs up to n = 8 (112 cases), bicyclic graphs up to n = 7 (79 cases), all graphs on five vertices, sesquiterpenes (30 cases), and miscellaneous structures. The EA indices show a remarkable ability to discriminate among these structures and graphs. As an example, the EA index values and the corresponding skeletons for graphs 141-160 as numbered by Randić are shown in Table 2 and Figure 1, respectively. Evidently, all of these structures can be discriminated.

(3) To determine the uniqueness of the EA indices more rigorously, the cyclic graphs having n = 8 vertices, with the degree of each vertex being 4, offer novel comparisons. All 204 structures were enumerated with the program ISOM,⁹ developed in our laboratory for the elucidation of structures of organic compounds. The EA indices again show high selectivity: most of these very closely related structures can be discriminated, although there are some counterexample pairs having the same index values. As an example, Figure 2 shows some of these cyclic graphs, and Table 3 gives the corresponding values of EAΣ and EAmax; graphs 11 and 12, 13 and 14, 15 and 16, and 17 and 18 are degenerate pairs. Note that two structures are considered to be discriminated in this study if they do not have the same EAΣ and EAmax values simultaneously.

APPLICATIONS TO CORRELATION

Topological indices developed for the purpose of obtaining correlations with physicochemical properties and biological activities of chemical substances have been applied over a very extensive range. The current major applications include bibliographical species classification, physicochemical parameter evaluation, and pharmaceutical drug design. In this section, results are given for alkanes, alcohols, and barbiturates.

1. Alkanes. It is of interest to test methods on data for alkanes because good data are generally available for complete isomer sets. The first 150 alkanes (i.e., up to 10 carbon atoms) were used by Mihalić et al.¹⁰ to compare the performance of 10 distance indices and 2 connectivity indices in structure-boiling point correlations. In this paper, the same set of compounds except methane, i.e., 149 alkanes, was employed; the EA indices of all 149 alkanes were calculated and are listed in Table 4. For comparison, the index values calculated by other schemes, W,¹ Z,² χ,³ and J,⁴ are also given in Table 4. Table 5 summarizes the correlation coefficients, R, standard deviations, S, and F-test values for all of the indices. Evidently, the best result is achieved by using the EA indices.

2. Alcohols. Because the most interesting structures possessing activity are rather complex molecules with multiple bonds and/or heteroatoms, it is important that topological indices be able to correlate these kinds of molecules. Alcohols contain one heteroatom, oxygen, whose ELC is 3.5 (see Table 1). For comparison, besides the boiling points of 37 alcohols and their EA indices, χ and ¹χ^v are also given in Table 6. The results of the correlation analysis are shown in Table 7.

Table 7. Results of Relationships for 37 Alcohols

indices    eq                                          R        F         S
¹χ^v       bp/°C = 29.64 + 88.83 ¹χ^v                  0.9556             10.29
χ, ¹χ^v    bp/°C = -34.00 + 168.10χ - 127.98 ¹χ^v      0.9923   1088.86   4.39
EA         bp/°C = 132.14 + 7.70EAΣ - 26.36EAmax       0.9838   511.90    6.35

It is clear that good results can be obtained with the EA indices, though the results obtained using χ and ¹χ^v together are slightly superior to ours.

3. Barbiturates. Barbiturates were thought to be nonspecific narcotic agents principally because log P (P = partition coefficient in octanol-water) correlates very well with their biological potency.¹¹ Other studies¹² showed a dependence of the action of barbiturates upon chemical structure. Therefore, it was of interest to carry out a correlation analysis of log P and topological parameters. Correlations between the EA indices and log P for barbiturates have been revealed by this investigation. The EA indices and log P for 25 barbituric acid derivatives with structure I and structure II are listed in Table 8 and Table 9, respectively.

Structures I and II: barbituric acid skeletons bearing substituents R1, R2, and R3 (structure I) and R1 and R2 (structure II).

Table 8. log P and the EA Indices for Barbiturates with Structure I

no.  R1          R2          R3       EAΣ         EAmax      log P
1    ethyl       methyl      methyl   42.442200   5.588789   1.15
2    ethyl       ethyl       methyl   44.295792   5.355969   1.65
3    propyl      ethyl       methyl   46.774139   5.349703   2.15
4    isopropyl   ethyl       methyl   47.172211   5.443005   1.95
5    methyl      methyl      ethyl    42.610088   5.647092   1.15
6    ethyl       methyl      ethyl    44.549950   5.415510   1.65
7    propyl      methyl      ethyl    47.032730   5.406142   2.15
8    isopropyl   methyl      ethyl    46.949421   5.412628   1.95
9    methyl      propyl      methyl   44.936668   5.587068   1.65
10   ethyl       propyl      methyl   46.786049   5.354176   2.15
11   methyl      isopropyl   methyl   45.041950   5.610556   1.45
12   methyl      butyl       methyl   47.435841   5.586836   2.15
13   ethyl       butyl       methyl   49.283989   5.353878   2.65
14   ethyl       ethyl       propyl   49.262020   5.350319   2.65

Table 9. log P and the EA Indices for Barbiturates with Structure II

no.  R1         R2                    EAΣ         EAmax      log P
15   methyl     1-methyl-1-propenyl   40.068581   5.633902   0.65
16   ethyl      1-methyl-1-propenyl   41.963631   5.396735   1.15
17   propyl     1-methyl-1-propenyl   44.444069   5.390218   1.65
18   butyl      1-methyl-1-propenyl   46.940262   5.388160   2.15
19   methyl     1-methylvinyl         37.686401   5.675070   0.15
20   ethyl      1-methylvinyl         39.664139   5.447943   0.65
21   propyl     1-methylvinyl         42.148720   5.441714   1.15
22   butyl      1-methylvinyl         44.646042   5.440750   1.65
23   isobutyl   1-methylvinyl         44.718639   5.461265   1.45
24   amyl       1-methylvinyl         47.145580   5.440601   2.15
25   isoamyl    1-methylvinyl         47.212662   5.459260   1.95

The results of the correlation analysis between the EA indices and log P values of these compounds are

$$\log P = -2.8302 + 0.1900\,EA\Sigma - 0.7395\,EA_{max}$$

n = 25, R = 0.9910, F = 602.8965, S = 0.0861
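To show how the fitted equation is used, the short Python sketch below evaluates it for three of the tabulated barbiturates (compounds 1 and 13 from Table 8 and compound 15 from Table 9) and compares the result with the listed log P. The function name `predict_log_p` and the choice of rows are assumptions made for illustration; the coefficients are those of the equation above.

```python
def predict_log_p(ea_sum, ea_max):
    """log P from the two EA indices, using the reported regression coefficients."""
    return -2.8302 + 0.1900 * ea_sum - 0.7395 * ea_max


# (compound no., EA_sum, EA_max, tabulated log P) taken from Tables 8 and 9.
ROWS = [
    (1, 42.442200, 5.588789, 1.15),
    (13, 49.283989, 5.353878, 2.65),
    (15, 40.068581, 5.633902, 0.65),
]

RESIDUALS = {no: obs - predict_log_p(s, m) for no, s, m, obs in ROWS}

if __name__ == "__main__":
    for no, s, m, obs in ROWS:
        print(no, round(predict_log_p(s, m), 3), obs)
```

For these three rows the residuals stay below 0.1 log unit, which is in line with the reported standard deviation S = 0.0861.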

ACKNOWLEDGMENT

The financial support of the Natural Science Foundation of China is gratefully acknowledged.

REFERENCES AND NOTES

(1) Wiener, H. Structural Determination of Paraffin Boiling Points. J. Am. Chem. Soc. 1947, 69, 17-20.
(2) Hosoya, H. Topological Index. A Newly Proposed Quantity Characterizing the Topological Nature of Structural Isomers of Saturated Hydrocarbons. Bull. Chem. Soc. Jpn. 1971, 44, 2332-2339.
(3) Randić, M. On Characterization of Molecular Branching. J. Am. Chem. Soc. 1975, 97, 6609-6615.
(4) Balaban, A. T. Topological Indices Based on Topological Distances in Molecular Graphs. Pure Appl. Chem. 1983, 55, 199-206.
(5) Kier, L. B.; Hall, L. H. Molecular Connectivity in Chemistry and Drug Research; Academic: New York, 1976.
(6) Randić, M.; Trinajstić, N. In search for graph invariants of chemical interest. J. Mol. Struct. (THEOCHEM) 1993, 300, 551-571.
(7) Randić, M.; Trinajstić, N. Viewpoint 4-Comparative structure-property studies: the connectivity basis. J. Mol. Struct. (THEOCHEM) 1993, 284, 209-221.
(8) Randić, M. On Molecular Identification Numbers. J. Chem. Inf. Comput. Sci. 1984, 24, 164-175.
(9) Hu, C.-Y.; Xu, L. Studies on Expert System for the Elucidation of the Structures of Organic Compounds-Structural Generator of ESESOC-II. Sci. China, in press.
(10) Mihalić, Z.; Nikolić, S.; Trinajstić, N. Comparative Study of Molecular Descriptors Derived from the Distance Matrix. J. Chem. Inf. Comput. Sci. 1992, 32, 28-37.
(11) Hansch, C.; Anderson, S. M. Structure-Activity Relation in Barbiturates and Its Similarity to That in Other Narcotics. J. Med. Chem. 1967, 10, 745-753.
(12) Basak, S. C.; Monsrud, L. J.; Rosen, M. E.; Frane, C. M.; Magnuson, V. R. A Comparative Study of Lipophilicity and Topology Indexes in Biological Correlation. Acta Pharm. Jugosl. 1986, 36, 81-95.