J. Phys. Chem. 1985, 89, 2195-2197


Ab Initio Computation of Molecular Similarity

Philippa E. Bowen-Jenkins,* David L. Cooper, and W. Graham Richards

Physical Chemistry Laboratory, South Parks Road, Oxford OX1 3QZ, England (Received: August 8, 1984)

A quantitative, ab initio method for the assessment of molecular similarity is presented. This procedure was developed with the ultimate aim of using the molecular similarity indices so obtained to help predict biological activities of larger molecules. In order to evaluate the method, three model systems were studied: hydrogen molecules; the isoelectronic pairs CO/N₂ and CO₂/N₂O; and the series of simple isosteres CH₃CH₂CH₃, CH₃OCH₃, and CH₃SCH₃.

Introduction

The concept of molecular similarity is important both scientifically and commercially. This is exemplified by the dependence of the pharmaceutical industry on the design of novel compounds which simulate the behavior of known compounds. Such design is often carried out by a procedure based on the idea of bioisosterism.¹ Bioisosterism is a term which groups together physical and chemical properties believed to be important in determining biological activities, such as molecular size, shape, and electron distribution. In view of this, the criterion chosen for assessment of molecular similarity is electron density distribution; comparison of molecular size and shape is, to a certain extent, intrinsic to this choice since the nuclear positions determine electron distribution in molecules. This property is an obvious choice, since chemical and biological activity depends critically on electron distribution. A suitable index of molecular similarity is defined by the equation

$$R_{AB} = \frac{\int \rho_A \rho_B \, dv}{\left(\int \rho_A^2 \, dv\right)^{1/2} \left(\int \rho_B^2 \, dv\right)^{1/2}} \qquad (1)$$

where $\rho_A$ is the electron density function for molecule A and integration is over all space. It is apparent that values of $R_{AB}$ will lie between zero and unity, with $R_{AB} = 1$ indicating perfect similarity of the species being compared. This definition was suggested by Carbo et al.,² who did not, however, compute electron densities accurately from molecular wave functions but used a CNDO-type approximation.
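The behavior of the index can be checked numerically on a toy case. The following is a minimal sketch, not the procedure used in this work: it assumes two concentric, spherically symmetric model densities of 1s type, exp(-2ζr), so that the all-space integrals reduce to radial quadratures. The function names, the exponents, and the quadrature settings are illustrative assumptions.

```python
import math

def radial_integral(f, r_max=40.0, steps=40000):
    """Trapezoidal estimate of the all-space integral of a spherically
    symmetric function f(r), i.e. the integral of 4*pi*r^2*f(r) dr."""
    h = r_max / steps
    total = 0.0
    for k in range(steps + 1):
        r = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * 4.0 * math.pi * r * r * f(r)
    return total * h

def similarity_index(rho_a, rho_b):
    """Overlap of the two densities divided by the geometric mean of
    their self-overlaps (the index defined in the text)."""
    overlap_ab = radial_integral(lambda r: rho_a(r) * rho_b(r))
    overlap_aa = radial_integral(lambda r: rho_a(r) ** 2)
    overlap_bb = radial_integral(lambda r: rho_b(r) ** 2)
    return overlap_ab / math.sqrt(overlap_aa * overlap_bb)

def model_density(zeta):
    """Hypothetical 1s-type density exp(-2*zeta*r); the prefactor is
    omitted because any constant scaling cancels in the index."""
    return lambda r: math.exp(-2.0 * zeta * r)

rho_a = model_density(1.0)   # assumed exponent, for illustration only
rho_b = model_density(2.0)   # assumed exponent, for illustration only

r_same = similarity_index(rho_a, rho_a)
r_diff = similarity_index(rho_a, rho_b)
# Closed form for this concentric model: 8*(za*zb)^(3/2) / (za + zb)^3
r_exact = 8.0 * (1.0 * 2.0) ** 1.5 / (1.0 + 2.0) ** 3
print(r_same, r_diff, r_exact)
```

Identical densities give an index of unity, and any mismatch lowers it. The upper bound follows from the Cauchy-Schwarz inequality, and the lower bound from the fact that electron densities are nowhere negative.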

Derivation of an Approximate Expression for the Similarity Index

The central idea of Carbo's method,² that an atomic electron density distribution could be approximated as the product of an effective atomic charge and the square of an ns Slater function, is preserved in this derivation. Slater suggested that the radial part of a one-electron wave function could be taken to be

$$\chi(r) = N r^{n-1} \exp(-\zeta r), \qquad N = \frac{(2\zeta)^{n+1/2}}{[(2n)!]^{1/2}}$$

where $n$ is the effective principal quantum number, and $\zeta$ is an arbitrary positive number. The squaring of this function produces a nonnormalized function of similar form:

$$\chi^2(r) = N^2 r^{n'-1} \exp(-\zeta' r)$$

with

$$N^2 = \frac{(2\zeta)^{2n+1}}{(2n)!}$$

where $\zeta' = 2\zeta$ and $n' = 2n - 1$. Approximating the electron density functions in the same way as Carbo implies that

$$\int \rho_A \rho_B \, dv = \sum_{i \in A} \sum_{j \in B} Q_i^A Q_j^B \int \chi_i^2 \chi_j^2 \, r^2 \, dr$$

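where $Q_i^A$ is the effective charge on atom $i$ in molecule A. It follows that

$$R_{AB} = \frac{\sum_{i \in A} \sum_{j \in B} Q_i^A Q_j^B I_{ij}}{\left(\sum_{i \in A} \sum_{k \in A} Q_i^A Q_k^A I_{ik}\right)^{1/2} \left(\sum_{j \in B} \sum_{l \in B} Q_j^B Q_l^B I_{jl}\right)^{1/2}} \qquad (2)$$

where we have defined

$$I_{ij} = \int \chi_i^2 \chi_j^2 \, r^2 \, dr$$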
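The Slater radial function and the effect of squaring it can be verified with a short numerical check. This is a sketch under stated assumptions: the function names are hypothetical, the gamma function stands in for the factorial so that non-integer effective quantum numbers are also covered, and the values of n and ζ in the usage lines are arbitrary.

```python
import math

def slater_norm(n, zeta):
    """N = (2*zeta)^(n + 1/2) / sqrt((2n)!); gamma(2n + 1) equals (2n)!
    for integer n and extends it to non-integer n."""
    return (2.0 * zeta) ** (n + 0.5) / math.sqrt(math.gamma(2.0 * n + 1.0))

def slater(n, zeta, r):
    """Radial Slater function chi(r) = N r^(n-1) exp(-zeta r)."""
    return slater_norm(n, zeta) * r ** (n - 1) * math.exp(-zeta * r)

def slater_squared_form(n, zeta, r):
    """chi(r)^2 written as N^2 r^(n'-1) exp(-zeta' r)
    with n' = 2n - 1 and zeta' = 2*zeta."""
    n_prime = 2 * n - 1
    zeta_prime = 2.0 * zeta
    return (slater_norm(n, zeta) ** 2 * r ** (n_prime - 1)
            * math.exp(-zeta_prime * r))

def radial_norm(n, zeta, r_max=60.0, steps=60000):
    """Trapezoidal estimate of the integral of chi(r)^2 r^2 dr on [0, r_max]."""
    h = r_max / steps
    total = 0.0
    for k in range(steps + 1):
        r = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * slater(n, zeta, r) ** 2 * r * r
    return total * h

# Arbitrary illustrative parameters, not values taken from the paper.
norm_2s = radial_norm(2, 1.3)
direct = slater(2, 1.3, 0.7) ** 2
rewritten = slater_squared_form(2, 1.3, 0.7)
print(norm_2s, direct, rewritten)
```

The quadrature confirms that N normalizes χ with respect to r² dr, and the pointwise comparison confirms that the squared function keeps the Slater form with the doubled exponent and the shifted power of r.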

Calculations and Results

Geometries of the simple systems studied were obtained from a standard compilation,³ and wave functions were obtained by using the GAUSSIAN 80⁴ package; minimal STO-3G basis sets were used in all cases. Three model systems were studied: hydrogen molecules; the isoelectronic pairs CO/N₂ and CO₂/N₂O; and the series of simple isosteres CH₃CH₂CH₃, CH₃OCH₃, and CH₃SCH₃.

Electron density functions were calculated directly from the wave functions and $R_{AB}$ was evaluated according to eq 1. For comparison of much larger molecules, this approach would be rather expensive and the CNDO-type approach might be more viable. Effective atomic charges were obtained from Mulliken population analyses of the GAUSSIAN 80 wave functions, and $R_{AB}$ was evaluated according to eq 2.

The quality of the wave function could have been improved by using a larger basis set or by the inclusion of electron correlation. However, it was decided that an STO-3G basis was adequate for comparing the ab initio and CNDO-type estimations of the molecular similarity index, and for the other numerical comparisons presented here. Indeed, for the larger molecules in which we are ultimately interested, it would not be feasible to perform very high accuracy calculations.

It is apparent from the definition of the similarity index that its value depends on the relative disposition of the molecules being compared. Hydrogen molecules were chosen as a suitably simple example for investigating the variation of the index as identical molecules are moved away from coincident positions. The results for relative motion along the bond axes are shown in Figure 1; the results for perpendicular motion are very similar, but the similarity indices decrease slightly faster. In addition, the ab initio values of $R_{AB}$ decrease faster than those from the CNDO-type approximation.

[Figure 1. Similarity indices for hydrogen molecule comparisons with relative motion along the bond axes. Panel title: relative motion along coincident axes. Horizontal axis: distance between centroids (Å).]

(1) C. W. Thornber, Chem. Soc. Rev., 7, 563 (1979).
(2) R. Carbo, L. Leyda, and M. Arnau, Int. J. Quantum Chem., 17, 1185 (1980).
(3) "Tables of Interatomic Distances and Configurations in Molecules and Ions", L. E. Sutton, Ed., The Chemical Society, London, 1958, Spec. Publ. No. 11; Supplement (1956-1959), L. E. Sutton, Ed., The Chemical Society, London, 1965, Spec. Publ. No. 18.
(4) J. S. Binkley, R. A. Whiteside, R. Krishnan, R. Seeger, D. J. DeFrees, H. B. Schlegel, S. Topiol, L. R. Kahn, and J. A. Pople, QCPE, 13, 406 (1981).

0022-3654/85/2089-2195$01.50/0 © 1985 American Chemical Society

The Journal of Physical Chemistry, Vol. 89, No. 11, 1985

[Figure 2. CO₂/N₂O comparisons using the ab initio value of $R_{AB}$. Horizontal axis: distance between centroids (Å).]

TABLE I: Values of the Similarity Index for Pairs of Isoelectronic Molecules (a)

             method of    ab initio            CNDO-type
pair         matching     R_AB       Δ/Å       R_AB       Δ/Å
CO/N₂        1a           0.913       0.02     0.93        0.02
             1b           0.941      -0.02     0.93       -0.02
             2            0.946       0.00     0.93        0.00
             3            0.948       0.01     0.93        0.02
CO₂/N₂O      1a           0.959       0.00     0.97        0.00
             1b           0.908       0.03     0.96        0.03
             1c           0.963       0.00     0.96        0.00
             2            0.963       0.00     0.97        0.00
             3            0.965       0.01     0.97        0.00

(a) The methods of matching are defined in the text. a, b, and c refer to different choices of atoms superimposed. Δ is the separation between centroids.

Calculations were then performed on isoelectronic pairs of molecules so as to gain an idea of the values of the similarity index obtained from matching species with similar structures and also so as to investigate the effect on the index of various methods of molecular matching. CO is compared with N₂, and CO₂ with N₂O. The molecules were placed along the same axis and three different methods of molecular matching were used: (1) selected pairs of atoms were directly superimposed; (2) centroids of the molecules were superimposed; (3) the relative disposition of the molecules was adjusted until $R_{AB}$ was maximized. Results of these calculations are listed in Table I. The ab initio value of the similarity index falls off more rapidly with displacement from the optimum overlap position than does the more approximate value. This is because the core orbitals present in the ab initio approach are characterized by large scale factors which minimize the values of the integrals to which they contribute. The difference between the two approaches would be expected to increase for molecules containing atoms of higher atomic number. The variation in the magnitude of the (ab initio) similarity index near the position of
maximum overlap is shown for the N₂O/CO₂ comparison in Figure 2.

Molecules which are similar in size and shape are termed isosteres.⁵ Calculations were performed on the simple isosteres CH₃CH₂CH₃, CH₃OCH₃, and CH₃SCH₃ in order to find typical values of $R_{AB}$ for such comparisons, and in order to investigate further different criteria for matching molecules. This series was considered particularly interesting in that the bioisosteric replacement of the CH₂ group by a sulfur atom has been exploited commercially. This is illustrated by the development of the histamine H₂ antagonists metiamide and cimetidine from burimamide.⁶ Three methods for molecular matching were considered: (1) direct superposition of the central atoms and least-squares matching of the positions of the remaining atoms; (2) least-squares matching of the positions of all the main atoms; (3) translation of the molecules along the principal axes until $R_{AB}$ reached a maximum. The program CRYSTALS⁷ was used for the second method. First the molecules were positioned so that their centroids coincided, as it has been shown that the best fit is likely to occur when this condition is fulfilled. The positions of specified atoms in each molecule were then adjusted such that the mean-square deviation was minimized.⁸

The results of the various calculations are listed in Table II and it is again clear that the ab initio value decreases much more rapidly with increased separation Δ of the molecules than does the CNDO-type value of $R_{AB}$. The CNDO-type value is insensitive to the method of matching; the ab initio value is remarkably sensitive to Δ for the (CH₃)₂O/(CH₃)₂S comparison. The variation in the magnitude of the (CNDO-type) similarity index near the position of maximum overlap is shown for the (CH₃)₂CH₂ and (CH₃)₂S pair in Figure 3.

TABLE II: Values of the Similarity Index for Isosteric Molecules (a)

                        method of    ab initio           CNDO-type
pair                    matching     R_AB       Δ/Å      R_AB       Δ/Å
(CH₃)₂O/(CH₃)₂CH₂       1            0.660      0.05     0.87       0.05
                        2            0.641      0.00     0.90       0.00
                        3            0.673      0.03     0.90       0.00
(CH₃)₂O/(CH₃)₂S         1            0.643      0.20     0.83       0.20
                        2            0.099      0.00     0.84       0.00
                        3            0.643      0.20     0.95       0.10
(CH₃)₂CH₂/(CH₃)₂S       1            0.459      0.16     0.87       0.16
                        2            0.165      0.00     0.90       0.00
                        3            0.459      0.16     0.90       0.00

(a) The methods of matching are defined in the text. Δ is the separation between centroids.

[Figure 3. Comparison between (CH₃)₂CH₂ and (CH₃)₂S using the CNDO-type value of $R_{AB}$. Horizontal axis: distance between centroids (Å).]

(5) I. Langmuir, J. Am. Chem. Soc., 41, 865 (1919).
(6) W. G. Richards, "Quantum Pharmacology", 2nd ed., Butterworth, London, 1983, p 67.
(7) D. J. Watkin, program CRYSTALS, Chemical Crystallography Laboratory, Oxford.

Discussion

In the comparison of isoelectronic pairs of molecules, the CNDO-type approach (eq 2) gives values for the similarity index close to the ab initio values (eq 1). However, the two sets of predictions for the isostere series are very different. This would suggest that the CNDO-type approach gives a poor approximation to eq 1 and does not give a reliable measure of molecular similarity. An alternative procedure, which is potentially more useful than electron density comparisons, is to compare electrostatic potentials or even electric field strengths. These might predict initial interactions between a given drug and its receptor semiquantitatively. Moreover, since they take both positive and negative values (unlike $\rho$) they might provide a more sensitive indication of molecular similarity.
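The displacement scans described under Calculations and Results, and the matching method in which one molecule is translated until the index is largest, can be illustrated with a small model. The sketch below is not the calculation reported here: it swaps the STO-3G densities for sums of spherical Gaussians, for which the overlap integral has a simple closed form. The coefficients and exponents are invented for illustration, and only the 0.74 Å bond length is meant to resemble a hydrogen molecule.

```python
import math

# A model density is a list of (coefficient, exponent, z_position) terms,
# each a spherical Gaussian coefficient * exp(-exponent * |r - centre|^2)
# centred on the z axis (positions in angstroms).

def overlap(density_a, density_b):
    """Analytic all-space integral of the product of two Gaussian-sum densities."""
    total = 0.0
    for ca, alpha, za in density_a:
        for cb, beta, zb in density_b:
            p = alpha + beta
            total += (ca * cb * (math.pi / p) ** 1.5
                      * math.exp(-alpha * beta / p * (za - zb) ** 2))
    return total

def similarity_index(density_a, density_b):
    """Overlap normalized by the geometric mean of the self-overlaps."""
    return overlap(density_a, density_b) / math.sqrt(
        overlap(density_a, density_a) * overlap(density_b, density_b))

def shifted(density, dz):
    """Copy of the density translated by dz along the z axis."""
    return [(c, a, z + dz) for c, a, z in density]

def scan(density_a, density_b, displacements):
    """Similarity index of A against B translated by each displacement."""
    return [(dz, similarity_index(density_a, shifted(density_b, dz)))
            for dz in displacements]

def diatomic(terms_per_atom, bond_length=0.74):
    """Two identical atoms placed symmetrically about the origin."""
    half = bond_length / 2.0
    return [(c, a, z) for z in (-half, half) for c, a in terms_per_atom]

# Hypothetical parameters: one diffuse term per atom, and the same plus a
# tight term standing in for a core orbital with a large scale factor.
diffuse = diatomic([(1.0, 1.0)])
with_core = diatomic([(1.0, 1.0), (5.0, 20.0)])

displacements = [0.05 * k for k in range(-6, 7)]   # -0.30 to 0.30 angstroms
curve_diffuse = scan(diffuse, diffuse, displacements)
curve_core = scan(with_core, with_core, displacements)

# Matching by maximizing the index over the scanned translations.
best_dz, best_r = max(curve_core, key=lambda point: point[1])
print(best_dz, best_r)
```

For identical molecules the maximum sits at zero displacement with an index of unity. The model that includes the tight term falls off faster with displacement than the diffuse-only model, which is the qualitative behavior attributed in the text to core orbitals.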

Acknowledgment. These studies were in part conducted pursuant to a contract with the National Foundation for Cancer Research. We thank the SERC for computing facilities.

Registry No. CO, 630-08-0; N₂, 7727-37-9; CO₂, 124-38-9; N₂O, 10024-97-2; (CH₃)₂CH₂, 74-98-6; (CH₃)₂O, 115-10-6; (CH₃)₂S, 75-18-3; H₂, 1333-74-0.

(8) D. J. Watkin, Acta Crystallogr., Sect. A, 36, 975 (1980).