J. Phys. Chem. 1992, 96, 10276-10284


Modeling Amino Acid Side Chains. 1. Determination of Net Atomic Charges from ab Initio Self-Consistent-Field Molecular Electrostatic Properties

Christophe Chipot, Bernard Maigret,* Jean-Louis Rivail,*
Laboratoire de Chimie Théorique, Unité de Recherche Associée au CNRS No. 510, Université de Nancy I, B.P. 239, 54506 Vandoeuvre-lès-Nancy Cedex, France

and Harold A. Scheraga*
Baker Laboratory of Chemistry, Cornell University, Ithaca, New York 14853-1301

(Received: July 21, 1992)

A comprehensive study of the naturally occurring amino acid side chains is presented. This includes a full ab initio SCF geometry optimization of each structure, using the 6-31G** split-valence basis set, and the derivation of net atomic charges from the molecular electrostatic field and potential. The fitting procedure is analyzed critically, and the grid sensitivity is suppressed by avoiding the effects of electron cloud penetration. Charges obtained from the fits reproduced the SCF dipole moments reliably, with no significant errors. However, in some specific cases, the inclusion of fictitious atom charges appeared to be necessary.

Introduction

Most of the currently available force fields for treating proteins involve pairwise potentials in which electrostatic interactions are represented by means of an atom-centered monopole expression. Together with the nonbonded van der Waals terms, this component appears to be the most important one in the description of intermolecular interactions. For this reason, net atomic charges are a conceptually valuable tool and have recently received a great deal of attention. In this paper, we present the results of an ab initio study of the naturally occurring amino acid side chains, including a full 6-31G** geometry optimization and the determination of net atomic charges from accurate SCF wave functions. An extra hydrogen atom was added to each side chain -R of the residue -NH-CHR-CO- in order to create a complete model molecule. With this approach, the glycine side chain merely reduces to H2; calculations for this system were therefore omitted.

Geometry Optimization

Since molecular electrostatic properties depend strongly on the molecular charge distribution, and consequently on the positions of the nuclei, it is necessary to use geometries that are as accurate as possible. However, because experimental geometries are not available for some of the side chains, all structures were optimized at the 6-31G** level for consistency. This polarization basis set appears to yield fairly good equilibrium structures, reasonably close to experimental data. Although most of the side chains will be considered as neutral species, some of them will be treated in their ionic form. Computations for anions require particular care, since standard basis sets do not properly describe systems in which most of the valence electron density is assigned to diffuse lone pairs or antibonding orbitals. It has been shown, however, that the incorporation of diffuse functions gives better results than standard basis sets, because charges are spread out farther from the nuclei. For this reason, the charged aspartate and glutamate side-chain geometries will be optimized with the 6-31++G** basis set.

Derivation of Net Atomic Charges from ab Initio SCF Molecular Electrostatic Properties

Charges computed from a Mulliken population analysis²⁸ turn out to be rather unsatisfactory,¹² especially for classical molecular dynamics (MD) or Monte Carlo (MC) simulations, because they do not reproduce the electrostatic potential and its derivatives well.¹² Various approaches have subsequently been proposed to calculate point-charge models from SCF wave functions. These schemes may be divided roughly into three categories, viz., distributed multipolar expansions, electrostatic potential- and field-derived methods, and partial equalization of orbital electronegativity.

Method

The method employed in the present work is very similar to the one developed by Cox and Williams. It consists of fitting the molecular electrostatic potential,⁷,¹¹⁻¹⁹ or field, to a series of point charges placed at the atomic centers of a given molecule, using a least-squares calculation. Although it is not a requirement of the fitting procedure, a constraint has been imposed by means of a Lagrange multiplier¹¹ technique, so that the sum of the atomic charges is equal to the total molecular charge. Unlike atomic charges, the electrostatic potential $V^{SCF}(\mathbf{r})$ and field $\mathbf{E}^{SCF}(\mathbf{r})$ are molecular properties readily available from an SCF computation:

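$$V^{SCF}(\mathbf{r}) = \sum_{\alpha} \frac{Z_\alpha}{|\mathbf{r} - \mathbf{R}_\alpha|} - \sum_{\mu\nu} P_{\mu\nu} \left\langle \chi_\mu \left| \frac{1}{|\mathbf{r} - \mathbf{r}'|} \right| \chi_\nu \right\rangle \quad (1)$$

$$\mathbf{E}^{SCF}(\mathbf{r}) = \sum_{\alpha} Z_\alpha \frac{\mathbf{r} - \mathbf{R}_\alpha}{|\mathbf{r} - \mathbf{R}_\alpha|^{3}} - \sum_{\mu\nu} P_{\mu\nu} \left\langle \chi_\mu \left| \frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^{3}} \right| \chi_\nu \right\rangle \quad (2)$$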

where $Z_\alpha$ is the nuclear charge on atom $\alpha$ centered at $\mathbf{R}_\alpha$, and $P_{\mu\nu}$ is an element of the first-order density matrix for the primitive Gaussian functions $|\chi_\mu\rangle$ and $|\chi_\nu\rangle$. The sampling algorithm is equivalent to the one adopted by Cox and Williams. The selection of points for the fitting procedure implies the construction of a parallelepiped-shaped grid of regularly spaced points. The dimensions of the grid containing the molecule involve an exclusion radius on all sides. Points found to lie farther than $r_{max}$ (an arbitrarily chosen distance defining the grid boundaries) from any nucleus are discarded. Points inside the so-called van der Waals envelope are also rejected. In order to avoid penetration of electron clouds, which generally perturbs the point-charge representation of the SCF electrostatic potential, or field, at short distances, the size of this van der Waals envelope has been over-dimensioned. More sophisticated methods have, however, been proposed to overcome this critical problem, e.g., distributed multipole analysis (DMA), overlap and localized multipole expansions (OMTP and LMTP), and overlap multipolar expansion (OME). The density of points used for the fit may be modified by adjusting the grid step Δr. Nevertheless, as shown in Table I, this density has no noticeable influence on the fitted net atomic charges, as long as the space surrounding the molecule is sampled properly.²³

© 1992 American Chemical Society. The Journal of Physical Chemistry, Vol. 96, No. 25, 1992

TABLE I: Influence of Point Density on Potential- (qV) and Field- (qE) Derived Net Atomic Chargesᵃ and Dipole Moments of Four Amino Acid Side Chains, Using the Split-Valence 6-31G** Basis Set

Alanine
                      Δr = 0.5 Å, Npnt = 2263    Δr = 0.4 Å, Npnt = 4430
atomᵇ                    qV         qE              qV         qE
C                      -0.561     -0.570          -0.561     -0.569
H1                      0.140      0.143           0.140      0.142
H2                      0.140      0.143           0.140      0.142
H3                      0.140      0.143           0.140      0.142
H4                      0.140      0.143           0.140      0.142
rmsdᶜ (10⁻³ au)         0.049      0.048           0.049      0.047
μ (D)                   0.000      0.000           0.000      0.000

Arginine
                      Δr = 0.5 Å, Npnt = 4434    Δr = 0.4 Å, Npnt = 8650
atom                     qV         qE              qV         qE
C1                     -0.425     -0.455          -0.416     -0.451
H1                      0.138      0.150           0.135      0.149
H2                      0.116      0.128           0.114      0.127
H3                      0.116      0.128           0.114      0.127
C2                      0.203      0.133           0.199      0.133
H4                      0.007      0.034           0.008      0.034
H5                      0.007      0.034           0.008      0.034
C3                     -0.005     -0.006          -0.004     -0.011
H6                      0.080      0.094           0.080      0.095
H7                      0.080      0.094           0.080      0.095
N1                     -0.488     -0.583          -0.488     -0.577
H8                      0.322      0.353           0.322      0.352
C4                      0.909      1.009           0.905      1.004
N2                     -1.085     -1.099          -1.080     -1.096
N3                     -0.902     -0.962          -0.899     -0.961
H9                      0.507      0.504           0.506      0.503
H10                     0.515      0.512           0.513      0.512
H11                     0.466      0.478           0.466      0.478
H12                     0.438      0.454           0.437      0.454
rmsd (10⁻³ au)          0.219      0.184           0.218      0.186
μ (D)                   5.782      5.769           5.782      5.772

Cystine
                      Δr = 0.5 Å, Npnt = 2234    Δr = 0.4 Å, Npnt = 5548
atom                     qV         qE              qV         qE
C1                     -0.488     -0.514          -0.489     -0.520
S1                     -0.085     -0.080          -0.085     -0.079
S2                     -0.085     -0.080          -0.085     -0.079
C2                     -0.488     -0.514          -0.489     -0.520
H1                      0.214      0.218           0.214      0.219
H2                      0.182      0.190           0.182      0.191
H3                      0.177      0.186           0.178      0.188
H4                      0.214      0.218           0.214      0.219
H5                      0.182      0.190           0.182      0.191
H6                      0.177      0.186           0.178      0.188
rmsd (10⁻³ au)          0.803      0.529           0.801      0.529
μ (D)                   2.362      2.351           2.363      2.347

Histidine D
                      Δr = 0.5 Å, Npnt = 3295    Δr = 0.4 Å, Npnt = 6386
atom                     qV         qE              qV         qE
C1                     -0.693     -0.655          -0.713     -0.662
H1                      0.205      0.193           0.211      0.195
H2                      0.205      0.193           0.211      0.195
H3                      0.188      0.180           0.194      0.182
C2                      0.229      0.237           0.229      0.238
N1                     -0.478     -0.475          -0.477     -0.479
C3                      0.247      0.226           0.243      0.226
N2                     -0.573     -0.559          -0.576     -0.563
C4                      0.046      0.026           0.056      0.033
H4                      0.380      0.380           0.381      0.382
H5                      0.121      0.128           0.122      0.129
H6                      0.123      0.126           0.119      0.123
rmsd (10⁻³ au)          0.391      0.303           0.390      0.301
μ (D)                   4.037      4.039           4.039      4.045

ᵃ In electronic charge units (ecu). ᵇ Indices given in Figure 1. ᶜ Root mean square deviation between the ab initio computed and the point-charge-derived electrostatic potentials and fields.

Computational Details - Algorithm

One of the major drawbacks of the Lagrange multiplier method introduced by Chirlian and Francl¹¹ is that the net atomic charges do not always reflect the symmetry of the molecule. Various schemes have been proposed to remedy this deficiency, viz., the addition of further constraints and Lagrange multipliers on dipole and quadrupole moments¹⁶ when using highly symmetric molecules. Another, very simplistic, solution consists of evaluating the mean charge of equivalent atoms. However, such techniques may sometimes substantially modify the dipole moment as well as the rms (root mean square) deviation between the ab initio computed and the point-charge-derived electrostatic potentials and fields. In our alternative approach, we have carried out the fit on the families of symmetry-related atoms rather than on the individual atoms themselves.
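The point-selection rules given in the Method section (a regular rectangular grid, rejection of points inside an enlarged van der Waals envelope, rejection of points farther than r_max from every nucleus) can be sketched as follows. This is a minimal illustration, not the authors' program: the function name `select_grid_points`, its arguments, and the van der Waals radii passed in are hypothetical, and the reading of "farther than r_max from any nucleus" as "farther than r_max from the nearest nucleus" is an assumption.

```python
import numpy as np


def select_grid_points(coords, vdw_radii, step=0.5, r_max=4.0, vdw_scale=2.0):
    """Return the grid points kept for the fit (all lengths in angstroms).

    coords    : (N, 3) nuclear positions
    vdw_radii : (N,) van der Waals radii (caller-supplied, hypothetical values)
    step      : grid spacing (the text uses 0.5 and 0.4)
    r_max     : exclusion radius defining the grid boundaries
    vdw_scale : enlargement factor of the van der Waals envelope
    """
    coords = np.asarray(coords, dtype=float)
    radii = vdw_scale * np.asarray(vdw_radii, dtype=float)

    # Rectangular box containing the molecule, padded by r_max on all sides.
    lo = coords.min(axis=0) - r_max
    hi = coords.max(axis=0) + r_max
    axes = [np.arange(lo[k], hi[k] + 0.5 * step, step) for k in range(3)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)

    # Distance from every grid point to every nucleus.
    dist = np.linalg.norm(grid[:, None, :] - coords[None, :, :], axis=-1)

    # Reject points inside the enlarged envelope of any atom.
    outside_envelope = np.all(dist >= radii[None, :], axis=1)
    # Reject points farther than r_max from the nearest nucleus (assumed reading).
    within_reach = dist.min(axis=1) <= r_max
    return grid[outside_envelope & within_reach]
```

Reducing `step` increases the number of retained points without changing the sampled region, which is the kind of comparison reported in Table I.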

Net Atomic Charges Derived from Molecular Electrostatic Potential

The least-squares-fit criterion is satisfied when the minimum of the following function is reached:

$$Y(q_1, q_2, \ldots, q_{N_{fam}}) = \sum_{p=1}^{N_{pnt}} \left( V_p^{SCF} - V_p^{Q} \right)^2 \quad (3)$$

where the $q_I$ are the atomic charges that are optimized, $N_{pnt}$ denotes the number of selected points for the fitting procedure, and $N_{fam}$ is the number of families of equivalent atoms. $V_p^{SCF}$ is the calculated electrostatic potential for a point $p$, at a position $\mathbf{r}_p$, whereas $V_p^{Q}$ is the electrostatic potential at point $p$, in the atom-centered monopole approximation:

[eq 4]

where $N_{eq}$ corresponds to the number of equivalent atoms of a given family $I$. In order to ensure that the sum of the overall net atomic charges be equal to the total molecular charge, a constraint is imposed on eq 3:

[eq 5, defining the constraint function $G(q_1, q_2, \ldots, q_{N_{fam}})$ in terms of $q_{total}$]

The actual function to be minimized may therefore be written as

$$Z(q_1, q_2, \ldots, q_{N_{fam}}) = Y(q_1, q_2, \ldots, q_{N_{fam}}) + \lambda\, G(q_1, q_2, \ldots, q_{N_{fam}}) \quad (6)$$

where $\lambda$ denotes the Lagrange multiplier. The net atomic charges for each group of atoms are obtained by finding the stationary points of the Lagrangian function $Z$. In other words, one must solve

[eq 7]

yielding

[eq 8]

This expression may be written in terms of a matrix equation:

$$\mathbf{A}\mathbf{q} + \boldsymbol{\Lambda} = \mathbf{B}, \qquad \mathbf{1}^{T}\mathbf{q} = q_{total} \quad (9)$$

where the corresponding matrix elements are

[eq 10]

and where $\mathbf{1}^{T}$ is the transpose of the vector (1, 1, ..., 1). The set of net atomic charges for each family of equivalent atoms is easily obtained by solving

[eq 11]

Net Atomic Charges Derived from Molecular Electrostatic Field

The best fit of the charges to the electrostatic field is obtained by finding the minimum of the following quantity:

$$F(q_1, q_2, \ldots, q_{N_{fam}}) = \sum_{p=1}^{N_{pnt}} \left( \mathbf{E}_p^{SCF} - \mathbf{E}_p^{Q} \right)^{T} \left( \mathbf{E}_p^{SCF} - \mathbf{E}_p^{Q} \right) \quad (12)$$

where the upper index $T$ denotes the transpose over the Cartesian components. $\mathbf{E}_p^{SCF}$ denotes the quantum mechanically calculated field at point $p$. This quantity, as well as the SCF molecular electrostatic potential, is readily available from a GAUSSIAN computation. $\mathbf{E}_p^{Q}$ is the electrostatic field truncated at the monopole level:

[eq 13]

Using the same constraint on the total molecular charge, the function to minimize becomes

[eq 14]

The minimum of this function and the corresponding charges are found by solving the system of linear equations:

[eq 15]

yielding

[eq 16]

Following the preceding scheme, this equation may be expressed in a matrix form:

$$\mathbf{A}'\mathbf{q} + \boldsymbol{\Lambda} = \mathbf{B}', \qquad \mathbf{1}^{T}\mathbf{q} = q_{total} \quad (17)$$

which involves the following matrix elements:

$$\Lambda_K = \lambda, \quad \forall K = 1, \ldots, N_{fam} \quad (18)$$

[elements of $\mathbf{A}'$ and $\mathbf{B}'$ in eq 18]

Results and Discussion

As already mentioned, the two most critical parameters in the fitting procedure are the van der Waals envelope and the exclusion radius. Electron cloud penetration is efficiently avoided when the van der Waals radius of each atom is doubled. Furthermore, in order to take into account multipolar effects of higher order than dipoles, it is necessary to choose an optimum distance $r_{max}$. In our application, the largest van der Waals radius being that of sulfur, it appears that $r_{max}$ = 4.0 Å is a fairly good compromise. Tests with larger exclusion radii (viz., 5.0, 6.0, and 10.0 Å) were carried out but did not reveal any significant differences in the charges. The grid step has been set at Δr = 0.5 Å, so that the number of points selected for the fit lies between Npnt = 2263 (alanine side chain) and Npnt = 4434 (arginine side chain). In order to study the influence of the density of points on the net atomic charges, computations have been carried out for the alanine, arginine, cystine, and histidine D side chains using Δr = 0.4 Å, which corresponds roughly to doubling the sampling density (see Table I). Our criterion for the quality of the fit is the rms deviation between the ab initio quantum mechanically calculated and the point-charge-derived electrostatic potentials, or fields:

$$\mathrm{rmsd}_V = \left[ \frac{1}{N_{pnt}} \sum_{p=1}^{N_{pnt}} \left( V_p^{SCF} - V_p^{Q} \right)^2 \right]^{1/2} \quad (19)$$

$$\mathrm{rmsd}_E = \left[ \frac{1}{N_{pnt}} \sum_{p=1}^{N_{pnt}} \left( \mathbf{E}_p^{SCF} - \mathbf{E}_p^{Q} \right)^{T} \left( \mathbf{E}_p^{SCF} - \mathbf{E}_p^{Q} \right) \right]^{1/2} \quad (20)$$

These quantities are given in atomic units. Nevertheless, in order to correlate our results with those available in the literature, rmsd_V should be multiplied by a conversion factor of 627.5095 (hartrees to kcal mol⁻¹) and rmsd_E by a factor of 51.4225 (hartrees bohr⁻¹ to V Å⁻¹). The molecular electrostatic potential- and field-derived net atomic charges of the naturally occurring amino acid side chains (presented in Figure 1) are listed in Table II with their respective rms deviations. In addition, the dipole moment may be interpreted as another criterion of the quality of the fit: the dipole moment calculated from the best possible fit should be fairly close to the SCF value. As shown in Table III, there are no noticeable discrepancies between the SCF dipole moments and those calculated from point charges. In most cases, a low deviation |μ_SCF − μ_V/E| corresponds to a low value of the rmsd (see Table II), which therefore indicates the good quality of the fit.
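The constrained least-squares fit over families of equivalent atoms, as outlined in the potential-derived subsection, can be sketched with a small bordered linear system. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `fit_family_charges` is hypothetical, each q_I is taken to be the charge carried by every atom of family I, and the total-charge constraint is therefore weighted by the family sizes. All quantities are in atomic units.

```python
import numpy as np


def fit_family_charges(points, v_scf, coords, families, q_total=0.0):
    """Fit one charge per family of equivalent atoms to a sampled potential.

    points   : (P, 3) sampling points r_p (bohr)
    v_scf    : (P,) reference potential at the points (hartree per unit charge)
    coords   : (N, 3) nuclear positions (bohr)
    families : list of lists of atom indices; each list is one family
    q_total  : total molecular charge imposed as a constraint
    """
    points = np.asarray(points, dtype=float)
    coords = np.asarray(coords, dtype=float)
    v_scf = np.asarray(v_scf, dtype=float)
    n_fam = len(families)

    # s[p, I] = sum over atoms i of family I of 1 / |r_p - R_i|
    inv_r = 1.0 / np.linalg.norm(points[:, None, :] - coords[None, :, :], axis=-1)
    s = np.stack([inv_r[:, list(fam)].sum(axis=1) for fam in families], axis=1)

    # Assumed constraint: sum_I N_I * q_I = q_total, with N_I the family size.
    n_eq = np.array([len(fam) for fam in families], dtype=float)

    # Stationarity of the Lagrangian gives a bordered (KKT) linear system;
    # the last unknown is the (rescaled) Lagrange multiplier.
    kkt = np.zeros((n_fam + 1, n_fam + 1))
    kkt[:n_fam, :n_fam] = s.T @ s
    kkt[:n_fam, n_fam] = n_eq
    kkt[n_fam, :n_fam] = n_eq
    rhs = np.concatenate([s.T @ v_scf, [q_total]])
    return np.linalg.solve(kkt, rhs)[:n_fam]
```

Because equivalent atoms share a single unknown, the fitted charges respect the molecular symmetry by construction, with no averaging step after the fit.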

Improvements of the Model

Since the best fits correspond to low values of the rmsd, it is possible to predict whether the point-charge model of a given molecule is suitable to mimic its electrostatic potential, or field. As shown in Table II, this model turned out to be fairly well adapted to most systems but appeared to be rather unsatisfactory in some specific cases.²¹,²³,³³,³⁴ The situation is particularly critical for molecules involving lone pairs, which generally lead to significant errors. A low value of the rmsd reflects the ability of the model to describe not only the dipole but also multipole moments of higher orders. It is now quite well-known that atom-centered models are not adequate and reliable for molecules such as water or hydrogen sulfide, which have important quadrupole moments. Various studies have shown that the shift of the oxygen charge, or the inclusion of an off-atom point charge, along the bisector of the water molecule improves the fit substantially. A three-charge model from the derived potential indeed yields a value of the rmsd of 0.779 × 10⁻³ au and a dipole moment of 2.211 D, to be compared with the SCF value of 2.185 D (6-31G**, with experimental geometry). In contrast, a four-charge model (i.e., inclusion of a dummy charge at d_OX = 0.24 Å, along the C2v axis, toward the hydrogens) yields a value of the rmsd of 0.233 × 10⁻³ au and a dipole moment of 2.189 D. Nevertheless, these results may be greatly improved by introducing additional off-atom charges. However, it is more difficult to describe the hydrogen sulfide charge distribution in terms of monopoles because of the higher quadrupolar terms, which generate very large errors in the fitting procedure. The net atomic charge model applied to this molecule yields a value of the rmsd of 1.746 × 10⁻³ au and a dipole moment of 1.478 D, noticeably far from the SCF value of 1.373 D (6-31G**, with experimental geometry). Application of the preceding four-charge model (d_SX = 0.27 Å, along the C2v axis, toward the hydrogens) is much more satisfactory, since the value of the rmsd and the dipole moment reduce to 0.916 × 10⁻³ au and 1.370 D, respectively.

The latter two examples illustrate the relative unreliability of the constrained atom-centered charge models and the resulting necessity of including additional off-atom charges to mimic the electrostatic potential, or field, thoroughly. We therefore improved our method for sulfur-containing amino acid side chains by adding extra charges on these molecules (see Figure 2). However, the scheme described above turned out to be insufficient to lower the value of the rmsd significantly. For this reason, we have chosen to represent the sulfur lone pairs by means of two dummy charges placed in such a way that d_SX ≈ 0.65 Å and ∠XSX ≈ 135.5°. With this model, a significant decrease of the value of the rmsd is observed for the cysteine, cystine, and methionine side chains (see Table IV).

Figure 1. Side chains of the naturally occurring amino acids, with an extra hydrogen atom to create a complete model molecule.

Figure 2. Inclusion of dummy charges X in sulfur-containing amino acid side chains.
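The placement of two lone-pair dummy charges on a divalent sulfur, with an S-X distance of about 0.65 Å and an X-S-X angle of about 135.5°, can be sketched geometrically. The text fixes only the distance and the angle; the orientation used here (dummies symmetric about the plane of the two substituents, tilted away from them along the external bisector) is an assumption made for illustration, and the function name `place_lone_pair_charges` is hypothetical.

```python
import numpy as np


def place_lone_pair_charges(s, a, b, d=0.65, angle_deg=135.5):
    """Return positions of two dummy charges X, X' attached to sulfur.

    s    : (3,) sulfur position
    a, b : (3,) positions of the two atoms bonded to sulfur
    d    : S-X distance (same length unit as the coordinates)
    angle_deg : X-S-X' angle in degrees
    """
    s, a, b = (np.asarray(v, dtype=float) for v in (s, a, b))
    u1 = (a - s) / np.linalg.norm(a - s)
    u2 = (b - s) / np.linalg.norm(b - s)

    # External bisector: points away from both substituents (assumed direction).
    bisector = -(u1 + u2)
    bisector /= np.linalg.norm(bisector)
    # Normal to the A-S-B plane; the dummies sit above and below this plane.
    normal = np.cross(u1, u2)
    normal /= np.linalg.norm(normal)

    half = np.radians(angle_deg) / 2.0
    x1 = s + d * (np.cos(half) * bisector + np.sin(half) * normal)
    x2 = s + d * (np.cos(half) * bisector - np.sin(half) * normal)
    return x1, x2
```

The returned positions can be appended to the list of charge sites before the fit, so that the dummies are treated as one additional family of equivalent centers.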

Conclusions

One of the most problematic remaining issues is the discrepancy between potential-derived and field-derived net atomic charges. These charges should be strictly equivalent if the fit is performed on a continuum. The summation over a discrete series of points is the origin of the discrepancies encountered throughout this study. A low value of the rmsd for both sets of charges usually corresponds to similar charges. However, for large values of the rmsd, the sets of charges turned out to be relatively different. At this stage, one may wonder about the reliability of each set. An attempt to answer this critical question is proposed in Table V, where potential-derived charges have been used to reproduce the electrostatic field and vice versa. The rms deviation was calculated in both cases for each amino acid side chain studied in this paper. In most cases, it seems that the two sets are suitable to describe both the electrostatic potential and the field. This important fact is confirmed by the general trend


TABLE II: Cartesian Coordinatesᵃ and Potential- and Field-Derived Net Atomic Chargesᵇ of Amino Acid Side Chains, Using the Split-Valence 6-31G** Basis Set

X

Z

O.OO0

O.OO0

O.OO0

-0.118 1.019

0.191 -0.309

1.060 -0.200

-3.358 4.209 -3.433 -3.435 -2.057 -2.022 -2.020 -0.834 -0.841 -0.842

1.362 1.569 2.005 1.569 -0.080

-1.337 -1.733 -1.685 -1.733

-1.338 -1.730 -1.679 -1.730

1.154 1.517 1.523 1.517

-1 348 -0.900 0.900 1.848 -2.834 -1.398

1.972 2.822 2.041 2.031 0.67 1 0.614 0.619

H;

Y

-1.891 -2.785 -1.927 -1.927 -0.614 -0.590

0.218 -0.451 0.850 0.848 -0.583 -1.225 -1.224 0.329 0.960 0.959

-0.306 -0.909 0.561 -0.910 0.153

0.000 0.889 0.004 -0.893

-0.001 0.888 0.005 -0.895

0.020 0.523 -0,996 0.521

0.790 -0.491 -0.491 0.790 0.796 1.765

0.042 -0.632 0.689 0.668 -0.750 -1.414 -1.387

0.085 -0.542 0.731 0.732 -0.748 -1.407

-0.001 -0.003 -0.880 0.879 O.Oo0

0.876 -0.877 0.001 -0.881 0.885

O.OO0

0.879 0.000 -0.878 O.OO0

-0.005 -0,488 1.028 -0.482

-0,005 -0,490 1.030 -0.480

O.OO0

-0,886 -0.001 0.886

0.377 -0,492 0.492 -0.377 -0.072 0.252

-0.007 -0.003 0.859 -0.889 0.009 -0,851 0.889

O.OO0

0.000 -0.871 0.870 O.OO0

0.867

4v

QE

-0,561 0.140 0.140

Alanine; N,,,C = 2263 atom -0.570 H3 0.143 H4 0.143 rmsdC(10-3 au)

-0.425 0.138 0.116 0.116 0.203 0.007 0.007 -0,005 0.080 0.080

Arginine; Npnt= 4434 atom -0.455 Nl 0.150 H8 c 4 0.128 0.128 N2 0.133 N3 0.034 H9 0.034 HI0 -0,006 HI I 0.094 HI2 0.094 rmsd au)

-0.598 0.161 0.163 0.161 0.965

Asparagine; N,,, = 2926 atom -0.649 0 0.172 N 0.176 H4 0.172 H5 1.001 rmsd au)

-0.303 0.020 0.036 0.020

Aspartate, 6-31G**; Npnt= 2447 atom -0.438 c2 0.057 0, 0.070 0 2 0.057 rmsd (lo-) au)

-0.340 0.037 0.044 0.037

X

Y

Z

4v

-0.682 -0,219

-0.785 0.904

-0.304 -0.556

0.140 0.140 0.049

0.143 0.143 0.048

0.393 0.28 1 1.623 2.656 1.848 2.527 3.592 2.774 1.102

-0,482 -1.471 -0.01 1 -0.845 1.293 -1.830 -0.510 1.656 1.948

0.001 0.003

-0.488 0.322 0.909 -1.085 -0.902 0.507 0.515 0.466 0.438 0.219

-0.583 0.353 1.009 -1.099 -0.962 0.504 0.512 0.478 0.454 0.184

-0.392 -0.999 -1.961 -0.747

1.309 -0.843 -0,596 -1.801

O.OO0 O.OO0

-0,639 -1.112 0.454 0.445 0.175

-0.648 -1.138 0.459 0.453 0.136

0.216 0.743 0.742

0.000 -1.116 1.117

-0.012 0.003 0.003

0.935 -0,854 -0.854 0.325

0.965 -0.856 -0,856 0.195

O.OO0

-1.116 1.117

-0,010 0.002 0.002

1.075 -0.927 -0,927 0.416

1.110 -0.928 -0.928 0.260

Aspartate, 6-31++G**; N,,, = 2447 atom 0.209 -0.505 c 2 0.745 0.082 01 0.744 0.088 0 2 0.082 rmsd ( au)

O.Oo0

-0.001 -0.OO0

-0,002 -0.001 -0,002 0.002

0.OOO O.OO0

4E

-0.361 0.132 0.206 0.132

Cysteine; N,,,= 2234 atom -0.394 S 0.146 H4 0.207 rmsd au) 0.146

-0.660 -0.922

-0.087 1.215

0.000 0.000

-0.323 0.215 1.056

-0.313 0.209 0.633

-0.488 -0.085 -0.085 -0.488 0.214 0.182

Cystine; NPt = 2824 atom -0.514 H3 -0.080 H4 -0,080 H5 -0.514 H6 0.218 rmsd (10-3 au) 0.190

-1.933 2.834 1.398 1.933

0.554 0.796 1.765 0.554

1.428 0.072 -0.252 -1.428

0.177 0.214 0.182 0.177 0.803

0.186 0.218 0.190 0.186 0.529

Glutamine; N,,, = 3428 atom -0.325 c3 0.085 0 0.094 N 0.095 H6 -0.024 H7 0.038 rmsd au) 0.034

-0,561 -0.505 -1.743 -1.797 -2.582

0.139 1.336 -0.527 -1.515 0.005

0.001 0.005 -0.014 0.023 0.008

0.824 -0.623 -1.117 0.429 0.466 0.234

0.862 -0,624 -1.139 0.434 0.471 0.180

Glutamate, 6-31G**; N,,, = 2973 atom -0.246 H5 -0.590 -0.001 c 3 0.709 0, 1.733 0.050 0 2 0.592 0.050 0.077 rmsd 10-3 aul -0.044

-1.408 0.071 -0.619 1.299

-0.866

-0.085 0.812 -0,852 -0.824 0.322

-0,044 0.830 -0.855 -0.817 0.232

-0,244 0.057 0.067 0.067 0.07 1 0.004 O.OO0

-0.109 -0.043 0.006 0.006 0.175 -0.085

O.OO0 O.Oo0 O.OO0

TABLE II (Continued)

X

Y

Z

X

Y

z

-0.591 0.706 1.731 0.608

-1.399 0.068 -0.627 1.301

-0.866

-0.693 0.205 0.205 0.188 0.229 -0.478 0.247

Histidine D,N,,, = 3295 atom -0.655 N2 0.193 c4 0.193 H4 0.180 HS 0.237 H6 -0.475 rmsd au) 0.226

1.519 0.21 1 -0.102 2.300 -0.041

0.702 1.123 -1.984 -1.247 2.164

o.OO0

-0.902 0.254 0.254 0.210 0.549 -0.601 0.250

Histidine E N,,, = 3346 atom -0.803 N2 0.224 c 4 0.224 H4 0.185 H5 0.575 H6 -0.604 rmsd au) 0.215

-1.452 -0.176 -2.179 0.020 -2.300

0.584 1.097 -1.409 2.147 1.099

-0.564 0.199 0.199 0.213 0.259 -0.199 0.009

Histidine (Cation); NPt = 3541 atom -0.603 N2 0.208 c 4 0.208 H4 0.219 HS 0.303 H6 -0.258 Hl 0.050 rmsd (lo-’ au)

-1.453 -0.169 0.200 -2.229 0.016 -2.298

0.178 -0.056 -0.043 -0.199 0.038 0.042 0.039 0.178

Isoleucine; N,,, = 3656 atom 0.078 H6 -0.014 Hl 0.002 c 4 -0.264 H8 0.067 H9 0.065 H1o 0.067 rmsd au) 0.078

-0.581 0.132 0.132 0.144 0.551 -0.035 -0.581 0.132

Leucine; Npnt= 3596 atom -0,581 H6 0.132 Hl 0.132 c 4 0.145 H8 0.550 H9 -0.032 HI0 -0.58 1 rmsd ( au) 0.132

4v

4F

QV

UP

-0.104 0.899 -0.919 -0.887 0.416

-0,050

0.933 -0.927 -0.881 0.303

-0.573 0.046 0.380 0.121 0.123 0.391

4,559 0.026 0.380 0.128 0.126 0.303

O.OO0 O.OO0 O.OO0 O.OO0 O.OO0

-0).471 -0.242 0.123 0.185 0.390 0.385

-0.414 -0.315 0.134 0.202 0.378 0.278

0.608 1.126 -1.990 -1.368 2.178 1.139

O.OO0 O.OO0 O.OO0 O.OO0 O.OO0 O.OO0

-0.188 -0.228 0.379 0.263 0.259 0.398 0.204

-0.209 -0.227 0.395 0.255 0.258 0.401 0.171

-0.615 -1.197 -1.584 -2.582 -1.683 -1.183

0.624 1.548 -0.554 -0.460 -0.590 -1.507

1.394 0.039 -0.124 0.295 -1.206 0.204

-0.043 -0,056 -0,199 0.043 0.039 0.038 0.331

0.002 -0.014 -0.264 0.065 0.067 0.067 0.241

2.153 1.293 -1.279 -1.324 -1.330 -2.164

-0.285 -0.790 -0.697 -1.721 -0.726 -0,180

-0.263 1.182 0.096 -0.264 1.182 -0.263

0.132 0.144 -0.581 0.132 0.144 0.132 0.108

0.132 0.145 -0.581 0.132 0.145 0.132 0.081

-0.039 1.200 1.267 1.267 2.452 2.483 3.292 2.484

-0.967 0.552 1.176 1.175 -0,308 -0.906 0.252 -0.906

-0.876 O.OO0 -0.880 0.88 1

-0.016 0.350 -0.010 -0.010 -0.433 0.320 0.343 0.320 0.436

0.063 0.220 0.06 1 0.061 -0.627 0.386 0.399 0.386 0.286

-0.507 -1.975 -2.008

-0.628 0.429 1.052

O.OO0 O.OO0

-0.143 -0.643 0.199

-0.179 4.575 0.184

G.lutamate, 6-31G++G**; Npnt= 2978 -1.899 -2.783 -1.943 -1.943 -0.6 18 -0,592

-2.126 -2.494 -2.494 -2.555 -0.635 0.185 1.460

2.143 2.513 2.512 2.558 0.648 -0.096 -1.329

2.170 2.531 2.531 2.585 0.678 -0.126 -1.394

0.701 1.197 0.615 1.584 1.183 2.582 1.682 -0.701

0.036 -0.829 0.926 0.037 O.OO0 O.OO0 1.244 1.238

-2.585 -3.468 -2.630 -2.630 -1.321 -1.321 -1.321 -0.045

-0.039

Hi

2.177 2.953 2.314

0.088 -0.552 0.732 0.734 -0.740 -1.397

-0.025 -0.548 -0.548 0.969 0.070 -1.034 -0.583

-0,015 -0.539 -0.539 0.987 0.024 -1.133 -0.756

0.009 -0.509 -0,509 1.008 0.085 -1.047 -0.699

0.618 1.547 0.624 -0.554 -1.507 -0.460 -0.590 0.618

1.456 2.007 1.964 1.515 O.OO0 O.OO0

-0.759 -1.784

-0.329 0.298 -0.963 -0.966 0.529 1.177 1.178 -0.322 -0.967

-0.027 0.732 -0.648

0.001 O.OO0

-0,873 0.870 0.001 0.869

O.OO0 0.879 -0.879 O.OO0 O.OO0 O.OO0 O.OO0

O.OO0

-0.875 0.875 O.OO0 O.OO0 O.OO0 O.OO0

0.000 -0.881 0.88 1 O.OO0 O.OO0 O.OO0 O.OO0

-0.309 -0.039 -1.394 0.124 -0.204 -0.295 1.206 0.309

0.096 -0.263 -0.263 1.182 -0.373 -1.463 0.096 -0.264

O.OO0

-0,001 0.879 -0,877 O.OO0

-0.873 0.872 -0.OO0

0.876

O.OO0 O.OO0

-0.878

-0.046 -0.063 -0.012 -0.012 0.248 -0,104

-0.222 -0.007 0.044 0.044 0.1 17

atom H5 c 3

01 02

rmsd (10-j au)

-0.032 0.067 -0.016

-0.338 0.104 0.124

Methionine; Npnt= 3368 atom -0.461 S 0.129 C, 0.157 H’,

-0.032

-0,001 O.OO0

-0.050

Lysine; N,,, = 4157 atom -0.324 H1 0.120 c 4 0.094 HS 0.094 H9 0.086 N 0.018 HI0 0.018 HI I -0.122 H12 0.063 rmsd au)

-0.263 0.100 0.067 0.067 0.178

O.OO0

O.OO0 O.OO0

0.001 O.OO0

O.OO0

-0.814 O.OO0

0.8 13

0.885

TABLE II (Continued)

X

2.313 0.801 0.685 0.685

-2.416 -2.792 -2.816 -2.816 -0.905 -0.192 -0,725 1.192

0.712 1.506 2.033 0.918 -0,055 -1.362 -1.900 -1.902 0.143

0.655 1.014 1.078 1.015

0.090 0.129 0.143 1.222 1.275

-2.631 -2.534 -3.635 -2.534 -1.603 -1.820 -0.627

0.400 -0.165 -2.750

2.880 3.275 3.271 3.275 1.369 0.639 1.153 -0.750 -1.292

O.OO0 O.OO0 O.OO0

1.273 1.317 2.162

Y

Z

-0.648 0.633 1.260 1.260

0.879 O.OO0 0.877 -0.877

O.OO0 -0,040 -0.857 0.896 0.001 -1.193 -2.129 -1.196

-1.074 -0.939 0.493 1.250 0.147 0.171 -1.039 1.149 -1.894

-0.020 -0.542 0.976 -0.545

0.552 1.272 1.113 -0.262 -0,869

-1.307 -1.943 -0.895 -1.943 -0.216 1.115 1.801 0.897 -0.385 1.647

0.015 -0.491 1.026 -0.493 0.014 1.190 2.136 1.184 2.116

0.585 1.238 1.238 -0.260 -0.901 0.364

0.008 1.028 -0.521 -0.453 -0.010 -0.008 -0.015 0.002

0.654 -0.556 -0.445 0.314 0.781 -0.008 -0.153 -0.412 0.671

O.OO0

0.884 -0.001 -0.882

0.045 -0.764 0.977 -0.109 0.610

O.OO0 -0.876 O.OO0 0.876 O.OO0 O.OO0 O.OO0 O.OO0 O.OO0 O.OO0

O.OO0 0.876 O.OO0 -0.875 O.OO0 O.OO0

O.OO0 O.OO0 O.OO0

O.OO0

-0.870 0.870 O.OO0

-0.876 -0,002

4v 0.124 -0.204 0.158 0.158

qE 0.157 -0,108 0.135 0.135

-0.559 0.153 0.148 0.148 0.365 -0.298 0.163 -0.112

Phenylalanine; NPlt= 3630 atom -0.570 HS 0.154 CS 0.151 H6 0.151 c6 0.384 H7 -0.317 c 7 0.166 H8 -0.097 rmsd ( au)

-0.667 0.145 -0.070 -0.049 0.057 0.755 -0.640 -0.604 0.334

Proline; Nplt= 3924 atom -0.7 19 H2 0.107 H3 -0.112 H4 -0.083 HS 0.110 H6 0.749 H7 -0.654 H8 -0.604 H9 0.360 rmsd ( au)

0.278 -0.027 0.036 -0.027

Serine; Npnt= 2464 atom 0 0.151 0.007 H4 0.07 1 rmsd (lo-’ au) 0.007

0.497 -0.012 -0.082 -0,736 0.423

Threonine; NPll= 2964 atom c2 0.409 0.026 H4 -0.044 H5 -0.722 H6 0.433 rmsd au)

-0.584 0.165 0.168 0.165 0.063 -0.154 -0.536 0.198 0.153 0.194

Tryptophan; Npnl= 4203 atom -0.576 H5 0.163 c6 0.165 H6 0.163 Cl 0.086 H7 -0.170 C8 -0,519 H8 0.226 c 9 0.094 H 9 0.196 rmsd (lo-’ au)

-0,576 0.158 0.148 0.158 0.288 -0.198 0.166 -0.407 0.188

Tyrosine; Npn,= 3800 atom -0,569 CS 0 0.156 0.147 H6 0.156 c6 0.290 Hl Cl -0.202 0.167 H8 -0.402 rmsd (1W3au) 0.184

0.385 -0.081 -0.08 1 -0.249 0.048 0.041

Valine; Npnl= 3252 atom 0.289 H5 c3 -0.034 -0.034 H6 -0.360 Hl 0.087 H8 0.074 mud (1W au)

H, H8

rmsd (

X

Y

Z

4v

4E

-2.840 -2.008

-0.221 1.OS2

O.OO0 -0.885

0.260 0.199 0.587

0.241 0.184 0.416

1.724 1.892 2.968 1.194 1.727 -0.191 -0.723

-2.132 -0.001 1.195 2.130 1.194 2.130

0.002 0.007 0.012 0.002 0.002 -0,008 -0.015

0.145 -0.201 0.152 -0,112 0.145 -0.298 0.163 0.105

0.142 -0.213 0.153 -0.097 0.142 -0.317 0.166 0.087

2.296 0.91 1 2.246 2.951 1.320 -0.360 -2.728

-1.681 -1 .OS8 0.929 0.501 1.973 1.780 0.293 -0.942

-0.580 -1.465 -1.414 0.131 -0.305 1.168 1.814 -0.605

0.028 0.017 0.025 0.052 0.050 0.039 0.073 0.454 0.582

0.054 0.028 0.045 0.072 0.058 0.053 0.072 0.462 0.356

-0,737 -1.139

0.120 -0.732

O.OO0 O.OO0

-0.685 0.426 0.542

-0.665 0.429 0.319

-1.208 -1.270 -2.066 -1.274

-0.237 -0.955 0.425 -0.781

-0.019 0.795 0.058 -0.955

-0.226 0.031 0.060 0.445

-0.370 0.086 0.076 0.106 0.273

-0,532 0.676 0.265 2.038 2.698 2.584 3.653 1.780 2.202

2.786 -1.501 -2.496 -1.312 -2.161 -0.019 0.101 1.097 2.086

O.OO0 O.OO0 O.OO0 O.OO0 O.OO0 O.OO0 O.OO0 O.OO0 O.OO0

0.410 -0.263 0.177 -0,207 0.156 -0.142 0.150 -0.28 1 0.169 0.121

0.402 -0.232 0.172 -0.217 0.155 -0.123 0.145 -0,301 0.172 0.119

-1.433 -2.784 -3.168 -0.722 -1.264 0.656 1.193

-0,017 -0.093 0.768 -1.211 -2.139 -1.185 -2.119

O.OO0 O.OO0 O.OO0 O.OO0 O.OO0 O.OO0

0.434 -0,638 0.449 -0.264 0.192 -0.278 0.179 0.115

0.442 -0.645 0.452 -0.262 0.193 -0.290 0.183 0.106

1.318 -1.273 -1.317 -1.318 -2.162

-0.899 -0.260 -0.901 -0.899 0.364

0.048 -0.249 0.048 0.048 0.041 0.307

0.087 -0.360 0.087 0.087 0.074 0.214


au)

0.406

O.OO0

O.OO0

0.878 O.OO0

-0.876 0.878 -0.002

0.044

ᵃ In angstroms. ᵇ In electronic charge units (ecu). ᶜ Number of points requested for the fitting procedure. ᵈ Indices given in Figure 1. ᵉ Root mean square deviation between the ab initio computed and the point-charge-derived electrostatic potentials and fields.
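The rms deviations in Table II are tabulated in atomic units. The conversion quoted in Results and Discussion (627.5095 for the potential, 51.4225 for the field) can be applied as in the short sketch below. The helper names are hypothetical; the two alanine values used in the example are the rmsd entries listed for that side chain (0.049 and 0.048, both in 10⁻³ au).

```python
# Conversion factors quoted in the text.
HARTREE_TO_KCAL_PER_MOL = 627.5095               # potential: hartree -> kcal/mol
HARTREE_PER_BOHR_TO_V_PER_ANGSTROM = 51.4225     # field: hartree/bohr -> V/angstrom


def rmsd_potential_kcal_per_mol(rmsd_au):
    """Convert a potential rmsd from atomic units to kcal/mol."""
    return rmsd_au * HARTREE_TO_KCAL_PER_MOL


def rmsd_field_v_per_angstrom(rmsd_au):
    """Convert a field rmsd from atomic units to V/angstrom."""
    return rmsd_au * HARTREE_PER_BOHR_TO_V_PER_ANGSTROM


# Alanine side chain: tabulated values are in units of 10^-3 au.
alanine_rmsd_v = rmsd_potential_kcal_per_mol(0.049e-3)
alanine_rmsd_e = rmsd_field_v_per_angstrom(0.048e-3)
print(f"alanine rmsd_V = {alanine_rmsd_v:.4f} kcal/mol")
print(f"alanine rmsd_E = {alanine_rmsd_e:.5f} V/angstrom")
```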

TABLE III: Dipole Moments (in D) of Amino Acid Side Chains from Potential- and Field-Derived Net Atomic Charges, Using the Split-Valence 6-31G** Basis Set. Comparison with SCF and Mulliken Values

side chain                 μ_V      μ_E      μ_SCF    μ_Mulliken
arginine                  5.782    5.769    5.783    4.904
asparagine                4.064    4.069    4.061    4.025
aspartate, 6-31G**        3.799    3.805    3.793    3.810
aspartate, 6-31++G**      4.348    4.356    4.335    4.759
cysteine                  1.812    1.846    1.783    0.540
cystine                   2.362    2.351    2.391    0.698
glutamine                 3.888    3.884    3.891    3.813
glutamate, 6-31G**        5.509    5.542    5.494    5.463
glutamate, 6-31++G**      6.149    6.181    6.133    6.696
histidine D               4.037    4.039    4.040    3.211
histidine E               3.538    3.530    3.537    2.480
histidine (cation)        3.025    3.026    3.025    1.699
isoleucine                0.057    0.047    0.069    0.046
leucine                   0.093    0.098    0.090    0.006
lysine                    9.067    9.073    9.071    8.085
methionine                1.780    1.801    1.800    0.218
phenylalanine             0.290    0.288    0.292    0.393
proline                   1.455    1.434    1.466    1.398
serine                    1.837    1.852    1.835    2.399
threonine                 1.792    1.798    1.793    2.388
tryptophan                1.893    1.885    1.893    0.762
tyrosine                  1.435    1.429    1.439    2.398
valine                    0.055    0.057    0.060    0.021

TABLE IV: Potential- and Field-Derived Net Atomicᵃ Charges and Dipole Moments of Three Sulfur-Containing Amino Acid Side Chains Including Off-Atom Point Charges

Cysteine
atomᵇ     qV        qF
C         -0.617    -0.608
H1         0.167     0.164
H2         0.211     0.206
H3         0.167     0.164
S          1.935     2.008
H4         0.043     0.036
X         -0.953    -0.984
X'        -0.953    -0.984
rmsdᶜ      0.217     0.313
μ (D)      1.770     1.769

Cystine
atomᵇ     qV        qF
C1        -0.576    -0.585
S1         1.668     1.665
S2         1.668     1.665
C2        -0.576    -0.585
H1         0.185     0.188
H2         0.148     0.150
H3         0.160     0.163
H4         0.185     0.188
H5         0.148     0.150
H6         0.160     0.163
X1        -0.806    -0.803
X1'       -0.779    -0.778
X2        -0.806    -0.803
X2'       -0.779    -0.778
rmsdᶜ      0.170     0.134
μ (D)      2.387     2.378

Methionine
atomᵇ     qV        qF
C1        -0.281    -0.347
H1         0.075     0.096
H2         0.081     0.106
H3         0.081     0.106
C2         0.119     0.036
H4         0.020    -0.011
H5        -0.011     0.020
S          2.050     2.034
C3        -0.697    -0.674
H6         0.166     0.161
H7         0.224     0.217
H8         0.166     0.161
X         -0.981    -0.968
X'        -0.981    -0.968
rmsdᶜ      0.260     0.196
μ (D)      1.787     1.785

ᵃ In electronic charge units (ecu). ᵇ Indices given in Figure 2. ᶜ Root mean square deviation between the ab initio computed and the point charge derived electrostatic potentials and fields.
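One simple consistency check on Table IV is that the atomic and off-atom charges of a neutral side chain should sum to approximately zero. The sketch below does this for cysteine and cystine, using the charges as read from the table; the dictionary layout, the function name `net_charge`, and the rounding tolerance are illustrative assumptions, not part of the published fitting procedure.

```python
# Charges as read from Table IV, in ecu: (potential-derived qV, field-derived qF).
# X, X' (and X1, X1', X2, X2') are the off-atom point charges.
cysteine = {
    "C": (-0.617, -0.608), "H1": (0.167, 0.164), "H2": (0.211, 0.206),
    "H3": (0.167, 0.164), "S": (1.935, 2.008), "H4": (0.043, 0.036),
    "X": (-0.953, -0.984), "X'": (-0.953, -0.984),
}
cystine = {
    "C1": (-0.576, -0.585), "S1": (1.668, 1.665), "S2": (1.668, 1.665),
    "C2": (-0.576, -0.585), "H1": (0.185, 0.188), "H2": (0.148, 0.150),
    "H3": (0.160, 0.163), "H4": (0.185, 0.188), "H5": (0.148, 0.150),
    "H6": (0.160, 0.163), "X1": (-0.806, -0.803), "X1'": (-0.779, -0.778),
    "X2": (-0.806, -0.803), "X2'": (-0.779, -0.778),
}

QV, QF = 0, 1  # column indices into each (qV, qF) pair


def net_charge(charges, column):
    """Sum one charge column (QV or QF) over all sites, off-atom sites included."""
    return sum(pair[column] for pair in charges.values())


TOLERANCE = 0.005  # assumed allowance for three-decimal rounding in the table

for name, table in (("cysteine", cysteine), ("cystine", cystine)):
    for label, column in (("qV", QV), ("qF", QF)):
        total = net_charge(table, column)
        status = "neutral" if abs(total) < TOLERANCE else "check transcription"
        print(f"{name:9s} {label}: net charge = {total:+.3f} ({status})")
```

In both molecules the large positive charge on sulfur is balanced almost entirely by the negative off-atom sites, which is why the totals remain close to zero.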

of the dipole moment calculated from potential-, or field-, derived charges to be reasonably close to the corresponding SCF value. But the most critical aspect of the present survey concerns the charge transferability from one side chain to another structurally similar one, viz., asparagine-glutamine, serine-threonine, or valine-isoleucine. It seems that the transferability is fairly acceptable when the effects of low-order multipolar moments are predominant. This phenomenon is well illustrated in molecules like asparagine or glutamine side chains, where the charges of the CONH2 fragment are almost identical in both cases. However,

TABLE V: Comparison between Potential-Derived and Field-Derived Net Atomic Charges of Amino Acid Side Chains by Means of an Analysis of Their Respective Root Mean Square Deviations

side chain            P_Vᵃ (%)   P_Eᵇ (%)
alanine               2.04       0.00
arginine              13.70      7.60
asparagine            4.00       2.21
aspartate             5.85       7.18
cysteine              1.61       0.79
cystine               0.62       0.19
glutamine             11.11      7.22
glutamate             14.91      10.78
histidine D           1.02       0.33
histidine E           2.60       2.16
histidine (cation)    3.43       2.34
isoleucine            7.85       7.05
leucine               2.78       1.23
lysine                16.97      14.69
methionine            4.94       2.64
phenylalanine         2.86       1.15
proline               8.08       5.06
serine                3.51       3.45
threonine             7.42       7.69
tryptophan            7.44       2.52
tyrosine              5.22       1.89
valine                11.73      11.68

ᵃ $P_V = 100 \times |\mathrm{rmsd}_V(q_V) - \mathrm{rmsd}_V(q_F)| / \mathrm{rmsd}_V(q_V)$. ᵇ $P_E = 100 \times |\mathrm{rmsd}_E(q_F) - \mathrm{rmsd}_E(q_V)| / \mathrm{rmsd}_E(q_F)$.

the charges of molecules involving small electrostatic potentials, or fields, viz., valine or isoleucine, are practically not transferable. In such systems, the contribution of low-order multipolar moments is so weak that the description of the potential, or the field, at the monopole level is quite unsatisfactory. The derivation of net atomic charges from these quantities generally leads to very significant errors, therefore indicating the relative inadequacy of the present method for aliphatic side chains.
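The dipole moments compared with the SCF values are those of the fitted point charges, $\boldsymbol{\mu} = \sum_i q_i \mathbf{r}_i$. A minimal sketch of that evaluation follows, assuming charges in units of e and coordinates in angstroms. The two-site model and the reference value are hypothetical and are not taken from the tables; the function names are illustrative only.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.80320  # 1 e·Å expressed in debye


def dipole_from_point_charges(charges, coords):
    """Return (vector, magnitude) of the point-charge dipole moment in debye.

    charges: charges in units of e (atomic and off-atom sites alike)
    coords:  matching (x, y, z) positions in angstroms
    For a neutral set of charges the result does not depend on the origin.
    """
    mu = [0.0, 0.0, 0.0]
    for q, (x, y, z) in zip(charges, coords):
        mu[0] += q * x
        mu[1] += q * y
        mu[2] += q * z
    mu = [component * E_ANGSTROM_TO_DEBYE for component in mu]
    return mu, math.sqrt(sum(component * component for component in mu))


def percent_deviation(value, reference):
    """Relative deviation of a fitted value from a reference value, in percent."""
    return 100.0 * abs(value - reference) / abs(reference)


# Hypothetical neutral two-site model: +0.5 e and -0.5 e separated by 1 Å.
charges = [0.5, -0.5]
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
mu_vec, mu = dipole_from_point_charges(charges, coords)

mu_reference = 2.5  # hypothetical reference dipole (D), standing in for an SCF value
print(f"point-charge dipole: {mu:.3f} D")
print(f"deviation from reference: {percent_deviation(mu, mu_reference):.2f} %")
```

When the molecular dipole is itself very small, as for the aliphatic side chains, a small absolute error in the fitted charges translates into a large relative deviation, which is consistent with the poor transferability noted above.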

Acknowledgment. C.C. is indebted to Roussel Uclaf Company (France) for his doctoral fellowship. The authors wish to thank Drs. J. Ángyán, F. Colonna, G. Ferenczy, and C. Millot for many stimulating discussions. The computations presented in this paper were performed on the IBM-3090/600 of the Centre Inter-Régional de Calcul Électronique, Orsay (France). We thank the Groupement Scientifique "Modélisation Moléculaire" IBM-CNRS for providing computer facilities. This work was also supported by the U.S. National Science Foundation (Grants DMB-90-15815 and INT-91-15638).


J. Phys. Chem. 1992,96, 10284-10289


Potential Energy Surface of Borazirene (HCNBH) Ivan Černušák,† Sullivan Beck, and Rodney J. Bartlett* Quantum Theory Project, University of Florida, Gainesville, Florida 32611 (Received: April 6, 1992)

A study of the potential energy surface (PES) of general formula CH2BN is presented. A gradient HF/DZP geometry search yields 11 stationary points on the PES: six minima and five transition states. The most interesting feature of the PES is the presence of a stable, potentially aromatic ring structure of the form HCNBH, analogous to the isoelectronic cyclopropenyl cation. No experimental evidence for the ring structure has been reported. Three isomerization reaction channels were investigated: the first includes BH2 migration, the second and third BH migration. For the stationary points the energies were evaluated at the MBPT(4) and CCSD + T(CCSD) levels. The barriers for the H2BCN (H2BNC) isomerization from the C2v to the ring Cs structures are considerably higher than those corresponding to HCN + BH (HNC + BH) addition, suggesting possible ring formation via a two-step addition reaction. Infrared spectra for the critical points and correlated reaction profiles for various reaction channels are presented, including an MBPT(2) prediction for the HCNBH ring.

Introduction

We have presented a series of studies directed at the identification and characterization of metastable nitrogen-containing species. Previously, we have considered N3H3¹ and N4 and N8.² In this paper we focus on the potential energy surface and particularly the potential cyclic species, HBNCH.

C3H3+ and C3H2 are the simplest examples of molecules that satisfy the Hückel 4n + 2 rule of aromaticity. Boron and nitrogen, as the nearest neighbors of carbon in the periodic table, offer two possible functional groups isoelectronic with HC: HB⁻ and N. From these, one can in principle create analogues by substituting these groups into the carbon cation skeleton. The stability of the resulting rings clearly depends upon the balance of various factors: ring strain, lone pair repulsion, the presence of electron-withdrawing/donating substituents, and the molecule's potential aromatic stability.

HBNCH is an interesting candidate for such a system. Its synthesis would likely involve the cyanoborane H2BCN or its isocyanoborane isomer (H2BNC). Only limited experimental information on related structures is available. Other three-membered aromatic rings containing solely boron or nitrogen as heteroatoms were studied more extensively by ab initio methods. To our knowledge there is no such study for the isomers of cyanoborane.

Our interest in this class of compounds pertains to their stability or metastability. Krogh-Jespersen et al. estimated the resonance energy for the related cyclopropenium (CH)2N+ ion to be 290.8 kJ/mol and for borirene (CH)2BH to be 197.1 kJ/mol, compared to the resonance energy of benzene, 153.1 kJ/mol. Harland et al.⁸ also found the cyclic form of C2NH2+ to be more stable than its linear isomer. These findings support the assumption that the cyclic isomer of cyanoborane, borazirene (HCNBH), will be a stable bound species. The main contribution to its thermodynamic stability can be expected from the delocalization of the two π-electrons donated from p orbitals of the C and N atoms to the vacant p orbital of the boron atom. Unlike HCNBH, any other three-membered rings like HNNBH or HNNCH clearly have unstable formal structures.

Another point of interest is the kinetic stability of the three-membered ring. This property is related to the barriers connecting the stationary points on the PES. The relative height of the barriers will be a critical factor when considering the possibility of the synthesis of borazirene.

The present investigation is a continuation of the MBPT/CC studies on isomerization reactions of HCNBH3 and HNCBH3.⁹,¹⁰


Calculations

A. Methods. The MBPT and CC methods used in this paper were described in detail elsewhere (see, e.g., ref 11). In the following we limit ourselves to a brief definition of some energy terms used throughout this paper. All calculations were performed by using the ACES I¹² and ACES II¹³ program systems. The investigations of the PES consisted of SCF gradient geometry optimizations and subsequent higher level single-point energy calculations for the minima and transition states. The geometry optimizations were done by using the JODA

† Permanent address: Department of Physical Chemistry, Faculty of Science, Comenius University, Mlynská Dolina CH1, 84215 Bratislava, Czech and Slovak Federative Republic.

0022-3654/92/2096-10284$03.00/0 © 1992 American Chemical Society
