A revised empirical potential for conformational ... - ACS Publications

DAVID L. BEVERIDGE , MIHALY MEZEI , PREM K. MEHROTRA , FRANCIS T. MARCHESE , GANESAN RAVI-SHANKER , THIRUMALAI VASU , and S...
0 downloads 0 Views 1MB Size
Description of Revised EPEN Model

The Journal of Physical Chemktry, Vol. 82, No. 23, 1978 2497

A Revised Empirical Potential for Conformational, Intermolecular, and Solvation Studies. 1. Evaluation of Problem and Description of Model' Joseph Snir, Raphael A. Nemenoff,?' and Harold A. Scheraga*2b Department of Chemistiy, Cornell University, Ithaca, New York 14853, and Biophysics Department, WeizmannInstitute of Science, Rehovoth, Israel (Received December 2 1, 1977; Revised Manuscript Received August 9, 1978)

An empirical potential based on the interactions of electrons and nuclei (EPEN), and ultimately intended for use in the study of conformations of polypeptides in aqueous solutions, had been developed for saturated molecules. While successful for treating hydrogen-bonded configurations of water molecules, it leads to unphysical results for hitherto neglected high-energy repulsive configurations of water molecules. This failure arises because the potential had been parameterized on the set of low-energy water molecule configurations. In order to obtain a potential that can be used to treat water and organic molecules in a consistent way, the EPEN model has been changed by introducing a Buckingham potential between electrons,and by distinguishingbetween different "kinds" of electrons. In the parameterization, both the positions of the electrons and the parameters for the nonbonded potential are optimized to consistency with a set of experimental data. The quality of the experimental data used for this purpose is evaluated. The revised EPEN model (termed EPEN/B) can treat intramolecular and intermolecular interactions in a wide range of compounds. It has been extended to unsaturated hydrocarbons, carbonyl compounds, carboxylic acids, amides, amino acids, and peptides, and can be applied to aqueous solutions. By taking account of the conformational preferences of the molecules (in the locations of the electrons), the need for additional special functions such as torsional potentials is eliminated.

I. Introduction The use of empirically parameterized potential energy functions is widespread in physicochemical studies, involving calculations, e.g., of crystal structure^,^-^ vibrational spectra of single moleculesg and crystals,1° stable conformations of small and large m ~ l e c u l e s , ~ 'and - ~ ~the structures of liquids.l"lS In all these problems, one seeks an analytical representation of the energy of interaction between atoms within one molecule, or in different molecules. Because of the complexity of the systems involved, this analytical function cannot be derived from first principles, and one has to resort t o empirically parameterized potential functions. Most of the potential energy functions that are in use currently are atom-centered and based on one kind of experimental property. Limited attempts have been made t o combine several different types of properties in one p ~ t e n t i a l , ~but J ~this has been done only for a small class of systems. The ECEPP potentiallg (empirical conformational energy program for peptides), developed in this laboratory, was an attempt to combine data from crystal structure studies and conformational analysis. In order to study the conformational properties of organic molecules and polypeptides in aqueous solution it was necessary to develop a new potential, since the atomcentered ECEPP (though successful for nonaqueous systems) was unable to reproduce the structural and thermodynamic properties of liquid water. An empirical potential based on the interactions of electrons and nuclei (EPEN)20-22was, therefore, formulated (and applied initially to saturated molecules) in order to treat liquid water, conformational analysis, and crystal structure calculations with one model. However, because of the failure of EPEN in certain critical areas, as will be discussed here, it was necessary to revise and reparameterize this potential. Changes have been made in the EPEN model, and its range of applicability has been extended to unsaturated systems; therefore, we are presenting this work as an independent series of papers, and refer to the revised model as EPENIB. 0022-385417812082-2497$0 1.0010

During the analysis of the difficulties inherent in EPEN, many insights were gained into the nature of the problems and the limitations in the use of an empirical parameterized potential. The main aim of this introductory paper is to summarize these points. Section I1 includes a general discussion of the problems involved in deriving an empirical parameterized potential, and an analysis of the reasons for the difficulties with EPEN. Section I11 presents an evaluation of the sources of experimental data used in the parameterization. Section IV gives a description of the systems studied and outlines the structure of this series of papers. The general procedure for parameterizing and testing the potential is described in section V. In section VI, the revised model and analytical form of the function are presented. The last section includes a summary and conclusions. 11. General Discussion The use of empirical parameterized potentials is an attempt to approximate the Born-Oppenheimer energy surface for a molecule or system of molecules by an analytical function. The total energy can be written as

where EK is the kinetic energy, and V is the potential energy; p and q are vectors specifying the momenta and coordinates of all the nuclei and electrons in the system. Although, in principle, values of E can be obtained by solving the Schrodinger equation, this is not feasible for many-electron systems of current interest. In a classical statistical mechanical treatment of an ensemble of molecules containing atoms of given masses, integration over the momenta makes the kinetic energy a function only of temperature. At a given temperature, the kinetic energy may be assumed to be constant, independent of conformation or of interaction with other molecules. Differences in energy between different conformations, or between different molecular arrangements, arise entirely from differences in the potential V. 0 1978 American Chemical Society

2498

The Journal of Physical Chemistry, Vol. 82, No. 23, 1978

J. Snir, R. A. Nemenoff, and H. A. Scheraga

4.0

O

H

k

H

/

Figure 1. Configuration of water molecules for which EPEN fails to reproduce correct energy. R, , is the distance between oxygen nuclei, and Rp,.+ is the distance between the centers of gravity of the bonding and nonbonding electrons of each molecule.

Experimental data provide values of V for particular values of q. The aim of an empirically parameterized potential is to construct a model that will reproduce these values of V, and by induction be able to predict values of V for other values of q, and for related systems. The underlying assumption is made that a set of values of V can be traced to a series of additive specific interactions and that, if these interactions are described correctly by the analytical model, the function should be able to reproduce experimental data not used in the parameterization, where similar interactions are involved; Le., it is necessary that the function be applicable to new systems (transferability of the model is required). Additional assumptions are made in order to facilitate calculation of the energy, or because a particular potential function is designed to reproduce only a particular physical property. The dilemma faced in developing a new empirical potential is to find as simple a description as possible of a very complicated system. A limitation on the simplicity of the potential function is dictated by the nature of the systems and interactions for which the potential function is being designed. The range of applicability depends on how well the model chosen represents physical reality. It is to be expected that a simple model will be applicable only to systems very similar to those on which it was parameterized. To improve a potential, the arbitrary assumptions must be abandoned, thus bringing the model closer to reality, and making it applicable to a wider range of problems; this can be accomplished at the cost of increased complexity of the model with a possible attendant increase in the expense of the computations. EPEIP0-22was developed for studies o f the conformational behavior of molecules (including dipole moments and crystal structures), and of t h e interactions between t h e m and with the solvent. Molecules are represented as bodies containing a number of centers of interaction. The potential energy is approximated by a sum of pairwise interactions between the centers of interaction; three-body and higher order terms are not considered explicitly. To study the properties of liquid water, the potential function for the interaction of an ensemble of water molecules is represented as the sum of interactions between pairs of water molecules. This potential is employed in a statistical mechanical procedure such as Monte Carlo18,26*26 or molecular dynamic^.^^,^' Thus, for a proper description of the liquid, it is necessary that the entire water dimer potential energy surface be described correctly. EPEN, which was parameterized on the experimental properties of ice and the gas phase dipole moment,20provided a valid description of hydrogen-bonded but did not give a good description of the dimer

I 0

1.0

9.0

I

I

3.0

4.0

I

5.0

6.0

R O...0, A

Figure 2. Water dimer energy as a function of O-..Odlstance: curve 1, EPEN energy;*’ curve 2, energy using potential of Matsuoka et al.;” curve 3, EPEN/2 energy2’ (see paper 2’’ for parameterization of EPEN/2).

surface in many (repulsive) regions. In particular, for the arrangement of water molecules shown in Figure 1,EPEN predicted an attractive energy. Both quantum mechanical calculations1’ and chemical intuition predict that this arrangement will be repulsive in energy. The EPEN energy of interactionz0 as a function of O...O distance is shown for this arrangement in Figure 2, and compared with similar plots using either a potential for water developed by Matsuoka et al.17 or EPEN/2 (described in paper P9). In a classical statistical mechanical procedure, where configurations of water molecules are weighted by a Boltzmann factor, such nonphysical low energy regions (curve 1 of Figure 2) will result in a liquid with very different properties from those of liquid water. This was demonstrated with preliminary Monte Carlo calculations with EPEN.30 The problem arises because, in EPEN,20the attraction (centered on the oxygen nuclei) overwhelms the repulsions (centered on the electrons). The attractions can be weakened effectively by centering the attractions (as well as the repulsions) on the electrons; i.e., the effective distance between the centers of attraction is increased from Ro.. .o to Rp., .p (defined in Figure 1). As seen in curve 3 of Figure 2, this problem is overcome in EPEN/2, whose parameterization is described in paper !LZ9 Thus, the failure of EPEN to reproduce the structural and thermodynamic properties of water may be attributed to the location of the centers of interaction and to the method used for parameterization. The arrangements of water molecules found in ice are limited; all involve molecules in low-energy hydrogen-bonded configurations, unlike that shown in Figure 1. Mathematically, ice represents a very small subset of the domain of q, and it is not surprising that a function parameterized from this small subset does not give a good description for the whole domain of q. In order to improve the EPEN description of liquid water, it was necessary to examine other areas of the dimer potential energy surface. However, a more complicated potential function is required to reproduce the values in all regions of the dimer surface. Since EPEN was developed to provide a consistent model for both water and other molecules, the whole set of saturated molecules

Description of Revised EPEN Model

The Journal of Physical Chemktry, Vol. 82, No. 23, 1978 2499

for many systems of interest, such data are unavailable. A second problem is the fact that most of the molecules studied are small, with one or two bonds about which rotation can occur. The conformational preference of a particular functional group will be determined largely by the neighboring functional groups; therefore, little information about intramolecular interactions between groups separated by several bonds is obtained. While crystals (involving intermolecular interactions) are useful souces of data (see section IIIB), it is necessary to include conformational data (involving neighboring intramolecular interactions) in the parameterization of a potential function that is intended for use in conformational analysis. B. Crystallographic Studies. Large numbers of molecules have been studied in crystalline form, using either X-ray or neutron diffraction. Several other potentials have been developed using crystal data.3-8 By examining the structure of the crystal, information about intermolecular interactions is obtained in a direct way. For small molecules, the accuracy of the atomic positions of the heavy atoms is high; neutron diffraction enables accurate positions to be assigned to the hydrogen atoms. The general procedure used with empirical potential functions is to minimize the intermolecular energy with respect to the lattice constants and additional degrees of freedom.34 These minimum-energy lattice constants (and lattice energies; see section IIIC) are compared with the corresponding experimental values. Because of the large number of molecules which must be used to simulate a crystal, this is a very time-consuming procedure. A more fundamental limitation arises regarding the applicability of a potential function, derived exclusively from calculations of crystal structures, to the interaction between (even the same) molecules not arranged in a crystal. The arrangement of molecules in a crystal is that of lowest energy; higher-energy repulsive arrangements are not 111. Sources of Experimental Data observed. In terms of the domain of q, the vector of atomic In order to study conformational behavior in solution, positions, crystals always represent the small region of q it is necessary that the potential function used describe near a low-energy minimum. As demonstrated in the case both intramolecular and intermolecular interactions. The of water, a model which describes the energy surface for data for the parameterization and testing of EPEN have this small region will not necessarily be valid over the been chosen with this in mind. In this section, we present whole domain of q. If a potential is developed to calculate an evaluation of each type of data. crystal structures, then it is expected that a potential A. Spectroscopic Studies o n Single Molecules i n t h e parameterized on crystal data will be reliable. However, Gas Phase. Experimental values of molecular geometry, the application of a potential developed from crystal data dipole moments, barriers to rotation about single bonds, to represent a liquid, where the whole energy surface must energy differences between stable conformations, and be described, will probably lead to difficulties. dihedral angles of stable conformers can be obtained from In spite of these limitations, crystal structures are the measurements on free molecules in the gas phase. The best source of experimental data on intermolecular inprincipal technique used is microwave spectroscopy. teractions. We have used them in parameterizing and Additional information on molecular geometry is obtained testing EPEN, aware of the problems involved in modeling from electron diffraction studies. a liquid. In the case of liquid water, the intermolecular Using Kraitchman’s equation^,^^ the coordinates of the input data have been supplemented with quantum meatoms of a molecule can be calculated, given the rotational chanical energies of various configurations of the water constants for a series of isotopically substituted derivatives. dimer.29 This procedure gives bond lengths and bond angles as well C. Thermodynamic Properties of Crystals and Liquids. as dihedral angles. Rotational barriers can be computed from the splittings of the rotational levels;32however, the Experimentally measured differences in internal energies of crystals and liquids also provide information about theory has been developed only for systems with one or two rotors, and involves an assumption about the shape intermolecular interactions. For crystals, the heat of sublimation at 0 K is compared with the lattice energy as of the internal rotational potential function. Dipole moments are obtained from Stark effect measurements on calculated using a potential function. These two quantities the rotational spectrum. A discussion of the experimental are not equivalent; Shipman et al.35 have discussed the difficulties in these techniques is given by S c h ~ e n d e m a n ~ ~ difference between the two, and the correction terms which and D r e i ~ l e r . ~ ~ must be added to the experimental heat of sublimation in From the point of view of parameterizing a potential order to obtain an experimental lattice energy. It is function, such data are the only experimental source of necessary to have accurate values of the heat capacities information on intramolecular interactions. Unfortunately, of both the crystal and gas phase of the molecule, as well parameterized previouslyz0also had to be reparameterized. The lesson learned from this failure of EPEN is that the complexity of the interacting systems implies that a potential function developed by examining a limited domain of molecular arrangements will not necessarily be applicable to other arrangements. Before developing a new potential function, it is important to examine the domain of q which pertains to the physical problems of interest. The reliability of the potential function will be improved if experimental data from all domains of interest are incorporated. In this work, we have attempted to use data from crystal, single molecule gas phase, and liquid studies in order to examine a larger domain of molecular arrangements. In EPEN, transferability is achieved by assembling a molecule from molecular fragments.z0 A fragment should preserve its identity in different systems. The choice of chemical functional groups as fragments is reasonable, since the chemical behavior of a functional group is usually not affected greatly by the rest of the molecule to which it is attached. Similarly, the energy of interaction of two fragments in the EPEN potential is independent of other fragments present in the system. The fragments parameterized are those necessary to construct a protein molecule; the particular systems studied are presented in section IV. The validity of EPEN, or any potential function, is tested by its ability to reproduce independent experimental quantities not used in the parameterization. Although the choice of a model, functional form, and division into fragments are arbitrary to a great extent, physical and chemical knowledge should be a guideline. Individual terms of the potential function do not have physical significance; however, a potential that accurately reproduces a large amount of experimental data in some way must be depicting the true physical nature of the system.

2500

The Journal of Physical Chemistry, Vol. 82,No. 23, 1978

as intermolecular and intramolecular vibrational frequencies. The error involved in neglecting these correction terms was approximately 10% for the crystals examined by Shipman et We have, as an approximation, compared lattice energies directly to heats of sublimation, where experimental values for lattice energies do not exist. For liquid water, one can calculate the internal energy by use of a Monte Carlo procedure. To compare this value with the experimental value of the internal energy, it is necessary to correct for quantum effects. This procedure involves the calculation of the zero point intermolecular vibrational energy, and the change in zero point intramolecular energy between liquid water and individual water molecules a t infinite separation. Details are given elsewhere.18J6 The internal energy of the liquid is an average over all orientations of water molecules, and therefore a useful source of information for testing a potential. D. Quantum Mechanical Calculations. As mentioned above, for large regions of the energy surface in molecular systems there are no experimental data available. In order to obtain information about these regions, the use of quantum mechanical calculations is becoming more ~ i d e s p r e a d The . ~ ~main ~ ~ ~advantage of these calculations is that data can be obtained over the entire domain of q. Matsuoka et have performed ab initio calculations on water dimers, using an extended basis set with configuration interaction, to develop a potential for liquid water. Since their potential was parameterized with data distributed over representative dimer configurations, it was possible to reproduce the structural and thermodynamic properties of the liquid in a Monte Carlo procedure.18~*5,26 The limitations of this technique are the accuracy of the basis set used and the cost of the calculation itself. For larger molecules such as amides or carboxylic acids, the size of the basis set required to obtain reliable results make the calculation very costly. In spite of this limitation, it seems that, if one is interested in studying liquids, it is necessary to examine high energy regions of the energy surface using theoretical calculations, since the structure of a liquid is determined largely by short-range repulsion~.~~

IV. Survey of Systems Studied and Outline of Series of Papers Since this potential is being developed for conformational analysis of polypeptides, the fragments which must be parameterized are those needed to assemble a polypeptide chain molecule. In addition, parameters must be provided for studying the hydration of the polypeptide in aqueous solution. In the second paper of this series,29water is parameterized in a revised EPEN model (viz., EPEN/2). Saturated hydrocarbons, alcohols, and amines, for which parameters were developed previously,20are also reparameterized. The third paper40 exends EPEN to unsaturated hydrocarbons, dealing with isolated double bonded systems, conjugated chain hydrocarbons, and aromatic hydrocarbons. Paper 441presents results of parameterization and testing for carbonyl compounds and carboxylic acids. Paper 542deals with amides, amino acids, and peptides. V. Procedure for Parameterization and Testing The general procedure used in parameterizing EPEN is to form the square of the deviations of a set of calculated properties from their experimental values. The sum of these square deviations is then minimized with respect to the adjustable EPEN parameters, using the algorithm of Certain quantities are weighted to give them a

J. Snir, R. A.

Nemenoff, and H. A. Scheraga

greater importance in determining the fit. The magnitudes of the weighting factors are determined by the size of the absolute deviations from experiment, and by the sensitivity of the calculated quantity to changes in the adjustable EPEN parameters. For free molecules, the properties calculated were dipole moments, dihedral angles of stable conformations, barriers to rotation about single bonds, and relative energies of stable conformers. The dihedral angles and relative energies of stable conformers were calculated by minimizing the EPEN intramolecular energy as a function of all dihedral angles, using the algorithm of Barriers for rotation of methyl groups were calculated as the difference between the maximum and minimum energies as a function of the single dihedral angle about the bond attached to the methyl group; tests performed by varying all dihedral angles gave the same barrier heights, and therefore the single rotor approximation seemed justified. Pople and RadomU have discussed the coupling of rotors in studies of rotational barriers. They concluded that, for nonpolar functional groups such as alkyl groups, the rotors are uncoupled, to a good approximation, but that coupling becomes important for polar groups. Therefore, in calculating the barrier heights for rotation of methyl groups the variation of the rotational position of only the methyl group would be a valid procedure. Calculations with EPEN/229i40confirm this. For crystals, the lattice energy was calculated as the painvise sum of interactions between all points in a central molecule, and its surrounding molecules. A version of the crystal packing program PCKGof Williams,45adapted for use with EPEN/2, was used; this program employs the accelerated convergence technique for lattice sums, which involves the use of a convergence function. The lattice sum is split into two terms, one of which is evaluated a t short distances in real space, and one of which is evaluated a t large distances in reciprocal space. This technique, which assured convergence of the electrostatic energy, is of particular advantage for polar molecules. Without this technique, it has been shown that, for certain crystals, a cutoff distance as large as 50 A does not assure convergence of the electrostatic energy.& During parameterization, the lattice energy was minimized with respect to the unit cell structure (unit cell lengths and angles) without relaxing the symmetry of the unit cell. It is necessary to recalculate the lattice sums with every trial set of EPEN/2 parameters. In the past, several approximation methods have been used to avoid this costly r e c a l ~ u l a t i o n .With ~ ~ ~ the accelerated convergence technique, the lattice sums can be computed with a relatively small number of interactions, thus enabling reminimization of the lattice energy in a reasonable amount of computer time. The final parameterized potential was tested on additional free molecules and crystals. For free molecules, a search was made for zero gradient points on the energy surface using a modified Newton-Raphson procedure,&,47 and relative energies and structures were compared with experiment. For crystal studies, energy minimization with respect to those degrees of freedom needed to define the crystal, without changing the symmetry, was carried out. The ideal test of a potential function is its ability to reproduce a correct experimental structure when the energy of an ensemble of n molecules, each consisting of m atoms, is minimized with respect to the 3nm - 6 independent degrees of freedom of the system. The number of variables for such a system of many molecules is impractically large for computations with present-day computers. In addition,

Description of Revised EPEN Model

the ( 3 n m - 6)-dimensional energy surface consists of many local minima (the so-called “multiple-minimum” problem) and most currently available minimization algorithms cannot start with an arbitrary configuration, and pass from one potential well to another of lower energy. In order to place less stringent requirements on the generality of the potential, to reduce the number of variables to a reasonable size, and to assure that the energy minimization is carried out within the correct potential well (corresponding to the observed crystal structure), resort is had to less general procedures, involving a smaller number of degrees of freedom and a proper choice of starting configuration for energy minimization. The restriction on the number of degrees of freedom, that is used most often, is the assumption that the individual molecules in the ensemble behave as rigid bodies with no internal motions. This reduces the number of degrees of freedom from 3 n m - 6 to 6(n - 1). However, since this reduction by itself is not sufficient to reduce the number of variables to a practical size or (if the starting point is arbitrary) to overcome the multiple-minimum problem, the number of degrees of freedom is reduced further by assuming that the ensemble of n molecules forms a symmetrical structure (Le., a crystal) that can be specified by symmetry operations involving the translation of a unit cell, and the relative positions and orientations of the k molecules within each unit cell. Under such a restriction the number of degrees of freedom is reduced to 9 6 ( k - l),with k usually taken fixed at the observed value. For example, as shown by Hagler and Lifson,%these can be the components of each of the three vectors required to define the unit cell, and the six external degrees of freedom of (k - 1)molecules within the unit cell, with no assumptions made about the symmetry within each unit cell; Le., the asymmetric unit is considered to be composed of the k molecules in the unit cell. For the special case of k = 1, the number of degrees of freedom reduces to 9. If, as an apparently less general procedure, the symmetry of the crystal is imposed (Le., if the positions and orientations of the molecules within the unit cell are defined by a fixed set of symmetry operations carried out on the asymmetric unit, which now contains less than the k molecules in the unit cell), then the number of degrees of freedom is reduced to a maximum of 12, for a crystal with one molecule per asymmetric unit and more than one molecule per unit cell. Thus (assuming one molecule per asymmetric unit), for k > 1,these 12 degrees of freedom can be chosen as the six unit cell parameters ( a , b, c, a, 6, and y), and the three translational and three rotational degrees of freedom of the asymmetric unit, or as the nine components of the unit cell vectors, plus the translational degrees of freedom of the asymmetric unit. For k = 1,the three translational degrees of freedom are not applicable; i.e., there are then nine degrees of freedom. If the asymmetric unit contains more than one molecule, then six more degrees of freedom are required for the translational and rotational motion of each additional molecule in the asymmetric unit. In principle, while it appears that one can use 9 + 6(h - 1) degrees of freedom to “derive” the symmetry, in practice such a calculation does not start with an arbitrary configuration of the n molecules, but instead starts with a configuration close to, or at the observed symmetry. Therefore, even though 9 + 6 ( h - 1)variables are allowed to change in the calculation, one will arrive at the observed symmetry if one starts near it, unless the potential function is an extremely bad one. The use of a starting point near

+

The Journal of Physical Chemistry, Vol. 82, No. 23, 1978 2501

the observed symmetry effectively reduces the number of degrees of freedom to a maximum of 12 (for k > 1) or a maximum of 9 (for k = l),for a crystal with one molecule pler asymmetric unit. The number 12 is a maximum, and the actual number is less for crystals of high symmetry (e.g., for a cubic crystal a = b = c, and 01 = 0 = y = goo). In testing the EPEN/2 potential we have used either 15,12, or 9 degrees of freedom, depending on the number of molecules per unit cell and the number of molecules per asymmetric unit. In certain cases, the computer time required for energy minimization with even this number proved excessive, and a smaller subset was used. We are aware that, as pointed out by Hagler and L i f ~ o na, ~ more ~ complete test requires that the full number of degrees freedom be used.

VI, Model and Potential Function (EPEN/2) In revising the EPEN model, we have retained as much as possible from the original work.20 A molecule is considered to be composed of centers of interaction which carry charge. Positive charges are located on the atomic nuclei, which carry a charge of Z - 2, where Z is the atomic number, except for hydrogen which carries a charge of +1. Negative charge centers are located off the nuclei; since charge must be conserved within a molecule, these charges have magnitudes of -1 or -2 au corresponding to single electrons or electron pairs, respectively; the bonding electrons are located along the line joining two atoms, and the lone-pair electrons or T electrons are located off this lime. In the original EPEN work,20-22the connection between the position of the negative charge centers and the real electrons of the molecule was pointed out. It should be emphasized that the negative charge centers must be regarded as centers of interaction, which need not represent the actual electron distribution. The amount of electronic charge located in the chemical bonds, as cadculated quantum mechanically, may differ from the amount of charge placed there49in the EPEN/2 model. For convenience, however, the negative charge centers will still be referred to as “electrons”. Since these “electrons” represent charge distributions which differ for different functional groups, we shall use the term “different types of electrons” to distinguish between these different charge distributions, even though electrons are indistinguishable arid there is really only one type of electron. Bond lengths and bond angles are kept fixed. Since EIPEN/2 has been formulated for application t o a polypeptide, it is necessary to keep the bond lengths and bond angles fixed in order to limit the number of variables in the calculation of the empirical energy. Although the use of fixed bond lengths and bond angles is an oversimplification for many molecules, the intended application of EPEN/2 to polypeptides makes this approximation necessary and j~stifiab1e.l~Wherever possible, experimental bond lengths and bond angles were used; in cases where no experimental geometry has been measured, a sttindard set of bond lengths and bond angles was used.22 In certain cases, difficulties arose when large changes in bond lengths and bond angles are observed in two conformations of the same molecule; these cases will be discussed in subsequent papers. Molecules are assembled from fragments, and the fragments are defined as previously.20 As before,20the distances between the electrons and the heavy atoms are fixed in the fragments. The positions of the electrons in the fragments are located by fitting to experimental data. Since only molecules without strain are considered here, the fragments (and the electron positions and nonbonded interaction’parameters within a fragment) are assumed to

2502

The Journal of Physical Chemistry, VoL 82, No, 23, 1978

be transferable from molecule to molecule. There is a Coulombic interaction between all point charges of the form 9241 Ecoul = 332.0719 Ckcal/mol (2) 1J

rlJ

where q1 and qJ are charges in atomic units, rlJ is the distance between ql and qJ in Angstrom units, and 332.0719 in units of kcal/mol. is the conversion factor to give Ecoul In addition, there is a nonbonded interaction between all electrons of the form ENB= C q l q J ( A I J e - B -~ ~C,/r,;) r ~ ~ kcal/mol (3) 11

electrons

Formerly,20 the interaction between electrons was represented by a repulsive exponential term; in addition there was an attractive nonbonded interaction between all heavy atom nuclei. The complete nonbonded function is now centered on the electrons, as indicated in section 11. Although this approach was explored earlier20 for calculation of crystal lattice energies, but not adopted, it was chosen here in order to treat repulsive interactions in water (see section 11). Each type of electron is assigned a set of A, B, and C nonbonded parameters. For the interaction between electrons of type i and type j , the combination rule used is Aij = (AiiAjj)1’2 (44 BLJ

=

+

(Bll

BJj)/2

(4b)

c, = ~ ~ 1 1 ~ , 1 ~ ” 2 (4c) The EPEN/2 energy between two molecules (Le., q1and qJ not in the same molecule) corresponds to the intermolecular potential energy relative to molecules at infinite separation. However, within a molecule (i-e.,q1 and qj in the same molecule), differences in the EPEN energy correspond to differences in intramolecular energy, but absolute values of the EPEN/2 intramolecular energy do not have physical significance. The use of the same set of parameters for dealing with both inter- and intramolecular interactions was discussed in the original EPEN work,22and therefore will not be dealt with here. The variable parameters which are adjusted during parameterization are the position of each type of electron and its set of nonbonded parameters. In the original parameterization of EPEN,2O the positions of the electrons were determined primarily by fitting to experimental dipole moments (with some input from torsional energies of carbon fragments) p = 4.80325CqlRl (debye) (5) 1

where R, is the vector position (in Angstrom units) of point charge i (all nuclei and all electrons), and p is the dipole moment vector. The nonbonded parameters were then derived by fitting to single-molecule conformational data and crystal data. It is our experience that the calculated crystal properties are sensitive to the positions of the electrons, and therefore the procedure adopted here was to allow all adjustable parameters for a particular functional group to vary simultaneously, and to be fitted to all of the data to be used. There are several advantages that the EPEN model possesses compared with atom-centered potentials. The ability to describe the interaction between two atoms by a sum of interactions between different centers, instead of by a central force between the two atoms, gives the potential the flexibility to deal with complex systems. Furthermore, the electrons have to be placed in such a way

J. Snlr, R. A. Nemenoff, and H. A. Scheraga

so as to reflect the symmetry of the molecule. Thus, the symmetry is implicit in the model and need not be accounted for by special functions such as torsional terms in the p0tentia1.l~In addition, by distinguishing between different types of electrons, the potential can be refined without changing the model. This will be demonstrated in the case of water29where, by distinguishing between the bonding electrons and the lone-pair electrons, and by assigning each an independent set of nonbonded parameters, we were able to overcome many of the difficulties described earliera20

VII. Summary The EPEN potential, which was designed originally to study the conformational behavior of organic molecules and polypeptides in aqueous solution, did not provide a qualitatively correct description of liquid water. This was due to the fact that it had been parameterized by examining only specific low-energy arrangements of water molecules. In improving EPEN, or in developing any potential function, it is necessary to decide to which physical problems the potential is to be applicable. Secondly, the important interactions (that influence the property computed) should be determined, as well as how much of the multidimensional energy surface needs to be described. The larger the portion of the energy surface to be described, the greater the demands on the potential, and the more sophisticated the model required. A constraint in developing a potential is the type of data available for parameterization and testing. Selection of a large variety of properties will give the potential a greater range of applicability. However, for many regions of the energy surface, no experimental data exist; resort to quantum mechanical results gives information about these regions, but reliable calculations are costly. The EPEN potential has been improved by changing the model; a complete Buckinghamso (exp 6) nonbonded potential acts between negative charge centers. These centers, although called electrons, should not be thought of as physically representing the electron distribution. Different types of centers are distinguished by parameters determining their position and nonbonded interactions. For a particular functional group, parameters are derived by fitting to a series of free molecule and crystal properties. Since optimization is carried out over a wide range of properties, the potential function may not be as accurate for a specific property as a potential developed to treat only that property. For example, EPEN/2 may not reproduce crystal lattice constants as accurately as a potential developed only for crystal studies. However, EPEN/2 can describe the conformational properties of the same molecule as well. Since conformational calculations are used in determining the parameters, limitations in the crystal calculations (e.g., those arising from the use of an incomplete set of variable degrees of freedom) will result in a potential function that nevertheless still is capable of treating conformational problems. T h e sacrifice of some accuracy is compensated for by the fact that the potential can be applied to a wider range o f problems. EPEN/2 was not parameterized on data from liquids. Therefore, application to liquids makes additional demands on the potential and also provides a test of the applicability of the potential function. The symmetry of a system is implicitly “built in” by placing the electrons in accordance with experimental symmetry. This eliminates the need for special functions such as torsional potentials. Also, no special function is required for hydrogen bonding systems.

Description of Revised EPEN Model

In subsequent p a p e r ~ , ~ ~this , ~ Omodel - ~ ~ and procedure are applied in parameterizing the fragments necessary to construct a polypeptide molecule. A large number of saturated and unsaturated molecules are examined, both as free molecules and in interactions with other molecules, to endow the potential function with greater reliability for a broad range of systems.

Acknowledgment. We are indebted to Dr. D. E. Williams for providing us with his crystal packing program and to Drs. J. C. Owicki and L. L. Shipman for helpful critical comments on these five manuscripts.

References and Notes (1) This work was supported by research grants from the National Science Foundation (FCM75-08691), and from the National Institute of General Medical Sciences of the National Institutes of Heab, US. Public Heakh Service (GM-14312). (2) (a) National Research Council of Canada Predoctorai Fellow, 1974-1977. (b) To whom requests for reprints should be addressed, at Cornell Universitv. (3) A. I. Kitaigorodskii, K. V. Mirskaya, and V. V. Nauchitel, Sov. fhys. Crysfal/ogr.,14, 769 (1970). (4) D. E. Williams, Acta Crysfallogr., Sect. A , 28, 629 (1972). (5) A. T. Hagler, E. Huler, and S.Lifson, J. Am. Cbem. Soc.,96, 5319 (1974). (6) A. T. kgler and S.Lifson, Acta cr>lstal&r., Secf. 6,30,1336 (1974). (7) F. A. Momany, L. M. Carruthers, R. F. McGuire, and H. A. Scheraga, J. fbys. Cbem., 78, 1595 (1974). (8) F. A. Momany, L. M. Carruthers, and H. A. Scheraga, J. fbys. Cbem., 78, 1621 (1974). (9) S. Lifson and A. Warshel, J. Cbem. fbys., 49, 5116 (1968). (10) A. Warshel and S. Lifson, J . Cbem. fbys., 53, 582 (1970). (11) G. N. Ramachandran and V. Sasisekharan, Adv. Protein Chem., 23, 283 (1968). (12) H. A. Scheraga, Adv. fbys. Org. Cbem., 6, 103 (1968). (13) H. A. Scheraga, Cbem. Rev., 71, 195 (1971). (14) N. L. Alilnger, M. T. Tribble, M. A. Miller, and D. H. Wertz, J . Am. Cbem. Soc.. 93. 1637 (1971). (15) A. Ben-Naim and’F. H. Siillinger in “Water and Aqueous Solutions”, Wiley, New York, N.Y., 1972, p 295. (16) F. H. Stillinger and A. Rahman, J. Cbem. fhys., 80, 1545 (1974). (17) 0. Matsuoka, E. Clementi, and M. Yoshimine. J. Cbem. f b v s . . 64. 1351 (1976). (18) J. C. Owicki and H. A. Scheraga, J . Am. Cbem. Soc., 99, 7403 (1977). (19) F. A. Momany, R. F. McGuire, A. W.Burgess, and H. A. Scheraga, J . fbys. Chem., 79, 2361 (1975).

The Journal of Physical Chemjstry, Vol. 82, No. 23, 1978 2503

(20) L. L. Shipman, A. W. Burgess, and H. A. Scheraga, froc. Nafl. Acad. Sci. U.S.A .. 72. 543 11975). (21) A. W. Burgess, L.’L. Shipman,and H. A. Scheraga, froc. Nafl. Acad. Sci. U.S.A., 72, 854 (1975). (22) A. W. Burgess, L. L. Shipman, R. A. Nemenoff. and H. A. Scheraaa, J . Am. chem. Soc., 98, 23 (1976). (23) P. J. Flory, Macromolecules, 7, 381 (1974). (24) N. G6 and H. A. Scheraga, Macromolecules, 9, 535 (1976). (25) G. C . Lie and E. Clementi, J. Cbem. fbys., 82, 2195 (1975). (26) G. C. Lie, E. Ciementi, and M. Yoshimine, J. Cbem. fbys., 64, 2314 (1976). (27) F. H. Stillinger and A. Rahman, J. Chem. fbys., 57, 1281 (1972). (28) J. C. Owicki, L. L. Shipman, and H. A. Scheraga, J . fhys. Chem., 79, 1794 (1975). (29) R. A. Nemenoff, J. Snir, and H. A. Scheraga, J. fbys. Cbem., paper 2, in this issue. (30) J. C. Owicki, R. A. Nemenoff, J. Snir, and H. A. Scheraga, unpublished results. (31) J. Kraitchman, Am. J . fbys., 21, 17 (1953). (32) R. H. Schwendeman in “Critical Evaluation of Chemical and Physical Structural Information”, National Academy of Sciences, Washington, D.C., 1974, p 94. (33) H. Dreizler, ref 32, p 352. (34) A. T. Hagler and S. Lifson, J . Am. Cbem. SOC.,96, 5327 (1974). (35) L. L. Shipman, A. W. Burgess, and H. A. Scheraga, J. fbys. Chem., 80. 52 - - 119761. - -, (36) A. T. Hagier, H. A. Scheraga, and G. NBrnethy, J. fbys. Cbem., 76, 3229 (1972). (37) E. Clementi, F. Cavallone, and R. Scordamagiia, J. Am. Cbem. Soc., 99. 5531 11977). (38) S.kwaminathan, R. J. Whitehead, E. Guth, and D. L. Beveridge, J. Am. Cbem. Soc., 99, 7817 (1977). (39) 6.Widom, Science, 157, 375 (1967). (40) R. A. Nemenoff, J. Snir, and H. A. Scheraaa, - J. fhvs. Chem.. .paper . 3, in this issue. (41) R. A. Nemenoff, J. Snir, and H. A. Scheraga, J. fbys. Chem., paper 4, in this issue. (42) J. Snir, R. A. Nemenoff, and H. A. Scheraga, J. Pbys. Chem., paper 5, In this issue. (43) M. J. D. Powell, ComDut. J., 7, 155 (1964). (44) J. A. Popie and L. Radom in “The Jerusalem Symposia on Quantum Chemistry and Biochemistry”, Vol. V, E. D. Bergmann and 6.Pullman, Ed., Jerusalem, 1972, p 747. (45) D. E. Williams, Acta Crysfallogr., Sect. A., 27, 452 (1971). (46) J. S. Fu and H. A. Scheraga, unpublished results. (47) J. Raphson, “Analysis Equationum Universalis”, 1690, cited by T. Cajori, “A History of Mathematics”, Macmillan, New York, N.Y., 1919, p 203. (48) P. Macdonald, ”Mathematics and Statistics for Scientists and Engineers”, Nostrand, London, 1966. p 75. (49) A. T. Hagler, prlvate communication. (50) R. A. Buckingham, froc. R. Soc.London, Ser. A , 168, 264 (1938).

-

--.

%