J. Phys. Chem. 1994, 98, 11204-11212


Parametric and Molecular Structural Relationships of Dipeptides

Barbara A. Reisner, Brian Fitzsimmons, and Herschel Rabitz*
Department of Chemistry, Princeton University, Princeton, New Jersey 08544

Tom Thacher and Jodi R. E. Fisher
Biosym Technologies, Inc., San Diego, California 92126

Chung Wong
Department of Physiology and Biophysics, Mt. Sinai School of Medicine of the City University of New York, New York, New York 10029

Received: February 16, 1994; In Final Form: July 7, 1994

A principal component analysis is applied to the study of dipeptides corresponding to the 20 naturally occurring amino acids and avian pancreatic polypeptide (APP) to quantitatively assess (1) force field parametric and (2) structural relations. The parametric principal component analysis has provided insight into the relationship between the molecular structure and the potential energy function. The molecular structures are most sensitive to the bond and angle reference parameters. The nonbond parameters also significantly influence molecular structure. The structural space of the dipeptides and APP is very closely linked to the parameter space of the potential. The bond length and bond angle internals are tightly controlled by the bond and angle reference value parameters. Torsional motion is the most sensitive degree of structural freedom and is generally controlled by nonbonded influences. In the second part of the research, principal component analysis of the molecular structure Green's function matrix provides a means to elucidate how the molecular structure will respond to internal forces introduced in the molecule. The eigenvectors of the Green's function matrix corresponding to the maximum eigenvalues determine the internal coordinates responsible for the largest molecular responses. It is found that the torsional degrees of freedom are responsible for the greatest molecular response. To complement the analysis of APP, the sensitivities of centroids associated with each residue to changes in the backbone internals were determined. The coefficients indicate the local molecular response and were found to be in agreement with previous characterization of regional flexibility.

Abstract published in Advance ACS Abstracts, September 1, 1994.

I. Introduction

Molecular mechanics is a powerful tool frequently used in the conformational study of macromolecules. Although this procedure appears remarkably robust, it is limited by the adoption of a given potential form and its parameters. Even as the potentials become more and more accurate, the question still remains as to how the various parameters and functions of the potential act together to produce the calculated molecular structure. In the hope of answering this question, parametric sensitivity analysis offers a systematic approach to extend the utility and ultimately increase the reliability of molecular mechanics calculations. Additionally, sensitivity analysis can play a role in molecular mechanics by quantitatively assessing structural relations within molecules. This can be accomplished by applying the analysis to the Green's function matrix, which describes how the internal coordinates throughout the molecule respond to a particular internal force exerted on another internal coordinate or on several internal coordinates elsewhere in the molecule. Since an internal force can be viewed as arising from a hypothetical functional modification of the molecule, this analysis may ultimately be useful when considering functional modification of a molecule for purposes of structural alteration.

The analysis results in the generation of many sensitivity coefficients, which are a set of values representing the response of a specific structural observable to a perturbation in a specific structural or potential parameter. The number of coefficients generated is directly proportional to the size of the molecule being analyzed, and thus some means of condensing the data must be developed before this methodology can be effectively applied to the study of macromolecules such as proteins. Principal component analysis (PCA) is one technique which proves very useful in accomplishing such a data condensation. Originally applied in sensitivity analysis to the study of chemical kinetics and reaction mechanisms, PCA provides a measure of the overall structural response of a molecule to collective parameter variations. In the present work the variations involve either the potential parameters or a selected set of structural parameters.

The present study lays the groundwork for the application of PCA to large biomolecules by first applying the methodology to a conformational study of the 20 naturally occurring amino acids. The amino acids have been constructed as N-methyl-N'-acetyl amides (called dipeptides throughout this paper) to provide backbone interactions similar to those found in large proteins. The similarity of Ramachandran plots mapping accessible backbone torsions in both proteins and dipeptides has been proven by high-resolution X-ray crystallography of over 500 protein structures, and thus dipeptides are often studied as models for protein structural interactions. Furthermore, the amino acids and their derivatives are biochemically important in their own right and warrant study. Finally, the analysis is carried out on a larger, more biologically relevant system, avian pancreatic polypeptide (APP), a 36-residue hormone. The minimized structure of APP

0022-3654/94/2098-11204$04.50/0 © 1994 American Chemical Society


J. Phys. Chem., Vol. 98, No. 43, 1994 11205

has features similar to those found experimentally in the crystal and in solution, namely a polyproline-type helix from residues 2 to 8, an α helix from residues 14 to 28, a flexible β turn region between the two helices, and a flexible C-terminus.

In this paper, section II will describe the principal component analysis methodology. Section III will discuss the results of the principal component analysis as applied to the potential energy function (section IIIa) as well as to structure-structure relationships (section IIIb). Finally, section IV will present our conclusions.

II. Methodology

The force field used in this study is a modified version of the consistent valence force field (CVFF) originally developed by Lifson et al.10 This force field was parametrized primarily for peptides but is capable of modeling other functional groups as well. The functional form of the modified CVFF consists of 10 energy functions, each containing various parameters.

A. CVFF Energy Functions. The total potential energy is given by the sum of these 10 energy functions.

The set of variable parameters, $p_n$, to be studied in the sensitivity analysis was initially defined to include all parameters in the potential excluding the order ($n$) and phase ($t$) parameters. However, after extensive analysis of the results we found it useful to define an abridged parameter set comprised of only the torsion, out-of-plane, and nonbond (both electrostatic and van der Waals) terms. In doing so, we exclude the portion of the potential associated with "stiffer" potential terms.

B. Sensitivity Analysis. The molecular mechanics equations are

$$F_{i,\alpha} = -\frac{\partial V}{\partial r_{i,\alpha}} = 0 \tag{12}$$

where $F_{i,\alpha}$ is the Cartesian force on the $i$th atom in the $\alpha = x$, $y$, or $z$ direction and $V$ is the total potential energy. The Cartesian position sensitivity coefficients, $\partial r_{i,\alpha}/\partial p_n$, are calculated by taking the explicit and implicit derivative of eq 12 with respect to $p_n$, yielding

$$\sum_{j,\beta} \frac{\partial^2 V}{\partial r_{i,\alpha}\,\partial r_{j,\beta}}\,\frac{\partial r_{j,\beta}}{\partial p_n} + \frac{\partial^2 V}{\partial r_{i,\alpha}\,\partial p_n} = 0 \tag{13}$$

or in matrix form:

$$HS + K = 0 \tag{14}$$

In the above equation, $H$ is the Hessian, $S$ is the matrix of sensitivity coefficients, and $K$ is the inhomogeneity. The desired sensitivity matrix, $S$, can be calculated by solving eq 14:

$$S = -H^{-1}K \tag{15}$$

Here it is understood that in forming $H^{-1}$ the six eigenvectors corresponding to rigid body motion have been removed. The Cartesian position sensitivities are log normalized to allow for comparisons between sensitivities derived from different parameter types and magnitudes:

$$\frac{\partial r_{i,\alpha}}{\partial \ln p_n} = p_n\,\frac{\partial r_{i,\alpha}}{\partial p_n} \tag{16}$$

Unless otherwise stated, all sensitivities are log normalized with respect to the parameters.

For the structure-structure sensitivity analysis, the Green's function is initially obtained in terms of the Cartesian coordinates from the following relationship:

$$HG = I \tag{17}$$

where $I$ is the projection into a $(3N - 6)$-dimensional space, because both rigid translational and rotational motion have been removed. Since an internal coordinate Green's function is more desirable for physical interpretation purposes, the matrix is transformed using the Wilson S-vectors,11

$$G_{kl} = \sum_{i,j} \frac{\partial s_k}{\partial x_i}\,G_{ij}\,\frac{\partial s_l}{\partial x_j} \tag{18}$$

where $s_k$ and $s_l$ are the particular internal coordinates. The internal coordinates, $s_k = s_k(f_1, \ldots, f_n)$, where $n$ is the number of internal coordinates, will respond to an internally generated differential force vector, $df$, in the following fashion:

$$ds = G\,df \tag{19}$$
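The linear algebra of eqs 14-19 can be illustrated on a toy system. The following is a minimal sketch, not the authors' implementation: it assumes a hypothetical one-dimensional chain of three atoms joined by two harmonic bonds, so only one rigid-body mode (translation) has to be projected out instead of six. The force constants `k`, the reference lengths `b`, and the helper `projected_inverse` are illustrative assumptions.

```python
import numpy as np

def projected_inverse(H, tol=1e-10):
    """Invert H on the subspace orthogonal to its (near-)zero eigenmodes."""
    w, U = np.linalg.eigh(H)
    keep = np.abs(w) > tol * np.abs(w).max()   # drop rigid-body (zero) modes
    return (U[:, keep] / w[keep]) @ U[:, keep].T

# Hypothetical toy model: V = 1/2 k1 (x2-x1-b1)^2 + 1/2 k2 (x3-x2-b2)^2
k = np.array([300.0, 450.0])   # assumed force constants
b = np.array([1.5, 1.3])       # assumed reference lengths, used as the parameters p_n

# Hessian d2V/dx_i dx_j and inhomogeneity K[i, n] = d2V/dx_i db_n
H = np.array([[ k[0], -k[0],         0.0],
              [-k[0],  k[0] + k[1], -k[1]],
              [ 0.0,  -k[1],         k[1]]])
K = np.array([[ k[0],  0.0],
              [-k[0],  k[1]],
              [ 0.0,  -k[1]]])

G = projected_inverse(H)       # Cartesian Green's function, H G = projector (eq 17)
S = -G @ K                     # sensitivities dx_i/dp_n (eq 15)
S_log = S * b                  # log-normalized: dx_i/d ln p_n = p_n dx_i/dp_n

# Rows of B are the derivatives of the two bond lengths with respect to x
B = np.array([[-1.0,  1.0, 0.0],
              [ 0.0, -1.0, 1.0]])
G_int = B @ G @ B.T            # internal-coordinate Green's function (ds = G df)
```

In this toy case each bond length follows its own reference value one-to-one (`B @ S` is the identity), and `G_int` reduces to the diagonal matrix of inverse force constants, which is the expected response of independent springs to internal forces.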

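C. Principal Component Analysis. The log normalized sensitivity coefficient, $S_{i,n} = \partial x_i/\partial \ln(p_n)$, describes the Cartesian coordinate response resulting from a unit change in the $n$th parameter. Thus in matrix-vector form we have

$$dX = S\,dP \tag{20}$$

where $dP$ is the vector of relative parameter changes, $dp_n/p_n$. Equation 20 contracted upon itself yields

$$\sigma^2 = dX^T dX = dP^T S^T S\,dP \tag{21}$$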


orthogonal matrix. Thus, eq 27 becomes

The scalar, $\sigma^2$, gives a measure of the overall molecular structural response to a combination of parametric disturbances, $dP$. The $S^T S$ matrix can be diagonalized by an orthogonal transformation matrix, $V$, such that

$$V^T (S^T S)\,V = \Lambda \tag{22}$$

Combining eqs 21 and 22 gives

$$\sigma^2 = \sum_n \lambda_n \left[ \left( \frac{dp}{p} \right)^T \cdot V_n \right]^2 \tag{23}$$
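The decomposition above can be checked numerically. The sketch below is illustrative only: the sensitivity matrix `S` is filled with random numbers as a stand-in for real log normalized sensitivities, and the dimensions (9 coordinates, 4 parameters) are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.normal(size=(9, 4))        # hypothetical sensitivities: 9 coordinates x 4 parameters

# Diagonalize S^T S; eigh returns ascending eigenvalues and orthonormal eigenvectors (columns)
lam, V = np.linalg.eigh(S.T @ S)

dP = rng.normal(size=4)            # an arbitrary relative parameter change dp/p

# Overall structural response computed two ways
sigma2_direct = np.sum((S @ dP) ** 2)        # dP^T S^T S dP
sigma2_pca = np.sum(lam * (dP @ V) ** 2)     # sum_n lambda_n [(dp/p)^T . V_n]^2

# A unit parameter change along the leading principal component
sigma2_top = np.sum((S @ V[:, -1]) ** 2)     # equals the largest eigenvalue
```

Both routes give the same response, and a unit change along a single eigenvector returns its eigenvalue, which is why the eigenvalues rank the parameter combinations by their influence on the structure.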

We may understand this equation by making a particular parameter change ($dp/p$) proportional to the elements of the eigenvector $V_n$, to yield,

Each eigenvector, $V_n$, corresponding to a particular $\lambda_n$, is called a principal component of the parameter space. The principal components corresponding to small eigenvalues ($\lambda_n$