Chapter 16
The Basic Shape Topology of Protein Interfaces
John Lawton, Melanie Tudor, and W. Todd Wipke
Molecular Engineering Laboratories, Department of Chemistry and Biochemistry, University of California at Santa Cruz, Santa Cruz, CA 95064
Department of Chemistry, Mercer University, Macon, GA 31207
Downloaded by GEORGETOWN UNIV on August 25, 2015 | http://pubs.acs.org Publication Date: July 7, 1999 | doi: 10.1021/bk-1999-0719.ch016
Basic Shapes are a set of eight differential geometric shape descriptors that capture domain-independent local surface information. This paper describes the use of these shapes to study the surface complementarity of interaction regions in three classes of complexes: protein inhibitor-protein, protein oligomer, and protein-DNA. We derive a shape-shape association plot and a shape parameter affinity model (SPAM) that help in analyzing the degree of shape complementarity.
Shape is an integral part of chemistry, particularly in the area of molecular recognition. At the lowest level, shape dictates the possible orientations that can occur between molecules, which influences their physical properties such as reactivity [1,2], solubility [2], and associativity [3,4]. In this paper, we have chosen to study the nature of shape associations at protein interfaces.
Previously, we have demonstrated that with a program called QSDock (Quadratic Shape Descriptor Docking Algorithm) [5,6], it is possible to align similar or complementary molecules accurately and efficiently using only shape. QSDock uses local surface properties, including surface normals and principal curvatures, to determine transformations intended to optimize either the shape similarity or the shape complementarity between two molecules. The method was found to be both fast and accurate. It docks molecules two orders of magnitude faster than other docking algorithms, and the docked molecule positions have average root mean square deviations (rmsd) from crystal ligand orientations of less than 1.0 Å. A key feature of the QSDock program was the explicit use of shape, which reduced the computational complexity of docking.
Shape information can also be used explicitly to improve the accuracy of molecular shape comparisons. This can be done by directly comparing the local shapes of two
Corresponding author.
© 1999 American Chemical Society
In Rational Drug Design; Parrill, A., et al.; ACS Symposium Series; American Chemical Society: Washington, DC, 1999.
aligned molecular surfaces. Currently, shape comparisons are done implicitly, by calculating the common surface area or common volume between two molecules using a three-dimensional (3D) grid-based approach [7]. We were particularly interested in using shape complementarity as a metric for ranking our plausible dockings.
Before one can attempt to quantify shape associations, it is important to observe experimentally the shapes and their interactions in analogous known systems. This raises the obvious question, "How complementary in shape are the interface surfaces of molecules that are found to bind in nature?" A second question then follows, "Is the shape complementarity in nature constant, or is it sensitive to the size of the molecules interacting?" The 3D structures of complexes determined from crystal and NMR experiments are valuable resources that have made it possible to survey the types of shapes and the types of shape associations that occur at the interface regions. This information should provide a better understanding of the degree to which shape complementarity exists at protein interfaces and the importance of shape complementarity for molecular recognition.
Background. In 1986, Connolly used the shape features of protein surfaces as a basis for docking proteins [8]. His reasoning was that proteins could be docked by associating a set of complementary features of each surface. He successfully docked the alpha and beta subunits of hemoglobin by matching a set of three or four peaks to a set of pits. While he was not able to elucidate all the docked conformations in his test set, the novel use of shape gave us a glimpse of the potential of explicit shape representation for molecular shape comparison. Since then, several other groups have presented work based on the perception of local and global shape properties of molecules [9-16].
In this paper we present a topological survey of the basic shapes [17,18] of protein interfaces for three classes of protein complexes. The basic shapes are a set of differential geometric shape descriptors that capture domain-independent local surface information. The intent of this work is to lay the computational groundwork for a quantitative measure of molecular shape complementarity or similarity.
Shape: Local Versus Global. We choose to focus on the local aspect of molecular shape. Hence, the shape of each molecule is broken down to a set of shapes distributed over the whole surface. This choice underscores our interest in sub-shape comparisons, and ultimately in developing methodology for quantitatively measuring the shape complementarity between two objects that may differ in size.
Experimental
The 3D atom coordinates used in this work were taken from the Brookhaven Protein Data Bank [19]. The dataset consists of three classes: protein inhibitor-protein complexes (PI-PR), protein oligomer complexes (P-OLI), and protein-DNA complexes (P-DNA). A complete listing of all PDB files used in this work is included in Table 1.
Protein Complex Classes.
The PI-PR class features proteins that present a portion of their backbone as the ligand to a receptor on another protein. The ligand region of
Table 1: Protein complexes used in this study
PDB    Complex                                                         R (Å)
Protein inhibitor-Proteins
1cho   α-Chymotrypsin/turkey ovomucoid (3rd domain) [20]               1.8
2ptc   Trypsin/PTI [21]                                                1.9
2sec   Subtilisin/eglin-c [22]                                         1.8
2sni   Subtilisin novo/chymotrypsin inhibitor 2 [22]                   2.1
Protein Oligomers
1bov   Verotoxin-1 [23]                                                2.2
1lyn   Sperm lysin [24]                                                2.75
2pab   Phosphofructokinase [25]                                        1.8
4hvp   HIV-1 protease [26]                                             2.3
Protein-DNA
1cgp   Catabolite gene activator protein (CAP)-DNA complex [27]        3.0
1gat   Erythroid transcription factor GATA-1 [28]                      N/A
1tsr   P53 core domain protein-DNA complex [29]                        2.2
the backbone interacts with the other protein in a fashion analogous to a peptidic ligand-receptor complex. These interactions typically have a greater percentage of electrostatic character than typical protein interfaces [3]. In terms of shape associations, it is expected that at short distances, the protein inhibitor associations with the receptor should be primarily between convex shapes on the inhibitor and concave shapes on the receptor. As the distance increases, we expect the shape associations to be analogous to protein oligomers, which associate mainly through non-specific van der Waals interactions.
The P-OLI complexes used in this study consisted of identical subunits. The shape distributions for each subunit are expected to be highly similar, and the shape associations should be symmetrical.
The proteins in the P-DNA class were constrained to have interactions with the major groove of DNA, although they were not precluded from interacting with the minor groove. The major groove of DNA appears as a smooth surface that flows along the base pair trajectory. The local shape of these surfaces is not expected to be high in information. However, the recognition of DNA is a chemical phenomenon, where the protein interacts with a specific sequence of base pairs [30]. If chemical recognition occurs at the base pairs, shape profiles for the protein may be similar to the shape profiles of the P-OLI protein inhibitor.
Shape Perception.
The perception of molecular shape is a two-step process that first involves the generation of a molecular surface, followed by the characterization of the local shape for each point on the molecular surface. We have chosen to use the Connolly Ms-Dot program [31] with a probe radius of 1.4 Å and a surface point density of 4 pts/Å² for generating molecular surfaces. The Connolly solvent accessible surface [32,33] provides a continuous surface, which is required by our subsequent shape characterization methodology. We then smoothed the solvent accessible surface using
a simple convolution algorithm to remove local surface undulations. The convolution algorithm works by averaging the 3D coordinates of each point and its neighbors within a distance d = 2.0 Å. This gives a smoothed molecular surface (SMS), which is used as the basis for shape characterization. The Connolly solvent accessible surface and our smoothed molecular surface of methotrexate are pictured in Figure 1 for comparison.
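The averaging step described above can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the function name `smooth_surface`, the single averaging pass over the original coordinates, the inclusive distance test, and the brute-force neighbor search are all assumptions made for the example.

```python
import math


def smooth_surface(points, d=2.0):
    """Return a smoothed copy of a list of (x, y, z) surface points.

    Each point is replaced by the mean of itself and every other point
    lying within distance d (same length units as the coordinates).
    Assumption: neighbors are taken from the unsmoothed input, and a
    point at exactly distance d counts as a neighbor.
    """
    smoothed = []
    for p in points:
        # The point itself has distance 0, so it is always in its own neighborhood.
        neighbors = [q for q in points if math.dist(p, q) <= d]
        n = len(neighbors)
        smoothed.append(tuple(sum(q[i] for q in neighbors) / n for i in range(3)))
    return smoothed


if __name__ == "__main__":
    # Two close points are pulled together; the isolated point is unchanged.
    demo = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    print(smooth_surface(demo, d=2.0))
```

The brute-force search is quadratic in the number of surface points; a dot surface of a whole protein would likely call for a spatial grid or tree to find neighbors.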
Figure 1: Rendered Connolly molecular surface (left) and a rendered smoothed molecular surface (right) of methotrexate.
The local range curvatures [5,6] were calculated for each point on the SMS by fitting a polynomial of the form $ax^2 + bxy + cy^2$ to a circular surface patch of radius r = 2.0 Å. Determination of the least squares estimators $\beta$ [34] for parameters a, b, and c gave the coefficients of the second fundamental form II, or the Hessian matrix. For each point, the principal directions, the direction of the minimum curvature ($k_{min}$) and maximum curvature ($k_{max}$), were calculated by determining the eigenvectors of II [35].
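A minimal sketch of the fitting step follows. It assumes the patch points have already been expressed in a local frame with the surface point at the origin and the outward surface normal along z; that frame, the function names, and the solution of the normal equations by Cramer's rule are assumptions made for illustration and are not taken from the text. For the surface z = ax² + bxy + cy², the Hessian at the origin is [[2a, b], [b, 2c]], and its eigenvalues are returned as the principal curvatures. The corresponding eigenvectors, which give the principal directions, are omitted for brevity.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadric(patch):
    """Least-squares fit of z = a*x^2 + b*x*y + c*y^2 to local-frame points.

    patch: iterable of (x, y, z) tuples in the assumed local frame.
    Returns the estimated coefficients (a, b, c).
    """
    rows = [(x * x, x * y, y * y) for x, y, _ in patch]
    zs = [z for _, _, z in patch]
    # Normal equations: (A^T A) beta = A^T z
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(3)]
    det = _det3(ata)
    if abs(det) < 1e-12:
        raise ValueError("patch is degenerate; the quadratic fit is not unique")
    coeffs = []
    for k in range(3):
        # Cramer's rule: replace column k of A^T A with A^T z.
        mk = [[atz[i] if j == k else ata[i][j] for j in range(3)] for i in range(3)]
        coeffs.append(_det3(mk) / det)
    return tuple(coeffs)


def principal_curvatures(a, b, c):
    """Eigenvalues (k_min, k_max) of the Hessian [[2a, b], [b, 2c]]."""
    center = a + c
    radius = math.hypot(a - c, b)
    return center - radius, center + radius


if __name__ == "__main__":
    # Synthetic patch sampled from z = 0.5 x^2 + 0.1 y^2 on a small grid.
    patch = [(i / 2, j / 2, 0.5 * (i / 2) ** 2 + 0.1 * (j / 2) ** 2)
             for i in range(-2, 3) for j in range(-2, 3)]
    a, b, c = fit_quadric(patch)
    print(principal_curvatures(a, b, c))
```

With z along the outward normal, a convex patch falls away from the tangent plane and yields negative coefficients, so the curvature signs depend on the normal orientation chosen for the local frame.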
Gaussian and Mean Curvature. The Gaussian curvature (K) and the mean curvature (H) were used to classify shapes on the molecular surface. Gaussian and mean curvature represent the local second-order surface characteristics that possess the necessary invariance properties for this work [17]. The values for the Gaussian curvature and the mean curvature were computed from the principal curvatures $k_{min}$ and $k_{max}$ using Equations 1 and 2, respectively.
$$K = k_{min} \, k_{max} \tag{1}$$
$$H = \frac{k_{min} + k_{max}}{2} \tag{2}$$
The Gaussian curvature is the product of the principal curvatures and the mean curvature is the average of the principal curvatures. Compared to the principal curvatures, the Gaussian curvature is more sensitive to noise and the mean curvature is less sensitive
to noise in the surface points. Additionally, Gaussian curvature is an intrinsic property of the surface which makes it insensitive to its orientation. The Gaussian curvature represents the continuity of the curvature of a surface region; it is positive for both concave and convex regions, negative for saddle regions and zero for flat regions. The mean curvature is an extrinsic property of the surface and is sensitive to its orientation. Its sign is negative for convex regions and positive for concave regions on the surface.
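The sign conventions just described can be illustrated with a short sketch. The function names and the tolerance `tol` used to decide when a curvature counts as zero are hypothetical choices made for this example; the mapping of signs to region types follows only the statements in the preceding paragraph.

```python
def gaussian_curvature(k_min, k_max):
    """K: product of the two principal curvatures."""
    return k_min * k_max


def mean_curvature(k_min, k_max):
    """H: average of the two principal curvatures."""
    return 0.5 * (k_min + k_max)


def describe_region(k_min, k_max, tol=1e-6):
    """Coarse region type from the signs of K and H.

    tol is an illustrative zero tolerance, not a value from the text.
    K > 0 is convex or concave (H < 0 convex, H > 0 concave),
    K < 0 is a saddle, and K near zero is reported as flat.
    """
    k = gaussian_curvature(k_min, k_max)
    h = mean_curvature(k_min, k_max)
    if k < -tol:
        return "saddle"
    if k > tol:
        return "convex" if h < 0 else "concave"
    return "flat"


if __name__ == "__main__":
    for pair in [(-1.0, -0.5), (0.5, 1.0), (-1.0, 1.0), (0.0, 0.0)]:
        print(pair, describe_region(*pair))
```

This coarse labeling uses H only to split the K > 0 case. A region with K near zero but nonzero H is cylindrical rather than planar, which is one reason the sign of H is considered together with the sign of K when finer surface types are needed.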
The Basic Shapes. The basic shapes are derived from the signs of the Gaussian and mean curvature of the surface, which yields eight basic surface types: peak, ridge, saddle ridge, flat, minimal saddle, saddle valley, valley, and pit. The use of Gaussian and mean curvature, as opposed to the principal curvatures, allows saddle shapes to be resolved into saddle ridge, saddle valley, and minimal saddle. Table 2 depicts the mapping of Gaussian and mean curvature to surface type. The signs of the Gaussian and mean curvature are computed using Equation 3
{
+
if
K>z
0
if-z