Chiral Ramachandran Plots I: Glycine - Biochemistry (ACS Publications)

Article Cite This: Biochemistry 2017, 56, 5635-5643

pubs.acs.org/biochemistry

Chiral Ramachandran Plots I: Glycine

Yael Baruch-Shpigler,† Huan Wang,†,‡ Inbal Tuvi-Arad,*,‡ and David Avnir*,†

†Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
‡Department of Natural Sciences, The Open University of Israel, Raanana 4353701, Israel



Supporting Information

ABSTRACT: Ramachandran plots (RPs) map the wealth of conformations of the polypeptide backbone and are widely used to characterize protein structures. A limitation of the RPs is that they are based solely on two dihedral angles for each amino acid residue and provide therefore only a partial picture of the conformational richness of the protein. Here we extend the structural RP analysis of proteins from a two-dimensional (2D) map to a three-dimensional map by adding the quantitative degree of chirality, the continuous chirality measure (CCM), of the amino acid residue at each point in the RP. This measure encompasses all bond angles and bond lengths of an amino acid residue. We focus in this report on glycine (Gly) because, due to its flexibility, it occupies a large portion of the 2D map, thus allowing a detailed study of the chirality measure, and in order to evaluate the justification of classically labeling Gly as the only achiral amino acid. We have analyzed in detail 4366 Gly residues extracted from high resolution crystallographic data of 160 proteins. This analysis reveals not only that Gly is practically always conformationally chiral, but that upon comparing with the backbone of all amino acids, the quantitative chirality values of Gly are of similar magnitudes to those of the (chiral) amino acids. Structural trends and energetic considerations are discussed in detail. Generally we show that adding chirality to Ramachandran plots creates far more informative plots that highlight the sensitivity of the protein structure to minor conformational changes.

Ramachandran plots (RPs) are a key tool in structural protein studies and have been applied repeatedly and successfully for the investigation and validation of protein structures since their inception by Ramachandran et al. in 1963.1 The RP is a two-dimensional (2D) map based on two types of dihedral angles within the polypeptide backbone, ϕ and ψ, defined in Figure 1. Each data point in the plot represents the dihedral angles pair (ϕ, ψ) of a single amino acid residue in the polypeptide chain. The collection of all such data points provides a plain view of the backbone conformations that characterize a given protein or of the typical amino acid conformations found across the secondary structures of many proteins. There are four prototype RPs, out of which we focus in this report on the glycine (Gly) RP, which, as the name implies, collects the (ϕ, ψ) pairs of the Gly residues. As seen in the Gly map presented in Figure 2, which we generated from 4366 Gly residues taken from 160 proteins (see Methods, below), the dots of the map tend to cluster in specific zones. Clustering is typical of all types of the RPs, and reflects, as mentioned above, the preferred conformations of the amino acids within the folded backbone of the protein.2−4 Part of the success of the RPs is due to the simplification they have offered in presenting the rather complex structure of proteins. However, that simplification is both the strength and the weakness of these maps: Simplification gives away some of the conformational richness which characterizes each of the amino acid residues by reducing the description of the conformer to two specific geometrical parameters (the two dihedral angles).

Here we propose to add to the 2D RP a third dimension parameter which takes into account all bond angles and bond lengths, namely, the quantitative value of the degree of chirality of an amino acid residue within the polypeptide chain. These additional data will tell us what is the continuous chirality measure (CCM,5−7 explained below) of that specific residue, for each point in the map. The classical “yes-or-no” language of chirality, that is, it either exists or not, is too limiting when describing the conformational richness of proteins and a quantitative scale for that structural property is used instead. As will be evident below, the flat (black-and-white) 2D RP will be colored into a three-dimensional (3D) map with various degrees of chirality, adding significantly to the structural information that the map provides: Not only the (ϕ, ψ) pair is given, but also what characterizes the conformation at that point. Of course, all amino acids are structurally labeled as chiral (except Gly), but that label is only a first order description of that property: The flexibility of the amino acids which allows specific conformational adaptation in their backbone location reflects a much richer chirality picture, that changes continuously and is determined by all bond angles and all bond lengths, which may vary, sometimes significantly, from one location to the other; the CCM quantifies this variability and adds the third dimension to the plot. We selected Gly for this study based on two main reasons: The first is the flexibility of this smallest amino acid, due to the

© 2017 American Chemical Society

Received: June 1, 2017 Revised: August 24, 2017 Published: September 5, 2017

DOI: 10.1021/acs.biochem.7b00525 Biochemistry 2017, 56, 5635−5643


lack of the β-carbon, which causes its appearance in all four quadrants of the RP. This property has been advantageous for this report, as it allowed the exploration of a wide variety of (ϕ, ψ) conformational pairs;4 and second, we have wished to emphasize the fact that the traditional label of Gly as an achiral amino acid misses by far the fact that within proteins Gly is always chiral to some degree. That induced chirality, its level, the handedness of the induced chirality, the relation between the (ϕ, ψ) pair and the chirality, the relation between the degree of chirality of Gly (the CCM value) and its occupying specific secondary structures: all are data currently unavailable to the protein community. We show in this study that introducing chirality to the RP is an eye opener, adding a wealth of structural information which has not been available so far in an easily accessed manner, adding new insight into the fine details of protein structures, as detailed below.

Figure 1. A glycine residue within the protein. The Ramachandran dihedral angles ϕ (the dihedral angle around N−Cα bond) and ψ (the dihedral angle around Cα−C bond) are marked in blue. Atoms in red define the Ramachandran subunit which is used to calculate the chirality.

Figure 2. A Gly Ramachandran plot, obtained from 4366 Gly residues extracted from 160 proteins. Quadrant regions are marked with red Roman numerals.

METHODS

Creating the Database of the Proteins - Selection Criteria. The coordinates of 160 proteins used in this work were extracted from the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB).8,9 In order to ensure that only high quality proteins with minimal statistical bias10,11 will be selected, several criteria for filtering the proteins were set: (i) The experimental method was X-ray crystallography with a resolution of 1.6 Å or better, and an “Excellent” grade as defined by FirstGlance in Jmol;12 (ii) Only monomeric chains were used, and if the PDB file included more than one peptide chain, the longest chain was chosen and the first chain in cases of equal lengths. (iii) DNA, RNA, or hybrid chains were filtered out. (iv) Averaged B-factors, representing the mean square isotropic displacement of each atom,13 higher than 30 Å² were excluded;14 (v) Rfree grade,15 which measures the quality of fitting a simulated diffraction pattern to the analyzed experimental diffraction pattern, was “much better than the average at their resolution” according to FirstGlance in Jmol;12 (vi) the database was further filtered by checking the organism source; if the proteins obtained from the same organism source shared identical leading-codes of PDB IDs, we double checked their structures and chose one of them as a candidate. A full list of the proteins used in this work is provided as Supporting Information (SI).

Cleaning the PDB Files - the PDBcleaner. Prior to analysis, each protein in our database was cleaned with our home-built Python script, PDBcleaner, to delete ligands, solvents, and noncoordinate lines (e.g., ANISOU data representing anisotropic temperature factors) from the ATOM section in the PDB file,8,9 and to choose the first location in cases of alternate locations of specific residues.

Cutting Proteins into Subunits - the PDBslicer. Our in-house Perl program, PDBslicer, was used to extract 4366 Gly residues from the 160 proteins in our database. PDBslicer cuts each peptide chain into subunits according to the user’s preferences.

The Continuous Chirality Measure. The continuous chirality measure (CCM) is a special case of the more general continuous symmetry measure (CSM), a computational tool which enables one to determine, on a continuous scale, the degree of the content of any symmetry point group, G, of any (molecular) structure.5−7 It is a distance function that evaluates the minimal distance that the points of an object have to move, in order to be transformed into a shape of the desired symmetry that retains the chemical identity of each vertex and the connectivity map of the original structure. Mathematically it is defined as

$$S(G) = \frac{100 \cdot \min \sum_{k=1}^{N} |Q_k - P_k|^2}{d^2} \qquad (1)$$

Here G is the studied point group symmetry, {Qk} is the set of coordinates of the studied structure, {Pk} is the computationally searched corresponding set of coordinates of the nearest G-symmetric configuration, N is the number of vertices in the object, and d² is a size normalization factor, the mean square of the original object vertices distances from its center of mass, Q0:

$$d^2 = \sum_{k=1}^{N} |Q_k - Q_0|^2 \qquad (2)$$
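As a rough illustration of eqs 1 and 2, the following Python sketch evaluates S(Cs) for a small set of atomic coordinates under one simplifying assumption that is ours, not the authors': the only allowed atom permutation is the identity, which seems reasonable for a chain such as C−N−Cα−C−N whose reversed sequence does not preserve atom types. Under that assumption the nearest mirror-symmetric structure has every atom on the mirror plane, so the minimized sum in eq 1 equals the smallest eigenvalue of the centered scatter matrix. The function names are hypothetical, and the CSM software used in the paper performs a more general search.

```python
import math

def smallest_eigenvalue_sym3(a):
    """Smallest eigenvalue of a symmetric 3x3 matrix (closed-form trigonometric method)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    if p2 == 0.0:
        return q  # the matrix is a multiple of the identity
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding error
    angle = math.acos(r) / 3.0
    return q + 2.0 * p * math.cos(angle + 2.0 * math.pi / 3.0)

def s_cs_identity(coords):
    """S(Cs) per eqs 1 and 2, ASSUMING the identity permutation only.

    coords: list of (x, y, z) tuples. Returns a value that is 0 for a planar
    (achiral) arrangement and grows as the atoms move away from their best plane.
    """
    n = len(coords)
    center = [sum(c[i] for c in coords) / n for i in range(3)]       # Q0
    centered = [[c[i] - center[i] for i in range(3)] for c in coords]
    scatter = [[sum(v[i] * v[j] for v in centered) for j in range(3)] for i in range(3)]
    d2 = scatter[0][0] + scatter[1][1] + scatter[2][2]               # eq 2
    # Sum of squared distances to the best-fit plane = smallest eigenvalue.
    min_sum = max(0.0, smallest_eigenvalue_sym3(scatter))
    return 100.0 * min_sum / d2                                      # eq 1

# Illustrative (made-up) coordinates: a planar zigzag and a twisted variant.
planar = [(0, 0, 0), (1, 0.5, 0), (2, 0, 0), (3, 0.5, 0), (4, 0, 0)]
twisted = [(0, 0, 0), (1, 0.5, 0), (2, 0, 0), (3, 0.5, 0), (4, 0, 1)]
mirror = [(x, y, -z) for x, y, z in twisted]
```

The planar set gives a value of zero, while the twisted set and its mirror image give the same positive value, reflecting that the measure quantifies the degree of chirality but not its handedness.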

The values of S(G) can range from zero (the object is perfectly G-symmetric) up to (rarely reached) 100. When operated as a CCM,16 eq 1 evaluates the distance of the studied structure from the closest achiral symmetry point group, namely, from an improper Sn point group symmetry, such as mirror symmetry and inversion symmetry. The CCM is therefore defined as CCM = min[S(Sn)]; n = 1, 2, 4, 6, 8... For the main subunit analyzed in this report, the Ramachandran subunit (Figure 1), S(G) is always S(Cs), that is, the distance from having reflection symmetry (this is because any attempt to find a structure with higher Sn symmetry will result in higher CCM values). As for the parent CSM, the CCM is continuous. A value of zero represents an achiral structure and the measure increases as the structure is more chiral. The two central properties of the CCM are first that unlike specific geometric parameters (such as individual bond angle and bond length), the CCM is a global parameter which takes into account all of the bond angles and lengths, and is therefore special in that it provides a thermodynamic-like parameter that encompasses all effects into a characteristic chirality value; and second, that the nearest achiral structure is not fixed, but searched. It should be noted that the nearest achiral structure is a mathematical construct which varies for each analyzed structure (for instance, for each Gly residue). Many correlations between the CSM or the CCM and chemical, physical, or biochemical properties that depend on symmetry or chirality have been successfully established;17−23 alternative approaches have been proposed by, e.g., Jamroz et al.24

Chiral Ramachandran Plots. The dihedral angles ϕ and ψ of the entire protein were retrieved using Jmol.25 Terminus Gly residues were discarded since their (ϕ, ψ) angles are not defined. Similarly, Gly residues adjacent to a sequence gap resulting from missing residues before or after Gly units were discarded. For generating the chiral Ramachandran plots, the minimal set of atoms which defines and contains the dihedral angles ϕ and ψ and which is common to all amino acid residues was selected: it is the Ramachandran subunit, Ci−1−Ni−Cα,i−Ci−Ni+1, for the ith residue (see the red atoms in Figure 1). For generating the chiral Ramachandran plots, S(Cs) of the Gly Ramachandran subunit was determined for each Gly, and its value is represented on a color scale. In the final portion of the paper we also comment on the effects of changing from the Ramachandran subunit to the smallest possible subunit which carries the conformational chirality of all amino acids, namely, the Ni−Cα,i−Ci−Oi skeleton for the ith residue (see the atoms in red in Figure 10). For assigning secondary structures of the protein within which the Gly resides, we used the Dictionary of Secondary Structures of Proteins (DSSP).26,27

Potential Energy Scan. Gas phase potential energy surface scan calculations were performed on the model molecule N-acetyl-glycine-methylamide used by Yuan et al.,28 which is very similar to the model molecule originally used by Ramachandran.16 For this purpose we used Gaussian0929 at the B3LYP30−33 level with the 6-31G(d,p) basis set. This method was proven adequate for similar molecules.28 The dihedral angles ϕ and ψ were used as redundant coordinates; that is, they were fixed while optimizing the other coordinates. The resolution of the scan was 4°, and a total of 8100 structures were calculated. Using these structures, we then generated the potential energy surface of the model molecule as a function of the Ramachandran angles and calculated the CCM value for every optimized geometry.
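The scan size quoted above can be checked with a short Python sketch. It assumes (our assumption, not stated in the text) that each angle runs from −180° to 176° in 4° steps, with the periodic endpoint +180° omitted; the variable names are illustrative only.

```python
STEP_DEG = 4  # scan resolution stated in the text

# Assumed angular range: -180 deg up to (but excluding) +180 deg, since the
# two ends are the same conformation by periodicity.
angles = list(range(-180, 180, STEP_DEG))

# Every (phi, psi) pair held fixed during a constrained optimization.
grid = [(phi, psi) for phi in angles for psi in angles]

points_per_axis = len(angles)
total_structures = len(grid)
```

With 360/4 = 90 values per angle, the grid holds 90 × 90 = 8100 points, matching the number of structures reported.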

Figure 3. Chiral Ramachandran plot of Gly (4366 glycine residues from 160 proteins, as in Figure 2). The color code for the degree of chirality (CCM) is on the right.

Figure 4. Polar plots of the chirality value (the CCM) of Gly residues in proteins. Each dot is represented by a dihedral angle (a, ϕ and b, ψ) and a radius (CCM). Colors represent the absolute values of the other dihedral angle.

RESULTS AND DISCUSSION

The Chiral Ramachandran Plot: General Observations. The chiral Gly-RP is shown in Figure 3. The chirality measure parameter adds a third (z-axis) dimensionality, and either the color or the height (see Figure S1 in the Supporting Information for a 3D representation) are indicative of the degree of chirality as represented by the continuous chirality measure (the CCM). It is clearly evident that, compared to the black-and-white classical RP (Figure 2), the structural information content of the chirality plot is much richer, and we analyze it in detail next and below. Generally, it is seen that the chirality values cluster in the following way: The highest chirality values of the Gly Ramachandran subunits cluster in approximately C2-symmetrically (2D-inversion) located islands around ϕ = (±90°, ±45°), ψ = (±60°, ±30°), while the lowest chirality values cluster at the four corners of the map.

The chiral RP retains (by definition) the original features of the regular RP plot, and we note that not only does the basic (ϕ, ψ) pairs map have that symmetry but the chirality map values are also nearly C2-symmetric, with only very few Gly residues along ϕ = (−30°, 30°). This symmetry, which is unique to the Gly-RP, is not seen in the other types of RP.3,4 It emerges for Gly because this amino acid lacks a β-carbon side-chain, allowing it the flexibility to adopt ϕ and ψ angles in all four quadrants.34 Furthermore, that conformational freedom allows having for every (ϕ, ψ) point, the reciprocal (−ϕ, −ψ) point (or at least a near-by reciprocal point), and this is the main source of the observed C2-symmetry. That richness of chiral conformers is because the Gly residues are not isolated species in a vacuum but are part of the polypeptide backbone, so that their immediate environment changes due to the immediate residue neighbors, to the local secondary structure, to the near in-space interactions, and so on. One therefore expects to see an approximate, rather than perfect, C2-symmetry. Indeed, the quadrant populations are as follows (Figures 2 and 3): Quadrant I (ϕ > 0, ψ > 0) has 1270 (≈29%) Gly residues; quadrant II (ϕ < 0, ψ > 0) contains 807 (18%) Gly residues; quadrant III (ϕ < 0, ψ < 0) involves 1206 (28%) Gly residues; and quadrant IV (ϕ > 0, ψ < 0) holds the remaining 1081 (25%) Gly residues. Interestingly, on the borders of the quadrants, one finds only two Gly residues that have a ψ = 0 value, but these are still chiral conformers due to their nonzero ϕ value (Figure 3); furthermore, there are no points with ϕ = 0. We return to this issue in detail below.

The reciprocal pairs (ϕ, ψ) and (−ϕ, −ψ) represent pairs of enantiomers, because they are non-superimposable mirror images of each other: Such two points in quadrants I and III comprise a pair of enantiomeric conformers with opposite handedness, and so are pairs of (−ϕ, ψ) and (ϕ, −ψ) in quadrants II and IV. Likewise, the relations between points of the same absolute values of the dihedral angles in quadrants I and II, I and IV, II and III, as well as III and IV are all diastereomeric. Taking into account that all of these possible stereochemical pairs reside in the same chiral environment of the L-enantiomers of the amino acid in a protein, the expectation is that the interactions of the stereoisomers with the L-handed environment will be different in the four quadrants, and so this is an additional aspect of the differences in the population of the quadrants. A special feature of the chirality map is that exact (ϕ, ψ) and (−ϕ, −ψ) enantiomer pairs and (−ϕ, ψ) and (ϕ, −ψ) enantiomer pairs need not be common (depending on the defined cutoff accuracy of the similarity between the absolute values). Finally, note that, in

principle, the chiral RP contains only five exact points which are achiral: The center point (not occupied in Figure 3) and the four corners, where all the atoms of the C−N−Cα−C−N subunit are coplanar. The rarity of these conformers emphasizes the importance of the chirality analysis of the RP: practically all of the (ϕ, ψ) plane represents chiral conformers.

Polar plots provide another interesting way to represent the chirality of Gly residues within the proteins: in Figure 4a we present the CCM of the Gly residues (same database) as a function of ϕ with a color code for the angle ψ, and similarly for the CCM as a function of ψ with a color code for the angle ϕ (Figure 4b). Note that the approximate C2 symmetry of the Cartesian RP (Figure 3) becomes a reflection symmetry in the polar plot and that CCM(ϕ) ≈ CCM(−ϕ) for any ψ. One can also see that the sector defined by the range −30° ≤ ϕ ≤ 30° is only sparsely populated, as discussed above. It is interesting to note that when 135° ≤ ψ ≤ 180° or −180° ≤ ψ ≤ −135° (red dots on the plot), ϕ has a large range of values, while for −45° ≤ ψ ≤ 45° (cyan dots) ϕ is limited between 30° ≤ ϕ ≤ 90° or −90° ≤ ϕ ≤ −30° with higher chirality values: the most pronounced chirality is obtained when both angles are around ±60° (light steel blue dots). From the point of view of the secondary structures, these are the regions of the right and left α-helix structures (see also below). The lack of exactly achiral Gly residues is, again, striking, with only a few points near the origin of each plot.

Role of Conformational Energy. In order to provide interpretations of these observed features, we resort now to a classical argument in chemistry, namely, that conformer populations are generally dictated by energy considerations, where preferred conformers are usually the lower-energy geometries. What then is the role of conformational energy in dictating the specific features of the Gly-RP and of the chiral RP?

Figure 5. Model molecule, N-acetyl-glycine-methylamide. Ramachandran angles.

Figure 6. Contour plot of energy as a function of the two Ramachandran dihedral angles, for the model molecule presented in Figure 5. Color bar on the right shows the relative energy scale in kcal/mol.

For an estimation of the potential role of conformational energy in dictating the details of the RP, a model molecule, N-acetyl-glycine-methylamide, Figure 5, was taken. This molecule mimics a Gly residue within a polypeptide chain. Similar models have been used for that purpose in previous studies.3,28,35 The energy of the model molecule was calculated as a function of ϕ and ψ in the gas phase (see the Methods section for details). Figure 6 presents the resulting potential energy surface as a contour map in which the relative energy is described by a color scale. Figure 7 shows an overlay of the RP in Figure 3 on the energy contour (Figure 6). Evidently, energy considerations can indeed explain some of the main features of the Gly-RP: The sparsely populated strip in the range ϕ = (−30°, 30°) is a zone of relatively high energy, since the conformations in this strip bring the two oxygen atoms to highly overcrowded proximity while typical conformers of the low energy regions are free of such steric hindrance, and in some cases are also stabilized by hydrogen bonds. Typical conformers representing the various zones of the energy RP are shown in Figure S2 of the Supporting Information. On the other hand, the highly populated four corners of the RP as well as the strips around ϕ = ±175° (for most values of ψ) are characterized by lower energy conformers. Unexpectedly, the sparsely populated strips around (ϕ = ±60°, ψ ≈ ±90°) are characterized by low energy conformers. Lovell et al.4 addressed this issue and proposed that crowding of two NH neighboring groups at these angles permits hydrogen bonding to only a single NH acceptor.
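The classical energy argument invoked above can be made concrete with a small Python sketch of Boltzmann weighting. This is only an illustration of the textbook relation between a relative energy (in kcal/mol, the unit of Figure 6) and an equilibrium population ratio; the temperature of 298.15 K and the function name are our assumptions, and the paper does not claim that Gly populations in folded proteins follow this idealized gas-phase picture.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def boltzmann_ratio(delta_e_kcal, temperature_k=298.15):
    """Population of a conformer relative to a reference that lies
    delta_e_kcal (kcal/mol) lower in energy, for an ideal equilibrium."""
    return math.exp(-delta_e_kcal / (R_KCAL * temperature_k))

# A conformer 1 kcal/mol above the reference is already several times
# less populated; a few kcal/mol makes it rare.
ratio_1 = boltzmann_ratio(1.0)
ratio_3 = boltzmann_ratio(3.0)
```

In this idealized picture a high-energy strip such as −30° ≤ ϕ ≤ 30° would be expected to be sparsely populated, consistent with the observation above, whereas sparsely populated low-energy zones call for an additional explanation such as the one proposed by Lovell et al.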


Figure 7. Overlay of the chiral RP (Figure 3) on the energy map of Figure 6. CCM values are in color. The energy scale is presented in gray scale.

Figure 8. Secondary-structures Gly Ramachandran plot. Colors indicate secondary structures: T = Hydrogen bonded turn (black); C = loops and irregular elements (red); S = bend (blue); H = α-helix (magenta); E = extended strand in β-sheet (green); O = others (orange).

Concentrating on the chirality values presented in Figure 7, it is seen that, by and large, they comprise a structural characteristic which is only loosely dependent on energetics. Thus, along the low energy strips around ϕ = ±75°, practically all of the color codes for the chirality values can be found. This observation indicates that local Gly conformation is dictated by parameters other than solely energy, including the secondary structure (discussed next), the immediate neighboring residues, the local folding geometry, the ability to form near-by hydrogen bonds, and even the function of the protein as a whole. All of these can force not only a rich range of possible Gly conformers where energetics is not very prohibitive, but even some Gly residues which reside in the high-energy strip in the range (−30° ≤ ϕ ≤ 30°). A polar version of Figure 7 is provided as Supporting Information (Figure S3).

Secondary Structures and Chirality. Next we investigated in what way the specific secondary structure where a Gly residue resides affects its chirality value. In order to explore these effects, we first produced the Gly secondary-structures RP, based on our data source of 4366 Gly residues (see Methods section). The resulting map is shown in Figure 8 (see also Figure S4 of the Supporting Information). As expected, the various secondary structures tend to cluster in specific zones of the map; let us investigate this secondary structures map. Above we discussed the nearly C2-symmetric nature of the Gly chiral RP; it is now seen, especially in the two central clusters of dots, that the color symmetry is lost; that is, despite the fact that the symmetry of the (ϕ, ψ) map occupation is retained, the secondary structures behind these dots are quite different. For instance, there is no massive occupation of Gly in enantiomeric left-handed helices in the symmetric position to the right-handed α-helix zone, and that cluster in the positive (ϕ, ψ) zone is mainly composed of bend structures and hydrogen-bonded turns. The fact that the symmetry of the classical RP is maintained without restrictions of secondary structures is the strongest indication of the high flexibility and secondary structure adaptation of the Gly residue.

Next we analyze the CCM values associated with the secondary structures map. Figure 9a presents a box and whisker diagram of the statistical distribution of the CCM values within the various secondary structures. It is seen that the Gly residues belonging to α-helical structures are distinctly of higher chirality values, whereas the Gly residues in extended strands in the β-ladders reach lower chirality values. For a deeper look into these trends we present in Figure 9b the CCM distribution within each secondary structure. As is evident, Gly in hydrogen bond turns (“T”, black), loops and irregular elements (“C”, red) and bend structures (“S”, blue) have large conformational freedom, and therefore their CCM values span the whole range of chirality levels from zero to ca. 7. On the other hand, Gly in α-helices (“H”, magenta) and β-sheets (“E”, green) have less conformational freedom and are thus confined to more specific ranges of chirality levels, with the α-helices being, generally, more chiral than the β-sheets. β-Sheets are composed of relatively flat β-ladder segments, connected laterally to one another by two or three hydrogen bonds, and in order for that to be possible, the Gly conformation must be relatively planar. Planarity of the C−N−Cα−C−N subunit is obtained when both of the dihedral angles are near ±180° as can be seen at the green points zone in Figure 8. Gly residues are clustered in the known α-helix region of the RP, for which (ϕ, ψ) ∈ (−90°, −30°), and as seen from the chirality values, that secondary structure has a strong conformational forcing effect on Gly residing in it: Relatively high chirality values appear there; that is, the energy cost of Gly distortion is secondary to the right-handed helical demand of that secondary structure.

The “N−Cα−C−O” Subunit. As the main focus of this report is the RP, the analyzed subunit in all of the above has been, accordingly, the Ramachandran backbone subunit C−N−Cα−C−N (Figure 1). The methodology developed here is, however, general and other relevant subunits of interest can be selected and analyzed. We focus now on one such key subunit which is common to all amino acids, namely, the smallest possible subunit which defines the conformational chirality of any amino acid, that is, the N−Cα−C−O subunit (shown in Figure 10). Exact coplanarity of the four atoms within the protein is rare, and the nonplanarity is characterized by the dihedral angle, θ, around the Cα−C bond, which, due to the planarity of the amide group (N−CO) directly relates to ψ as |ψ − θ| ≈ 180°. For simplicity and consistency of the text we shall therefore continue to use the angle ψ in what follows. The angle ϕ is not inherent to the N−Cα−C−O subunit, and indeed the chirality RP of this subunit presented in Figure 11 shows that for a given ψ, the CCM varies only slightly with ϕ, leading to the stripes pattern of that map. A major difference between the C−N−Cα−C−N subunit chiral RP (see Figure 3)


and the N−Cα−C−O subunit map (see Figure 11) is apparent around ψ ≈ 0: for the N−Cα−C−O subunit these values mean near planarity and therefore low degrees of chirality. The map in Figure 11 carries another conclusion: In the Gly RP, the (ϕ, ψ) angles are quite independent of each other and do not impose a restriction: for any given ϕ one can find dots representing the full scale of CCM values. Because of the independence of the N−Cα−C−O subunit on ϕ, it is of relevance to observe also the direct dependence of the CCM on ψ: A beautiful, well-defined sinusoidal 2D map is obtained (see Figure 12). That sinusoidal shape reflects the full rotations around the Cα−C bond. Note that the maximal CCM value is not at 90° (as would have been if the substituents on the rotating C−C bond would have been the same, e.g., as in ethane), but shifted to ca. 70°; this reflects the asymmetry of the subunit around the Cα−C bond. As is the case with the C−N−Cα−C−N subunit, the most dictating secondary structures are the α-helices (magenta, see Figure 13a) and the β-sheets (green, see Figure 13b). The explanation of the clustering of the dots shown in these two figures is similar to the explanation provided in the case of C−N−Cα−C−N subunit maps.

Chirality of Gly and Its Handedness. Standard teaching in elementary biochemistry books is that Gly is an achiral amino acid, in fact the only one of the protein-building amino acids.36−38 It is clear from all of the above that this label of Gly misses its rich stereochemistry and its role in the various secondary structures. Gly is always conformationally chiral to some degree. The question is then, is the chirality of Gly comparable to the chirality of the other amino acids? We shall answer the question in the framework of the N−Cα−C−O subunit, which is common to all amino acids. Gly is special as it does not have a side chain and is therefore much more flexible. As is evident from Figure 14 which represents the distribution of CCM values in our set of 160 proteins, the range of CCM levels (zero to ca. 5.0) of Gly is similar to that of all other amino acids. The cumulative frequency curve of the CCM distribution of Gly is somewhat steeper than that of all the other amino acids, but this is to be expected given the fact that Gly has higher flexibility to be planar. Evidently, the chirality of the N−Cα−C−O subunit is related to the complexity of the 3D structure of the whole protein, regardless of the specific side chain structure of each amino acid.

Figure 9. (a) Box and whisker plot of CCM values of Gly residues in the different secondary structures. The boxes are between the 25th percentile and the 75th percentile; the mean and median values are represented by the diamond shape and by the horizontal line inside the boxes; the minimum and maximum values are marked with up and down triangles at the bottom and top of each box; the whiskers represent the 5th to 95th percentile. See Figure 8 for secondary structure abbreviations. The total number of Gly residues in each group is presented above each box. (b) Data distribution of the boxes in a.

Figure 10. The N−Cα−C−O subunit (marked red) of Gly within the protein, and its dihedral angle θ (around the Cα−C bond, marked blue; |ψ − θ| ≈ 180°).

Figure 11. Chiral N−Cα−C−O subunit RP of Gly (4366 glycine residues from 160 proteins, as in Figure 2).

Figure 12. Chirality measure of the N−Cα−C−O subunit as a function of the angle ψ and of the secondary structure within which it resides. (See Figure 8 for the secondary structures code.)
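The dihedral angles used throughout (ϕ, ψ, and the angle θ of the N−Cα−C−O subunit) are all signed torsion angles defined by four atoms. The following Python sketch shows one standard way to compute such an angle from Cartesian coordinates; it is not the authors' code (they retrieved ϕ and ψ with Jmol), the function name is hypothetical, and the example coordinates are made up for illustration.

```python
import math

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, in (-180, 180]) about the p1-p2 bond."""
    def sub(a, b):
        return tuple(a[i] - b[i] for i in range(3))

    def dot(a, b):
        return sum(a[i] * b[i] for i in range(3))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    norm = math.sqrt(dot(b1, b1))
    b1 = tuple(c / norm for c in b1)                 # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond.
    v = tuple(b0[i] - dot(b0, b1) * b1[i] for i in range(3))
    w = tuple(b2[i] - dot(b2, b1) * b1[i] for i in range(3))
    x = dot(v, w)
    y = dot(cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates: same first three atoms, different fourth atom.
a, b, c = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
cis = dihedral_deg(a, b, c, (1.0, 0.0, 1.0))      # eclipsed, planar
trans = dihedral_deg(a, b, c, (-1.0, 0.0, 1.0))   # anti, planar
plus90 = dihedral_deg(a, b, c, (0.0, 1.0, 1.0))
minus90 = dihedral_deg(a, b, c, (0.0, -1.0, 1.0))
```

With the atom quadruples Ci−1, Ni, Cα,i, Ci and Ni, Cα,i, Ci, Ni+1 this gives ϕ and ψ, respectively, and with Ni, Cα,i, Ci, Oi it gives θ. Mirror-image arrangements give angles of equal magnitude and opposite sign, which is what the handedness discussion relies on.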


Figure 13. Chirality measures of the N−Cα−C−O subunit as a function of the angle ψ in (a) α-helices; (b) β-sheets. Data are extracted from Figure 12.

Figure 16. Handedness relative frequencies among the Gly residues in the different secondary structures (see Figure 8 for abbreviations code). The data set includes 2077 left-handed M-Gly residues (black) and 2287 right-handed P-Gly residues (red). Two Gly residues with ψ = 0 were excluded from this plot. Percentages refer to the weight of each secondary structure within the Gly residues of specific handedness.

Figure 14. Relative (left scale) and cumulative (right scale) frequencies of CCM values of the N−Cα−C−O subunit for Gly (red) and for all other amino acids (black) within our set of 160 proteins. Bin size was set to 0.1.
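The relative and cumulative frequency curves of Figure 14 can be obtained from a list of CCM values with a few lines of NumPy. The sketch below is an illustration only, under stated assumptions: the function name `ccm_frequencies`, the upper limit `ccm_max=5.0` (taken from the approximate CCM range quoted in the text), and the short example input are hypothetical and are not the authors' code or data; only the bin size of 0.1 comes from the caption.

```python
import numpy as np


def ccm_frequencies(ccm_values, bin_size=0.1, ccm_max=5.0):
    """Return (relative, cumulative, edges) for a set of CCM values.

    Assumption: values lie in [0, ccm_max]; anything outside that range
    is ignored by np.histogram.
    """
    values = np.asarray(ccm_values, dtype=float)
    # Build the bin edges with linspace to avoid float drift from arange.
    n_bins = int(round(ccm_max / bin_size))
    edges = np.linspace(0.0, ccm_max, n_bins + 1)
    counts, _ = np.histogram(values, bins=edges)
    total = counts.sum()
    if total == 0:
        raise ValueError("no CCM values fall inside the histogram range")
    relative = counts / total          # left scale in Figure 14
    cumulative = np.cumsum(relative)   # right scale in Figure 14
    return relative, cumulative, edges


if __name__ == "__main__":
    # Placeholder numbers for illustration, not measured CCM values.
    rel, cum, edges = ccm_frequencies([0.05, 0.05, 0.15, 4.95])
    print(rel[:2], cum[-1])
```

Running the function separately on the Gly values and on the values of all other amino acids gives two cumulative curves whose steepness can then be compared directly.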

Given that Gly is chiral, the next question is its handedness: what is a left-handed Gly and what is a right-handed one? Obviously, the L,D labeling method is irrelevant for this question, and so are the Cahn−Ingold−Prelog (CIP) R,S rules for stereogenic centers.39 Instead, we use the key chiral feature of the N−Cα−C−O subunit, namely, the dihedral angle around the Cα−C bond. Handedness assignment for such chiral species can be made by the P,M label assignment rule:40 P denotes a right-handed helix and M its left-handed enantiomer (Figure 15). Thus, positive θ values (or negative ψ values; recall that |ψ − θ| ≈ 180°) define a right-handed Gly (“Right” = P), while negative θ values (or positive ψ values) define the left-handed enantiomer (“Left” = M). In terms of the chiral RP, the plot is divided along the ψ = 0 line into right-handed Gly residues (negative ψ) and left-handed ones (positive ψ). Figure 16

shows how the two enantiomers are distributed among the secondary structures. It is seen that the α-helix imposes enantio-purity of Gly, which is dictated by the right-handed helical nature of that secondary structure; Gly is right-handed within this secondary structure. Also seen is a preference for the M enantiomer when the Gly is part of a turn or a bend; we assume that this reflects the fact that the bending takes place in an environment which is enantio-pure, as dictated by the other amino acids. Finally, we comment that in the case of the other amino acids, the L,D labels and the M,P labels may or may not coincide, because the latter refer to the conformer within the protein backbone.
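The P,M assignment described above reduces to the sign of a dihedral angle, so it can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function names `dihedral_deg` and `handedness_from_psi`, the tolerance parameter `tol`, and the toy coordinates are all assumptions introduced here. The dihedral is computed with the usual sign convention (positive for a clockwise rotation when looking along the central bond), and ψ is taken as the N−Cα−C−N(next residue) dihedral.

```python
import numpy as np


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points p0-p1-p2-p3."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # Components of b0 and b2 perpendicular to the central bond b1.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))


def handedness_from_psi(psi, tol=0.0):
    """Return 'P' (right-handed) for negative psi, 'M' (left-handed) for
    positive psi, and None when |psi| <= tol (such residues were excluded
    from Figure 16)."""
    if abs(psi) <= tol:
        return None
    return "P" if psi < 0 else "M"


if __name__ == "__main__":
    # Toy coordinates standing in for N, CA, C and the next residue's N.
    n, ca, c, n_next = (1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)
    psi = dihedral_deg(n, ca, c, n_next)
    print(psi, handedness_from_psi(psi))
```

The same `dihedral_deg` function applied to N, Cα, C, and O would give θ, whose sign is opposite to that of ψ because |ψ − θ| ≈ 180°.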



CONCLUSIONS

Glycine, the smallest and most flexible amino acid, is generally chiral within the protein structure. Its chirality is conformational but stable in the sense that once a protein structure is fixed, so is the specific chiral conformer of Gly at each of its locations: the 3D folded protein with its internal chiral structure acts as a set of cages which preserve the local conformer. That 3D all-chiral environment makes the probability of finding a perfectly achiral conformation extremely low. In this paper we quantified the level of chirality of Gly residues taken from 160 proteins and showed that adding the chirality level to the Ramachandran plot reveals a wealth of information on their structure within proteins. Let us summarize the main findings: The Ramachandran plot of Gly is

Figure 15. A pair of M (left, ψ > 0, θ < 0) and P (right, ψ < 0, θ > 0) N−Cα−C−O subunit enantiomers, common to all amino acids, including Gly, within a protein.


(4) Lovell, S. C., Davis, I. W., Arendall, W. B., de Bakker, P. I. W., Word, J. M., Prisant, M. G., Richardson, J. S., and Richardson, D. C. (2003) Structure validation by C alpha geometry: phi,psi and C beta deviation. Proteins: Struct., Funct., Genet. 50, 437−450.
(5) Zabrodsky, H., Peleg, S., and Avnir, D. (1992) Continuous symmetry measures. J. Am. Chem. Soc. 114, 7843−7851.
(6) Pinsky, M., Dryzun, C., Casanova, D., Alemany, P., and Avnir, D. (2008) Analytical methods for calculating continuous symmetry measures and the chirality measure. J. Comput. Chem. 29, 2712−2721.
(7) Alon, G., and Tuvi-Arad, I. (2017) Improved algorithms for symmetry analysis: Structure preserving permutations. J. Math. Chem., DOI: 10.1007/s10910-017-0788-y.
(8) Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N., and Bourne, P. E. (2000) The protein data bank. Nucleic Acids Res. 28, 235−242.
(9) Berman, H., Henrick, K., and Nakamura, H. (2003) Announcing the worldwide protein data bank. Nat. Struct. Biol. 10, 980−980.
(10) Carugo, O., and Eisenhaber, F., Eds. (2016) Data Mining Techniques for the Life Sciences, 2nd ed., Springer, New York.
(11) Lamb, A. L., Kappock, T. J., and Silvaggi, N. R. (2015) You are lost without a map: Navigating the sea of protein structures. Biochim. Biophys. Acta, Proteins Proteomics 1854, 258−268.
(12) Martz, E. FirstGlance in Jmol, Version 2.51, http://bioinformatics.org/firstglance/fgij/.
(13) Trueblood, K. N., Burgi, H. B., Burzlaff, H., Dunitz, J. D., Gramaccioli, C. M., Schulz, H. H., Shmueli, U., and Abrahams, S. C. (1996) Atomic displacement parameter nomenclature - Report of a subcommittee on atomic displacement parameter nomenclature. Acta Crystallogr., Sect. A: Found. Crystallogr. 52, 770−781.
(14) Read, R. J., Adams, P. D., Arendall, W. B., Brunger, A. T., Emsley, P., Joosten, R. P., Kleywegt, G. J., Krissinel, E. B., Lutteke, T., Otwinowski, Z., Perrakis, A., Richardson, J. S., Sheffler, W. H., Smith, J. L., Tickle, I. J., Vriend, G., and Zwart, P. H. (2011) A new generation of crystallographic validation tools for the protein data bank. Structure 19, 1395−1412.
(15) Kleywegt, G. J., and Jones, T. A. (1997) Model building and refinement practice, In Macromolecular Crystallography, Pt B (Carter, C. W., and Sweet, R. M., Eds.), pp 208−230.
(16) Zabrodsky, H., and Avnir, D. (1995) Continuous symmetry measures 4. Chirality. J. Am. Chem. Soc. 117, 462−473.
(17) Bonjack-Shterengartz, M., and Avnir, D. (2015) The near-symmetry of proteins. Proteins: Struct., Funct., Genet. 83, 722−734.
(18) Keinan, S., and Avnir, D. (2000) Quantitative symmetry in structure-activity correlations: The near C-2 symmetry of inhibitor/HIV protease complexes. J. Am. Chem. Soc. 122, 4378−4384.
(19) Yogev-Einot, D., and Avnir, D. (2006) The temperature-dependent optical activity of quartz: from Le Chatelier to chirality measures. Tetrahedron: Asymmetry 17, 2723−2725.
(20) Casanova, D., Cirera, J., Llunell, M., Alemany, P., Avnir, D., and Alvarez, S. (2004) Minimal distortion pathways in polyhedral rearrangements. J. Am. Chem. Soc. 126, 1755−1763.
(21) Alvarez, S., Alemany, P., Casanova, D., Cirera, J., Llunell, M., and Avnir, D. (2005) Shape maps and polyhedral interconversion paths in transition metal chemistry. Coord. Chem. Rev. 249, 1693−1708.
(22) Tuvi-Arad, I., and Avnir, D. (2012) Symmetry-enthalpy correlations in Diels-Alder reactions. Chem. - Eur. J. 18, 10014−10020.
(23) Tuvi-Arad, I., and Stirling, A. (2016) The distortive nature of temperature - A symmetry analysis. Isr. J. Chem. 56, 1067−1075.
(24) Jamroz, M. H., Rode, J. E., Ostrowski, S., Lipinski, P. F. J., and Dobrowolski, J. C. (2012) Chirality measures of alpha-amino acids. J. Chem. Inf. Model. 52, 1462−1479.
(25) Jmol: an open-source Java viewer for chemical structures in 3D, http://www.jmol.org.
(26) Kabsch, W., and Sander, C. (1983) Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577−2637.
(27) Touw, W. G., Baakman, C., Black, J., te Beek, T. A. H., Krieger, E., Joosten, R. P., and Vriend, G. (2015) A series of PDB-related databanks for everyday needs. Nucleic Acids Res. 43, D364−D368.

2D-chiral with only approximate inversion symmetry; the distribution of Gly conformers among the four quadrants of the plot is unequal: the highest population of Gly residues occupies the third quadrant, and the lowest density of residues occupies the second quadrant; the highest chirality level is obtained when Gly is part of an α-helix secondary structure, where ϕ is around ±60° and ψ is around ±45°, while the lowest chirality levels are located in β-sheet structures. Being chiral, Gly can be assigned handedness using the P,M convention of helicity. Using this assignment we showed that Gly is mostly right-handed (P) when confined in α-helix structures. The level of chirality of Gly is comparable to that of all other amino acids when looking at the same common subunit. Finally, we found that energetic restrictions that are important in determining the structure of the whole protein have a limited effect on the Gly chirality. As our studies demonstrate, combining the original Ramachandran plot with the CCM analysis is a powerful new tool for the structural analysis of proteins that adds valuable structural information and new insights into the fine details of protein structures. Applications of the method to other amino acids are in progress.



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.biochem.7b00525. Complete list of proteins used in this study and additional figures: A 3D representation of the Gly chiral Ramachandran plot (Figure S1), energy map of the model molecule with typical conformers (Figure S2), polar representations of chiral Ramachandran plots for Gly (Figures S3 and S4) (PDF)



AUTHOR INFORMATION

Corresponding Authors

*(I.T.-A.) E-mail: [email protected].
*(D.A.) E-mail: [email protected].

ORCID

Inbal Tuvi-Arad: 0000-0003-0418-9915

Funding

Supported by the Israel Science Foundation (Grant No. 411/15) and the Open University of Israel’s Research Fund (Grant No. 504801).

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

We are grateful to Sagiv Barhoom (The Open University of Israel) and to Itay Zandbank and Devora Witty (The Scientific Software Company, Israel) for their crucial help in programming.



REFERENCES

(1) Ramachandran, G. N., Ramakrishnan, C., and Sasisekharan, V. (1963) Stereochemistry of polypeptide chain configurations. J. Mol. Biol. 7, 95−99.
(2) Ho, B. K., and Brasseur, R. (2005) The Ramachandran plots of glycine and pre-proline. BMC Struct. Biol. 5, 14.
(3) Carugo, O., and Djinović-Carugo, K. D. (2013) Half a century of Ramachandran plots. Acta Crystallogr., Sect. D: Biol. Crystallogr. 69, 1333−1341.


(28) Yuan, Y. N., Mills, M. J. L., Popelier, P. L. A., and Jensen, F. (2014) Comprehensive analysis of energy minima of the 20 natural amino acids. J. Phys. Chem. A 118, 7876−7891.
(29) Frisch, M. J., Trucks, G. W., Schlegel, H. B., Scuseria, G. E., Robb, M. A., Cheeseman, J. R., Scalmani, G., Barone, V., Mennucci, B., Petersson, G. A., Nakatsuji, H., Caricato, M., Li, X., Hratchian, H. P., Izmaylov, A. F., Bloino, J., Zheng, G., Sonnenberg, J. L., Hada, M., Ehara, M., Toyota, K., Fukuda, R., Hasegawa, J., Ishida, M., Nakajima, T., Honda, Y., Kitao, O., Nakai, H., Vreven, T., Montgomery, J. A., Jr., Peralta, J. E., Ogliaro, F., Bearpark, M., Heyd, J. J., Brothers, E., Kudin, K. N., Staroverov, V. N., Kobayashi, R., Normand, J., Raghavachari, K., Rendell, A., Burant, J. C., Iyengar, S. S., Tomasi, J., Cossi, M., Rega, N., Millam, N. J., Klene, M., Knox, J. E., Cross, J. B., Bakken, V., Adamo, C., Jaramillo, J., Gomperts, R., Stratmann, R. E., Yazyev, O., Austin, A. J., Cammi, R., Pomelli, C., Ochterski, J. W., Martin, R. L., Morokuma, K., Zakrzewski, V. G., Voth, G. A., Salvador, P., Dannenberg, J. J., Dapprich, S., Daniels, A. D., Farkas, Ö., Foresman, J. B., Ortiz, J. V., Cioslowski, J., and Fox, D. J. (2009) Gaussian 09, Revision A.1.
(30) Becke, A. D. (1993) Density-functional thermochemistry III. The role of exact exchange. J. Chem. Phys. 98, 5648.
(31) Lee, C. T., Yang, W. T., and Parr, R. G. (1988) Development of the Colle-Salvetti correlation-energy formula into a functional of the electron-density. Phys. Rev. B: Condens. Matter Mater. Phys. 37, 785−789.
(32) Stephens, P. J., Devlin, F. J., Chabalowski, C. F., and Frisch, M. J. (1994) Ab-initio calculation of vibrational absorption and circular dichroism spectra using density-functional force-fields. J. Phys. Chem. 98, 11623−11627.
(33) Vosko, S. H., Wilk, L., and Nusair, M. (1980) Accurate spin-dependent electron liquid correlation energies for local spin-density calculations - A critical analysis. Can. J. Phys. 58, 1200−1211.
(34) de Groot, B. L., van Aalten, D. M. F., Scheek, R. M., Amadei, A., Vriend, G., and Berendsen, H. J. C. (1997) Prediction of protein conformational freedom from distance constraints. Proteins: Struct., Funct., Genet. 29, 240−251.
(35) Moehle, K., and Hofmann, H. J. (1996) Peptides and peptoids - A quantum chemical structure comparison. Biopolymers 38, 781−790.
(36) Gilmour, I., and Sephton, M. A. (2004) An Introduction to Astrobiology, Cambridge University Press, Cambridge.
(37) Voet, D., and Voet, J. G. (2011) Biochemistry, 4th ed., Wiley, New York.
(38) Cintas, P., Ed. (2013) Biochirality: Origins, Evolution and Molecular Recognition, Springer-Verlag, Berlin Heidelberg.
(39) Nic, M., Jirat, J., and Kosata, B. (2006) Cahn-Ingold-Prelog system, In IUPAC. Compendium of Chemical Terminology (the “Gold Book”) (Jenkins, A., Ed.), XML on-line corrected version, https://doi.org/10.1351/goldbook.C00772.
(40) Nic, M., Jirat, J., and Kosata, B. (2006) Helicity, In IUPAC. Compendium of Chemical Terminology (the “Gold Book”) (Jenkins, A., Ed.), XML on-line corrected version, https://doi.org/10.1351/goldbook.H02763.
