Strength of Cα−H···O=C Hydrogen Bonds in Transmembrane Proteins

J. Phys. Chem. B 2008, 112, 1041-1048


Hahnbeom Park,‡ Jungki Yoon,‡ and Chaok Seok*

Department of Chemistry, College of Natural Sciences, Seoul National University, Seoul 151-747, Republic of Korea

Received: September 11, 2007; In Final Form: October 28, 2007

A large number of Cα−H···O contacts are present in transmembrane protein structures, but the contribution of such interactions to protein stability is still not well understood. According to previous ab initio quantum calculations, the stabilization energy of a Cα−H···O contact is about 2-3 kcal/mol. However, experimental studies on two different Cα−H···O hydrogen bonds present in transmembrane proteins led to the conclusions that one contact is only weakly stabilizing and the other is not stabilizing at all. We note that most previous computational studies were on optimized geometries of isolated molecules, whereas the experimental measurements were on contacts in the structural context of transmembrane proteins. In the present study, 263 Cα−H···O=C contacts in α-helical transmembrane proteins were extracted from X-ray crystal structures, and interaction energies were calculated with quantum mechanical methods. The average stabilization energy of a Cα−H···O=C interaction was computed to be 1.4 kcal/mol. About 13% of contacts were stabilizing by more than 3 kcal/mol, and about 11% were destabilizing. Analysis of the relationships between energy and structure revealed four interaction patterns: three types of attractive cases in which an additional Cα−H···O or N−H···O contact is present and one type of repulsive case in which repulsion between two carbonyl oxygen atoms occurs. The contribution of Cα−H···O=C contacts to protein stability is roughly estimated to be greater than 5 kcal/mol per helix pair for about 16% of transmembrane helices but for only 3% of soluble protein helices. The contribution would be larger if Cα−H···O contacts involving side chain oxygen were also considered.

Introduction

Nonconventional, weak CH···O hydrogen bonds are ubiquitous in proteins, protein-ligand complexes, protein-protein complexes, nucleic acids, etc.1-13 In particular, transmembrane protein structures contain a large number of Cα−H···O contacts between transmembrane helices,14 which is presumably due to the short distance between helices.15,16 Transmembrane proteins have a relatively large fraction of small amino acids such as glycine, alanine, and serine, and the interhelical distances are thus shorter than in soluble proteins. The sequence pattern of GXXXG was found to facilitate formation of Cα−H···O hydrogen bonds and stabilize proteins.17 Such Cα−H···O bonds were argued to be important in determining the stability and specificity of transmembrane proteins.15

Two experiments were performed to measure the strength of Cα−H···O hydrogen bonds, but the results were conflicting: an FTIR study on the G79-Cα−H···I76-O interaction in glycophorin A gave a weak stabilization energy of 0.88 kcal/mol,18 but a mutagenesis study on the A51-Cα−H···T24-Oγ interaction in bacteriorhodopsin showed that the interaction is not stabilizing.19 A molecular mechanics study explained this conflict in terms of the interaction of the amide N atom of the hydrogen-donating residue with the hydrogen-accepting O atom:20 in A51-Cα−H···T24-Oγ of bacteriorhodopsin, Oγ is located closer to the amide N than to Cα, causing repulsion, whereas in G79-Cα−H···I76-O of glycophorin A, no such repulsion is present. This indicates that the structural context in which a Cα−H···O hydrogen bond lies is critical in determining the contribution of such hydrogen bonds to the stability of a protein.

* To whom correspondence should be addressed. Phone: 82-2880-9197. Fax: 82-2-871-8119. E-mail: [email protected]. ‡ These authors contributed equally.

The strength of CH···O hydrogen bonds has also been an intriguing subject of computational studies.21-28 Previous ab initio quantum mechanical calculations on CH···O hydrogen bonds estimated the strength to be 2-3 kcal/mol, about half as strong as the conventional OH···O hydrogen bonds.22,23 Properties of the CH···O interaction were studied with FnH3−nCH as proton donor and H2O, CH3OH, and H2CO as acceptor, showing that the CH···O hydrogen bond is strengthened by 1 kcal/mol with each F added to the donor.21 The strength of the Cα−H···O hydrogen bond of an amino acid with water, in which the donor has two electronegative groups, was estimated to lie between 1.9 and 2.5 kcal/mol.22 Another study focused on a similar interaction occurring in the N,N-dimethylformamide dimer, and the Cα−H···O hydrogen bond energy was estimated to be 3 kcal/mol.23

Studies on such small model systems enable calculation of the contribution of weak hydrogen bonds alone, without interference from other stronger interactions. However, the effects of surrounding atoms in the more realistic structural context of the protein environment, which naturally accompany weak hydrogen bonds, are ignored at the same time. For example, hydrogen-bonding geometries were optimized without constraints from the surroundings in such studies, although those geometries in proteins deviate from ideal, depending on the surrounding structure. Therefore, such calculations provide only a limiting value of the interaction energy.

In the present study, the Cα−H···O=C hydrogen bond energy is calculated in the context of α-helical transmembrane proteins with ab initio and density functional theory quantum mechanical calculations. The geometries of 263 Cα−H···O=C contacts were carved out of X-ray crystal structures of α-helical transmembrane proteins. Each Cα−H···O=C interaction was reduced to

10.1021/jp077285n CCC: $40.75 © 2008 American Chemical Society Published on Web 12/22/2007


Figure 1. The molecular system considered in quantum calculations. The constrained geometrical variables are indicated with arrows: two backbone dihedral angles (φ and ψ angles) of the donor and the acceptor; CCαHO, CαHOC, and HOCC dihedral angles; CαHO and HOC angles; and OH distance.

an interaction of glycine dipeptide and N-methylacetamide. In this model system, three amide bonds and three α-carbon atoms were retained to account for the minimal backbone structure. The internal and relative geometries of the two segments were restrained to those in the crystal structure. From the relationships of energy and geometry, three types of attractive interaction patterns and one type of repulsive interaction pattern were identified. With these data on a large number of Cα−H···O=C contacts in realistic environments, more general conclusions can be drawn on the stability of such contacts in transmembrane proteins.

Materials and Methods

Structure Database of Transmembrane Proteins. Among the α-helical transmembrane proteins deposited in the PDBTM database,29 X-ray crystal structures with resolution no greater than 3 Å were extracted, and chain redundancies were removed such that the sequence identity is less than 30% for each pair of chains in the database. Only the transmembrane regions predicted by PDBTM were considered for analysis. The number of unique transmembrane proteins in this nonredundant transmembrane chain set is 41, and their PDB codes and chain IDs are provided in Table 1 of the Supporting Information. Interacting pairs of helices were extracted from the nonredundant chain set, following the definition of interacting helix pairs in ref 30: three or more residues are in contact between the helices, where contacting residues are defined as those whose shortest interatomic distance is within 0.6 Å of the sum of the van der Waals radii of the two atoms.15 In total, 364 transmembrane helix pairs were obtained from this procedure. Hydrogen atoms were attached with REDUCE,31 and secondary structures were assigned with DSSP.32 Helices connected by no more than two amino acids were treated as single helices.

Cα−H···O=C Contacts for Quantum Mechanical Calculations.
For quantum mechanical calculations, interhelical Cα−H···O=C contacts with OH distance ≤ 4 Å and CαHO angle ≥ 120°, or OH distance ≤ 3.5 Å and CαHO angle ≥ 90°, were collected from the interacting helix pairs obtained above, resulting in 351 contacts. The Cα−H···O contacts involving "side chain" oxygen atoms were not considered in this study. The coordinates of five amino acids around each Cα−H···O=C contact were extracted, and the molecular system was simplified, as in Figure 1, for efficient quantum mechanics calculations: each Cα−H···O=C interaction was modeled as an interaction of glycine dipeptide and N-methylacetamide. The two amide bonds attached to the hydrogen-donating α-carbon were retained to properly represent the electron-withdrawing environment of the α-carbon. The two carbon atoms adjacent

to the hydrogen-accepting carbonyl group were also retained to effectively represent the protein backbone of the interacting helix. The side chain group on each α-carbon was reduced to a hydrogen atom for simplicity when the hydrogen-donating group is not Gly. The existence of a side chain influences the hydrogen-bond energy, but it has been found in a previous study that the differences in Cα−H···O hydrogen-bond energy among neutral amino acids are not large: -2.5 kcal/mol for glycine, -2.3 kcal/mol for serine, -2.1 kcal/mol for alanine, -2.0 kcal/mol for valine, and -1.9 kcal/mol for cysteine, for example.22 Therefore, only the effects of the internal backbone structure of each helix and the relative arrangement of the two helices are explored, and the dependence on side chain type was not investigated in this study. In the initial set of 351 contacts, those geometries in which both α-hydrogen atoms of a Gly residue make Cα−H···O=C contacts were counted twice, but they were included only once in the final set for quantum calculations. In addition, those configurations in which an artificial Cα−H···O=C contact arises when a side chain is replaced with a hydrogen atom were not considered for quantum calculations. The total number of Cα−H···O=C contacts in the final set is 263. These contacts were subjected to quantum mechanical calculations.

Quantum Calculations. Initial coordinates of hydrogen atoms were first optimized with the AMBER96 force field, fixing all heavy atoms and those atoms reduced to hydrogen atoms at their crystal coordinates.
DFT and ab initio calculations in vacuum were then carried out with the GAUSSIAN03 program.33 Geometries were first optimized at the HF/6-31G level.34 The density functional MPWB1K, which is known to produce accurate results for noncovalent interactions,35 was then employed for geometry optimization and energy evaluation at the MPWB1K/6-31+G** level.34 Calculations with a more popular density functional, B3LYP at the B3LYP/6-31+G** level,34 and with a perturbation theory method, MP2/6-31+G**,34 were also performed for a selected set of contacts for comparison. MP2/6-31+G** calculations were also performed with the basis set superposition error (BSSE) corrected by the counterpoise method.36 This level of calculation has been reported to produce reasonable results when compared to higher level calculations on related systems.21 The interaction energy is calculated as the difference between the optimized energy of the donor-acceptor complex and those of the two isolated molecules. In geometry optimization, all the dihedral angles formed by heavy atoms and capped hydrogen atoms were restrained to those in the crystal structures, except for those hydrogen atoms whose locations are naturally constrained during optimization by the planarity of the peptide bonds or by the tetrahedral geometry about the sp3 carbon. Four dihedral angles (φ and ψ angles of the donor and the acceptor molecules) were fixed to constrain the internal geometry. The relative orientation of the two helices was also constrained by fixing the CCαHO, CαHOC, and HOCC dihedral angles; the CαHO and HOC angles; and the OH distance, as indicated with arrows in Figure 1. All other variables, including bond angles and bond lengths, were subject to optimization.

Structure Databases of Soluble Proteins. Interhelical Cα−H···O=C hydrogen bonds in "soluble" proteins were also extracted from nonredundant structure databases to explore how their contribution to stability differs from that in transmembrane proteins.
The database of interacting helices of transmembrane proteins is already described above. A database of interacting helices of

soluble proteins was obtained from the Top500 nonredundant structure database,37 after eliminating membrane proteins and removing chain redundancies of greater than 30% sequence identity. In total, 2685 interacting helix pairs were obtained from 479 unique soluble protein chains. Shorter lengths and less parallel orientations of helices in soluble proteins15,16 disfavor the formation of Cα−H···O=C contacts. To see whether such differences are the only major determinant for differences in Cα−H···O=C contacts, another database of interacting helices of soluble proteins homologous to transmembrane helices was also constructed. Structural similarities of the 364 interacting helix pairs of transmembrane proteins were compared with those of the 2685 helix pairs of soluble proteins with DaliLite.38,39 The helix pairs with Z-score between 2 and 4 are referred to as 'hm_low', and those with Z-score greater than 4 are referred to as 'hm_high'. The Dali Z-score is obtained from the estimated mean and standard deviation of the structure similarity score, which depend on the size of the structures.40 Typically, a Dali Z-score of less than 2 indicates an insignificant match. The numbers of helix pairs are 116 and 106 for the hm_low and the hm_high set, respectively. Hydrogen atoms were attached, and secondary structure was assigned, in the same way as for the transmembrane proteins.

Results and Discussion

Interaction Energies of 263 Interhelical Cα−H···O=C Contacts. Interaction energies of 25 Cα−H···O=C contacts randomly selected from the total set were first calculated with four different methods: MPWB1K/6-31+G**, B3LYP/6-31+G**, and MP2/6-31+G** with and without counterpoise (CP) correction for BSSE. The full list of energies is provided in Table 2 of the Supporting Information.
The well-known, large overestimation of interaction strength is observed for MP2/6-31+G** calculations without CP correction: the average interaction energy from MP2/6-31+G** calculations without CP correction is -3.60 kcal/mol, compared to -2.35 kcal/mol for MP2/6-31+G** calculations with CP correction. MP2/6-31+G** with CP is the highest level of computation performed here, but its computational cost is also the highest. The average deviation in energy from the results of MP2/6-31+G** with CP is +0.37 kcal/mol for MPWB1K/6-31+G** and +0.76 kcal/mol for B3LYP/6-31+G**. The better performance of MPWB1K compared to B3LYP for noncovalent interactions is consistent with previous reports.35 Because MPWB1K/6-31+G** produces reasonable interaction energies for this test set, this method is employed for the remaining Cα−H···O=C contacts. The average interaction energy of the 263 Cα−H···O=C contacts calculated at the MPWB1K/6-31+G** level is -1.6 kcal/mol. (The estimated stabilization energy per Cα−H···O=C bond is 1.4 kcal/mol after correcting for dual interactions in some of the contacts. See the last subsection of Results and Discussion for more details.) The interaction was computed to be weaker than the previously reported ab initio results for Cα−H···O hydrogen bonds, -2.5 kcal/mol for the Gly-water complex by Scheiner22 and < -3 kcal/mol by Vargas,23 even when the slight underestimation of the strength due to the use of density functional theory is taken into account. However, this value can be considered reasonable because the Cα−H···O=C contacts treated here are not fully optimized in geometry but constrained by the protein environment. The distribution of the MPWB1K/6-31+G** interaction energy is shown in Figure 2. The energy is in the range of -5 to +2 kcal/mol; 89% of the cases are stabilizing (interaction energy


Figure 2. Distribution of Cα−H···O=C interaction energy calculated at the MPWB1K/6-31+G** level.

< 0). About 49% of the whole set are from -2.5 to -1.0 kcal/mol. Interestingly, 13% of the cases have interaction energy < -3 kcal/mol, lower than previously reported values. This implies that there exist additional attractive interactions. The interaction energy is plotted with respect to OH distance and to CαHO angle in Figure 3, parts a and b, respectively. The tendency of the energy to decrease as the ideal geometry is approached, i.e., as the OH distance decreases or as the CαHO angle increases, is very weak: the correlation coefficients are 0.17 and 0.38 for Figure 3, parts a and b, respectively. This implies that the interaction energy is not a simple function of the local Cα−H···O geometry but a complicated function of the relative placement of nearby atoms in space. Compared to isolated, fully optimized Cα−H···O bonds, there are more attractive, less attractive, and even repulsive cases, but on average Cα−H···O=C contacts are attractive and make a stabilizing contribution to transmembrane proteins.

Patterns of Interhelical Cα−H···O=C Interactions in α-Helical Transmembrane Proteins. As presented above, the interaction energy of our model system for interhelical Cα−H···O=C contacts in transmembrane proteins is not determined simply by the local Cα−H···O=C interaction geometries but is altered by interactions with surrounding atoms. Such additional interactions naturally occur in proteins when Cα−H and O=C groups come close together. Because they are not separable and are part of Cα−H···O=C interactions in actual protein structures, it is meaningful to understand the Cα−H···O=C interaction together with the surrounding atoms. Such consideration is particularly important when interpreting experimental results. It is therefore important to elucidate what types of factors influence the Cα−H···O=C interaction energy in transmembrane proteins and how significant they are.
Four frequent interaction patterns were discovered to exert crucial influences on the interaction energy after close inspection of the relationship between energy and geometry. Specifically, OH and NH distances, which may give rise to attractive interactions, and OO, NO, and NN distances, which may give rise to repulsive interactions, were closely examined. The four types of interactions, referred to as types 1-4, are illustrated in Figure 4a-d. As shown in Figure 4, parts a and b, interactions of types 1 and 2 involve an additional attractive interaction between the oxygen atom of the hydrogen donor and one of the α-hydrogen atoms of the hydrogen acceptor. In type 1, the amino acids are oriented antiparallel, and in type 2, they are parallel. Type 3 interaction


Figure 3. Cα−H···O=C interaction energy versus (a) OH distance and (b) CαHO angle. The overall correlation coefficients are 0.17 and 0.38, respectively. Type 1 is represented with open circles, type 2 with squares, type 3 with triangles, type 4 with stars, other repulsions with crosses, and regular with filled circles.

involves an additional attractive interaction between the hydrogen-accepting oxygen atom and the amide hydrogen of the hydrogen-donating amino acid, as illustrated in Figure 4c. In the type 4 interaction, shown in Figure 4d, two carbonyl oxygen atoms come close, and the repulsive interaction interferes with the hydrogen bond. Repulsion between O and N atoms, previously reported to be an important factor determining the strength of the hydrogen bond involving the side chain Oγ of a serine residue,20 was found to occur in only three cases. Note that only hydrogen bonds involving backbone carbonyl oxygen are considered here. Repulsion between N and N atoms occurs only once.

Although interatomic distance is an important factor that determines the strength of each type of interaction, it is hard to determine which type of interaction dominates in a given geometry of a Cα−H···O=C contact with a simple distance criterion. This is because when two atoms of interest come closer, other pairs of atoms tend to come closer, too, and different types of interactions compete with each other as a result. For example, when O and H atoms are close in distance, O and O atoms can also be close. In this case, the relative distances of OH and OO must be considered to determine which interaction prevails. To deal effectively with this situation, a set of angles formed by three atoms that participate in competing interactions was defined for each interaction type, as illustrated

in Figure 4: the OOH angle, defined to be α and β for types 1 and 2, respectively; the ONH angle, γ, for type 3; and α, β, and the OHO angle, δ, for type 4. In the case of the attractive interactions, types 1-3, the attractive interactions become stronger as the angles become smaller. The case of type 4 is explained below in detail. Classification of interactions into type 1, 2, or 3 is made in terms of the angles defined above with a threshold of 60°: type 1 if α < 60°, type 2 if β < 60°, and type 3 if γ < 60°. This threshold value was chosen to minimize overlaps among different types, as can be seen from Figure 1 of the Supporting Information. The numbers of contacts satisfying the criteria for more than one type are two for types 1 and 2, two for types 1 and 3, and five for types 2 and 3. For convenience, type 3 was assigned the highest priority, type 1 the next, and type 2 the last. As both the α and β angles become larger, OO repulsion, the type 4 interaction, may arise. Another angle, the δ angle, shown in Figure 4d, represents the relative closeness of O and O compared to O and H. The criterion for the type 4 interaction was therefore chosen to be α > 80°, β > 80°, and δ < 120°. Contacts with other repulsive interactions, NO and NN repulsions, were also identified, but no major type was assigned to them because they are much less frequent (three cases and one case, respectively). They were simply classified as "other repulsions". NO repulsion is defined to exist in those contacts with γ > 120° that do not belong to type 1 or 2. NN repulsion is defined to exist when the NN distance is shorter than 3.0 Å. The remaining contacts not assigned to any type are termed "regular". The 263 contacts were classified into types, and the frequency and average energy for each type are provided in Table 1. The contacts of types 1 and 2 constitute about 50% of the total set.
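
The classification rules described above can be sketched as a short decision function. This is only an illustrative reading of the text, not the authors' implementation: the function name `classify_contact` and its argument names are hypothetical, and the order in which type 4 and "other repulsions" are tested after the attractive types is an assumption, since the text fixes the priority only among types 3, 1, and 2.

```python
def classify_contact(alpha, beta, gamma, delta, nn_distance):
    """Assign an interaction type from the angles (degrees) and NN distance (angstroms).

    alpha, beta: OOH angles used for types 1 and 2
    gamma:       ONH angle used for type 3
    delta:       OHO angle used for type 4
    nn_distance: N...N distance used for the NN repulsion test
    """
    # Attractive types, with the priority stated in the text: type 3, then 1, then 2.
    if gamma < 60.0:
        return "type 3"
    if alpha < 60.0:
        return "type 1"
    if beta < 60.0:
        return "type 2"
    # OO repulsion (type 4). Testing it before the other repulsions is an assumption.
    if alpha > 80.0 and beta > 80.0 and delta < 120.0:
        return "type 4"
    # NO repulsion (gamma > 120 degrees, not type 1 or 2) or NN repulsion (< 3.0 angstroms).
    if gamma > 120.0 or nn_distance < 3.0:
        return "other repulsions"
    return "regular"


# A contact meeting both the type 1 and type 3 criteria is assigned type 3.
print(classify_contact(alpha=50.0, beta=90.0, gamma=50.0, delta=150.0, nn_distance=4.0))
```

Under this reading, a contact that satisfies several attractive criteria is labeled by the highest-priority type only, which matches the statement that overlapping cases were resolved by priority rather than counted twice.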
The average energy of the type 3 interaction is comparable to those of types 1 and 2, although this interaction involves a stronger N−H···O interaction rather than a C−H···O interaction. This is because the N−H···O geometry deviates from ideal in protein structures, indicating that C−H···O and N−H···O interactions may compete with each other in protein structures in which deformations from ideal hydrogen-bond geometry are forced by the global protein structure. Note that type 1 and type 2 interactions in actual proteins would be stronger than calculated here, because the hydrogen-donating carbon atom in the additional C−H···O interaction has only one amide group attached instead of two in the simplified model considered here. It can be concluded that the tandem interaction, as in types 1 and 2, is one of the major factors that lower the Cα−H···O=C contact energy in transmembrane helices. In addition, the OO repulsion is a major factor that destabilizes Cα−H···O=C contacts.

The interaction energy is plotted against OH distance and CαHO angle in Figure 3a,b, representing different types of interactions with different symbols. The attractive contacts, types 1-3, are found in the lower part of the plot, and the repulsive, type 4, contacts in the upper part. The correlation of the energy with the OH distance or the CαHO angle is stronger when each interaction type is considered separately, compared to the correlation in the total set. In Figure 3a, the correlation coefficient for each type is 0.64, 0.57, 0.62, 0.45, and 0.36 for type 1, type 2, type 3, type 4, and regular, respectively, which is much higher than that of the total set, 0.17. In Figure 3b, the correlation coefficient is 0.49, 0.68, 0.89, -0.11, and 0.42 for type 1, type 2, type 3, type 4, and regular, respectively, and that of the total set is 0.38. It is notable that Cα−H···O=C interactions can be better described in terms of interactions with


Figure 4. Schematic pictures of four types of interactions: (a) type 1, (b) type 2, (c) type 3, and (d) type 4. Thick dotted lines indicate the Cα−H···O=C hydrogen bond of interest and wavy lines additional attractive or repulsive interactions. The angles α, β, γ, and δ used to classify the types are shown together.

TABLE 1: Frequency of Each Type in the Set of 263 Cα−H···O=C Contacts and the Average Interaction Energy Calculated at the MPWB1K/6-31+G** Level

set                 no. of elements    proportion (%)    average energy (kcal/mol)
type 1a                   78               29.8                  -2.367
type 2a                   48               18.1                  -2.093
type 3                     9                3.4                  -2.575
type 4                    26                9.8                   0.564
other repulsions           4                1.5                   0.135
regular                   98               37.4                  -1.298

a Estimated average energies of type 1 and type 2 interactions after correction for dual interactions are -2.016 and -1.898 kcal/mol, respectively.

surroundings and local deformations together, rather than with either one of the two factors alone.

It has to be mentioned that the interaction energy, as shown in Figure 3a, is long-ranged. The Cα−H···O=C interactions even at a large OH distance of 3.5-4 Å make non-negligible contributions. These data are consistent with the previous report that CH···O hydrogen bonds are much less sensitive to variations in hydrogen bond distances and angles than conventional hydrogen bonds.21 Moreover, the number of type 1 and type 2 interactions increases as the OH distance becomes larger, as shown in Figure 5. The average interaction energy for types 1 and 2 at an OH distance of 3.5-4 Å is -1.91 and -1.58 kcal/mol, respectively, lower than the average energy of -1.30 kcal/mol for regular interactions. If the interaction energy of type 1 or 2 is decomposed into the contribution of each Cα−H···O=C interaction, the interaction will be weaker (-1.67 and -1.39 kcal/mol, respectively, if corrected for dual interactions, as explained in the next subsection). However, the two tandem Cα−H···O=C interactions together are stronger than regular interactions, even at longer distance.

Figure 5. Fractions of each type with respect to OH distance found in the set of 263 Cα−H···O=C contacts. More type 1 and type 2 interactions are found as the OH distance becomes longer.

A major parameter that modulates the strength of the long-range type 1 interaction was determined to be the HOOH dihedral angle defined in Figure 6. It can be seen from Figure 7 that the interaction energy tends to decrease as the absolute value of this dihedral angle approaches 180°. The correlation coefficient of the interaction energy with the absolute value of the dihedral angle is -0.46. The type 2 interaction was found to have no such correlation.

One of the characteristic properties of the CH···O hydrogen bond is contraction of the C−H bond, whereas bond elongation occurs in conventional hydrogen bonds. This phenomenon has been consistently observed in previous ab initio studies.21,41,42,43 However, both contraction and elongation of the C−H bond were found in the present study, probably because of the additional environmental effects considered here. Table 2 shows the distribution of the distance change, but no simple explanation has yet been found.


Figure 6. The HOOH dihedral angle defined for type 1 contacts.

Figure 7. Energy versus the HOOH dihedral angle for type 1 interactions.

TABLE 2: Changes in the CαH Distance in the Set of 263 Cα−H···O=C Contacts

distance range (mÅ)    no. of elements    proportion (%)
< -2.0                        5                 1.89
-2.0 to -1.0                  6                 2.26
-1.0 to -0.5                 15                 5.56
-0.5 to 0.0                  78                29.81
0.0 to 0.5                   92                35.09
0.5 to 1.0                   24                 9.06
1.0 to 2.0                   22                 8.30
> 2.0                        21                 7.92

Frequency and Strength of Interhelical Cα−H···O=C Interactions in Transmembrane Proteins and Soluble Proteins. The above knowledge obtained from structure-energy analysis on a large number of geometries extracted from crystal structures was also combined with statistical analysis on crystal structure databases of transmembrane proteins and soluble proteins. Frequency maps of interhelical Cα−H···O=C contacts on the plane of OH distance and CαHO angle are presented in Figure 8, parts a and b, for transmembrane and soluble proteins, respectively. The frequency is particularly high around (OH, CαHO) = (2.7 Å, 120°) and (3.7 Å, 135°) in Figure 8a, referred to as "region 1" and "region 2", respectively. The interaction energy around region 1 is the lowest, as can be seen from Figure 3. Region 2 corresponds to the region of longer OH distance of type 1 and type 2 interactions, in which the interaction energy is non-negligible because of tandem C−H···O interactions. Therefore, interhelical interactions seem to effectively exploit the low-energy regions of Cα−H···O=C contacts, and thus to stabilize protein structures.

Figure 8. Frequency maps of Cα−H···O=C contacts as a function of OH distance and CαHO angle for (a) transmembrane proteins and (b) soluble proteins.

From Figures 3 and 8, a criterion for an attractive Cα−H···O=C interaction is defined as follows: the OH distance ≤ 4 Å and the CαHO angle ≥ 120°, or the OH distance ≤ 3.5 Å and the CαHO angle ≥ 90°. In fact, this criterion was used to collect Cα−H···O=C contacts from crystal structures for quantum calculations. Although the contacts are attractive, it is arguable whether interactions with large OH distances should be called hydrogen bonds, but the term hydrogen bond is still used here for convenience. For comparison, the criterion for a hydrogen bond used in ref 5 is the OH distance ≤ 3.5 Å and the CαHO angle ≥ 120°, or the OH distance ≤ 3.0 Å and the CαHO angle ≥ 90°.

Using the above criterion for the Cα−H···O=C hydrogen bond based on quantum calculations, the frequency of Cα−H···O=C contacts was obtained for both transmembrane and soluble proteins. The average number of Cα−H···O=C hydrogen bonds per interacting helix pair is 0.92 in transmembrane proteins and 0.32 in soluble proteins. The distribution of the number of hydrogen bonds is shown in Figure 9. It is clearly seen that transmembrane protein helices make significantly more Cα−H···O=C contacts than soluble protein helices. For example, more than four Cα−H···O=C hydrogen bonds are found in about 9% of transmembrane helix pairs, while only about 2% of soluble helix pairs have more than four. All of the Cα−H···O=C contacts in the nonredundant databases of transmembrane and soluble proteins were classified into types and are compared in Table 3. It is interesting that more type 1 and type 2 interactions occur in transmembrane proteins than in soluble proteins. The distribution for transmembrane proteins is a little different from that in Table 1 because Table 3 includes those contacts that were not subjected to quantum calculations.
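
As an illustration of the geometric criterion stated above, the following sketch computes the OH distance and CαHO angle from three atomic positions and applies both the criterion of this work and the stricter one quoted from ref 5. The function names and the coordinate convention (plain 3-tuples in Å) are hypothetical choices made for this example; the thresholds are those given in the text.

```python
import math


def contact_geometry(c_alpha, hydrogen, oxygen):
    """Return (OH distance in angstroms, CaHO angle in degrees) from 3D coordinates."""
    oh_distance = math.dist(hydrogen, oxygen)
    # The CaHO angle is measured at the hydrogen atom.
    to_carbon = [c - h for c, h in zip(c_alpha, hydrogen)]
    to_oxygen = [o - h for o, h in zip(oxygen, hydrogen)]
    dot = sum(a * b for a, b in zip(to_carbon, to_oxygen))
    cosine = dot / (math.hypot(*to_carbon) * math.hypot(*to_oxygen))
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding outside [-1, 1]
    return oh_distance, math.degrees(math.acos(cosine))


def is_attractive_contact(oh_distance, caho_angle):
    """Criterion of this work: (<= 4 A and >= 120 deg) or (<= 3.5 A and >= 90 deg)."""
    return (oh_distance <= 4.0 and caho_angle >= 120.0) or (
        oh_distance <= 3.5 and caho_angle >= 90.0
    )


def is_hbond_ref5(oh_distance, caho_angle):
    """Criterion quoted from ref 5: (<= 3.5 A and >= 120 deg) or (<= 3.0 A and >= 90 deg)."""
    return (oh_distance <= 3.5 and caho_angle >= 120.0) or (
        oh_distance <= 3.0 and caho_angle >= 90.0
    )


# A "region 2"-like geometry passes the criterion of this work but not that of ref 5.
print(is_attractive_contact(3.7, 135.0), is_hbond_ref5(3.7, 135.0))
```

The example at the end shows why the wider criterion matters here: contacts near (3.7 Å, 135°), where tandem type 1 and type 2 interactions are frequent, would be excluded by the ref 5 thresholds.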


TABLE 4: Distribution of Estimated Interaction Energy Per Helix Pair Due to Interhelical Cα−H···O=C Contacts

Figure 9. Frequency of helix pairs versus the number of Cα−H···O=C hydrogen bonds for transmembrane proteins (TMP) and soluble proteins (SP).

TABLE 3: Percentages of Each Type in the Nonredundant Databases of Transmembrane and Soluble Proteins

set                 transmembrane proteins (%)    soluble proteins (%)
type 1                        32.6                       17.1
type 2                        20.5                       11.0
type 3                         5.6                        2.9
type 4                         8.7                       16.7
other repulsions               4.2                        7.9
regular                       28.5                       44.4

The average frequency of interhelical CR-H‚‚‚OdC hydrogen bonds in soluble proteins is about one-third of that in transmembrane proteins on average (0.32 versus 0.92). This difference may be due to the fact that lengths of the helices are short and the helices pack each other with less favorable packing angles for such contacts to form. To examine whether such effects are the major factors behind less frequent CR-H‚‚‚Od C contacts in soluble proteins, a database of soluble protein helix pairs structurally homologous to transmembrane helices was constructed. The less homologous set, hm_low set, has 116 members, and the more homologous set, hm_high set, has 106 members. The average number of the CR-H‚‚‚OdC hydrogen bonds in the hm_low set is 1.13 and 0.62 and in hm_high set is 0.80 and 0.54 for transmembrane proteins and soluble proteins, respectively. Although the average number of hydrogen bonds for soluble proteins increased to 0.62 and 0.54 from the overall average of 0.32, the occurrence is still much lower than in transmembrane proteins. This fact indicates that transmembrane helices take particularly optimal geometries for formation of CR-H‚‚‚OdC hydrogen bonds. Contribution of interhelical CR-H‚‚‚OdC interaction to protein stability is estimated using the calculated energies and the statistics collected from crystal structures as follows: for transmembrane proteins, energies of those contacts for which quantum calculations were performed were taken from the calculated values, and the estimated average energy of -1.39 kcal/mol (see below) was assigned to all other contacts in transmembrane proteins and soluble proteins. Particular caution was taken to avoid double counting of dual interactions, because there are cases in which two CR-H‚‚‚OdC hydrogen bonds arise in the model system for a single contact. 
Two-thirds of the calculated interaction energy was assigned to type 1 and type 2 interactions if the second C-H‚‚‚O contact satisfies the hydrogen-bond criterion (48 out of 126 cases). Two-thirds, not one-half, of the interaction energy was assigned because the second C-H‚‚‚O interaction involves carbon with only one

energy (kcal/mol)

transmembrane proteins (%)

soluble proteins (%)

> +1 -1 to +1 -3 to -1 -5 to -3 -7 to -5 -9 to -7 < -9

1.1 75.0 8.2 5.2 3.6 3.0 3.8

0 81.8 15.0 1.3 1.4 0.3 0.2

electronegative group in our model system, but regular R carbon has two electronegative groups. Note that the additivity rule presented in ref 21 is applied here. Even when the additional C-H‚‚‚O contact of type 1 or type 2 interaction does not satisfy the hydrogen-bond criterion, there still exists weak attractive interaction. They are effectively treated as an environmental effect here. Two CR-H‚‚‚OdC bonds can also arise when both R-hydrogen atoms of a single Gly residue form CR-H‚‚‚OdC bonds. In this case, one-half of the calculated energy is assigned to each bond. With these corrections, the average energy per CR-H‚‚‚OdC bond was estimated to be -1.39 kcal/mol. The above is a very crude method for estimating the “average” contribution, but the fact that the CR-H‚‚‚OdC interaction energy has a wide distribution of -5 to + 2 kcal/mol has to be considered when estimating stabilization energies of specific CR-H‚‚‚OdC contacts of interest. Soluble proteins are in high dielectric environments of water, so vacuum calculation would be a worse approximation than for transmembrane proteins. Moreover, interaction geometries would be less ideal in soluble proteins because interhelical distances are larger15 and contact angles are less favorable.14 Therefore, the stabilization energy is expected to be overestimated for soluble proteins. The estimated interaction energy due to CR-H‚‚‚OdC interaction per helix pair is summarized in Table 4. It can be seen that the contribution of CR-H‚‚‚Od C interaction is significant in transmembrane proteins compared to soluble proteins. For example, the stabilization energy is greater than 5 kcal/mol per helix pair for about 16% of transmembrane helices but for only 3% of soluble protein helices. Two helix pairs from transmembrane proteins have interaction energy greater than 17 kcal/mol, but no helix pair from soluble proteins has energy greater than 13 kcal/mol. 
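The energy bookkeeping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Contact` record, its field names, and the helper functions are hypothetical, and the handling of values that fall exactly on a bin edge is an assumption, since the text does not specify it. The weights (two-thirds, one-half), the default of -1.39 kcal/mol, and the bin ranges are taken from the text and Table 4.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

DEFAULT_ENERGY = -1.39  # kcal/mol, average assigned to contacts without a calculated value

BIN_LABELS = ["> +1", "-1 to +1", "-3 to -1", "-5 to -3", "-7 to -5", "-9 to -7", "< -9"]
LOWER_EDGES = [1.0, -1.0, -3.0, -5.0, -7.0, -9.0]


@dataclass
class Contact:
    helix_pair: str                        # identifier of the interacting helix pair
    calc_energy: Optional[float] = None    # calculated model-system energy, if available
    second_contact_is_hbond: bool = False  # type 1/2 with a second C-H...O meeting the criterion
    gly_dual: bool = False                 # both alpha-H atoms of one Gly form bonds


def assigned_energy(c: Contact) -> float:
    """Energy credited to one contact, avoiding double counting of dual interactions."""
    if c.calc_energy is None:
        return DEFAULT_ENERGY
    if c.second_contact_is_hbond:
        return c.calc_energy * 2.0 / 3.0
    if c.gly_dual:
        return c.calc_energy / 2.0
    return c.calc_energy


def energy_per_helix_pair(contacts):
    """Sum the assigned energies of all contacts belonging to each helix pair."""
    totals = defaultdict(float)
    for c in contacts:
        totals[c.helix_pair] += assigned_energy(c)
    return dict(totals)


def bin_label(energy: float) -> str:
    """Map an energy to a Table 4 range; lower edges are treated as exclusive (assumption)."""
    for edge, label in zip(LOWER_EDGES, BIN_LABELS):
        if energy > edge:
            return label
    return BIN_LABELS[-1]


# Hypothetical contacts for two helix pairs.
contacts = [
    Contact("A-B", calc_energy=-3.0, second_contact_is_hbond=True),
    Contact("A-B"),  # no calculated value, so the default average is used
    Contact("C-D", calc_energy=1.5),
]
for pair, total in energy_per_helix_pair(contacts).items():
    print(pair, round(total, 2), bin_label(total))
```

Keeping the per-contact weighting in one function separates the double-counting corrections from the per-pair summation, so either step can be changed without touching the other.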
Conclusions

Interaction energies of Cα−H···O=C contacts that are frequently observed in transmembrane proteins have been calculated with quantum mechanical methods. Previous experimental studies addressed only two Cα−H···O contacts, and the results were conflicting. Most previous computational studies were on optimized geometries of small isolated model systems, but the experimental measurements were on Cα−H···O interactions in transmembrane proteins with the complicating effects of the surroundings. A set of 263 Cα−H···O=C contacts from crystal structures was subjected to quantum calculations in this study. A large variation in interaction energy was found, ranging from -5 to +2 kcal/mol. The average stabilization was estimated to be 1.4 kcal/mol. Such interactions are weaker than conventional hydrogen bonds but are longer range and thus less sensitive to distortions due to surrounding protein structures. Therefore, those weak hydrogen bonds can contribute in various structural environments and are more important in transmembrane proteins than in soluble proteins. In general, such C−H···O hydrogen bonds may add stability in diverse structural contexts of macromolecules in which the overall structures are determined by stronger interactions.

Acknowledgment. We thank Dong-seon Lee for help with the use of PDBTM. We acknowledge grants from the Seoul R&BD program and a grant from MarineBio21, Ministry of Maritime Affairs and Fisheries, Korea.

Supporting Information Available: Two tables and three figures, as described in the text. This material is available free of charge via the Internet at http://pubs.acs.org.

References and Notes

(1) Desiraju, G. R.; Steiner, T. The Weak Hydrogen Bond in Structural Chemistry and Biology; Oxford University Press: Oxford, U.K., 1999.
(2) Derewenda, Z. S.; Lee, L.; Derewenda, U. J. Mol. Biol. 1995, 252, 248.
(3) Weiss, M. S.; Brandl, M.; Sühnel, J.; Pal, D.; Hilgenfeld, R. Trends Biochem. Sci. 2001, 26, 521.
(4) Pierce, A. C.; Sandretto, K. L.; Bemis, G. W. Proteins 2002, 49, 567.
(5) Sarkhel, S.; Desiraju, G. R. Proteins 2004, 54, 247.
(6) Klaholz, B. P.; Moras, D. Structure 2002, 10, 1197.
(7) Jiang, L.; Lai, L. J. Biol. Chem. 2002, 277, 37732.
(8) Ghosh, A.; Bansal, M. J. Mol. Biol. 1999, 294, 1149.
(9) Wahl, M. C.; Sundaralingam, M. Trends Biochem. Sci. 1997, 22, 97.
(10) Mandel-Gutfreund, Y.; Margalit, H.; Jernigan, R. L.; Zhurkin, V. B. J. Mol. Biol. 1998, 277, 1129.
(11) Denessiouk, K. A.; Johnson, M. S. J. Mol. Biol. 2003, 333, 1025.
(12) Bella, J.; Berman, M. J. Mol. Biol. 1996, 264, 734.
(13) Manikandan, K.; Ramakumar, S. Proteins 2004, 56, 768.
(14) Senes, A.; Ubarretxena, B. I.; Engelman, D. M. Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 9056.
(15) Gimpelev, M.; Forrest, L. R.; Murray, D.; Honig, B. Biophys. J. 2004, 87, 4075.
(16) Jiang, S.; Vakser, I. A. Protein Sci. 2007, 12, 1426.
(17) Kleiger, G.; Grothe, R.; Mallick, P.; Eisenberg, D. Biochemistry 2002, 41, 5990.
(18) Arbely, E.; Arkin, I. T. J. Am. Chem. Soc. 2004, 126, 5362.
(19) Yohannan, S.; Faham, S.; Yang, D.; Grosfeld, D.; Chamberlain, A. K.; Bowie, J. U. J. Am. Chem. Soc. 2004, 126, 2284.
(20) Mottamal, M.; Lazaridis, T. Biochemistry 2005, 44, 1607.
(21) Gu, Y.; Kar, T.; Scheiner, S. J. Am. Chem. Soc. 1999, 121, 9411.
(22) Scheiner, S.; Kar, T.; Gu, Y. J. Biol. Chem. 2001, 276, 9832.

(23) Vargas, R.; Garza, J.; Dixon, D. A.; Hay, B. P. J. Am. Chem. Soc. 2000, 122, 4750.
(24) Scheiner, S.; Grabowski, S. J.; Kar, T. J. Phys. Chem. A 2001, 105, 10607.
(25) Hartmann, M.; Wetmore, S. D.; Radom, L. J. Phys. Chem. A 2001, 105, 4470.
(26) George, L.; Sanchez-Garcia, E.; Sander, W. J. Phys. Chem. A 2003, 107, 6850.
(27) Donati, A.; Ristori, S.; Bonechi, C.; Panza, L.; Martini, G.; Rossi, C. J. Am. Chem. Soc. 2002, 124, 8778.
(28) Wang, B.; Hinton, J. F.; Pulay, P. J. Phys. Chem. A 2003, 107, 4683.
(29) Tusnády, G. E.; Dosztányi, Z.; Simon, I. Nucleic Acids Res. 2005, 33, D275.
(30) Chothia, C.; Levitt, M.; Richardson, D. J. Mol. Biol. 1981, 145, 215.
(31) Word, J. M.; Lovell, S. C.; Richardson, J. S.; Richardson, D. C. J. Mol. Biol. 1999, 285, 1735.
(32) Kabsch, W.; Sander, C. Biopolymers 1983, 22, 2577.
(33) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Montgomery, J. A., Jr.; Vreven, T.; Kudin, K. N.; Burant, J. C.; Millam, J. M.; Iyengar, S. S.; Tomasi, J.; Barone, V.; Mennucci, B.; Cossi, M.; Scalmani, G.; Rega, N.; Petersson, G. A.; Nakatsuji, H.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Klene, M.; Li, X.; Knox, J. E.; Hratchian, H. P.; Cross, J. B.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Ayala, P. Y.; Morokuma, K.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Zakrzewski, G.; Dapprich, S.; Daniels, A. D.; Strain, M. C.; Farkas, O.; Malick, D. K.; Rabuck, A. D.; Raghavachari, K.; Foresman, J. B.; Ortiz, J. V.; Cui, Q.; Baboul, A. G.; Clifford, S.; Cioslowski, J.; Stefanov, B. B.; Liu, G.; Liashenko, A.; Piskorz, P.; Komaromi, I.; Martin, R. L.; Fox, D. J.; Keith, T.; Al-Laham, M. A.; Peng, C. Y.; Nanayakkara, A.; Challacombe, M.; Gill, P. M. W.; Johnson, B.; Chen, W.; Wong, M. W.; Gonzalez, C.; Pople, J. A. Gaussian 03, revision B.01; Gaussian, Inc.: Pittsburgh, PA, 2003.
(34) Foresman, J. B.; Frisch, A. Exploring chemistry with electronic structure methods: A guide to using Gaussian; Gaussian, Inc.: Pittsburgh, PA, 1993.
(35) Zhao, Y.; Truhlar, D. G. J. Phys. Chem. A 2004, 108, 6908.
(36) Boys, S. F.; Bernardi, F. Mol. Phys. 1970, 19, 553.
(37) http://kinemage.biochem.duke.edu/databases/top500.php.
(38) Holm, L.; Sander, C. Science 1996, 273, 595.
(39) Holm, L.; Park, J. Bioinformatics 1999, 16, 566.
(40) Holm, L.; Sander, C. Proteins 1998, 33, 88.
(41) Matsuura, H.; Yoshida, H.; Hieda, M.; Shin-ya, Y.; Harada, T.; Kei, S.; Ohno, K. J. Am. Chem. Soc. 2003, 125, 13910.
(42) Masella, M.; Flament, J. P. J. Chem. Phys. 1999, 110, 7245.
(43) Wu, D. Y.; Ren, Y.; Wang, X.; Tian, A. M.; Wong, N. B.; Li, W. K. THEOCHEM 1999, 459, 171.