Article pubs.acs.org/JPCB

Phosphorus Binding Sites in Proteins: Structural Preorganization and Coordination

Mathias Gruber, Per Greisen, Jr., Caroline M. Junker, and Claus Hélix-Nielsen*

The Biomimetic Membrane Group, Department of Physics, Technical University of Denmark, DK 2800 Kongens Lyngby, Denmark

Supporting Information

ABSTRACT: Phosphorus is a ubiquitous element of the cell, which is found throughout numerous key molecules related to cell structure, energy and information storage and transfer, and a diverse array of other cellular functions. In this work, we adopt an approach often used for characterizing metal binding and selectivity of metalloproteins in terms of interactions in a first shell (direct residue interactions with the metal) and a second shell (residue interactions with first shell residues) and use it to characterize binding of phosphorus compounds. Similar analyses of binding have previously been limited to individual structures that bind to phosphate groups; here, we investigate a total of 8307 structures obtained from the RCSB Protein Data Bank (PDB). An analysis of the binding site amino acid propensities reveals very characteristic first shell residue distributions, which are found to be influenced by the characteristics of the phosphorus compound and by the presence of cobound cations. The second shell, which supports the coordinating residues in the first shell, is found to consist mainly of protein backbone groups. Our results show that the second shell residue distribution is dictated mainly by the first shell of the binding site, especially by cobound cations, and that the main function of the second shell is to stabilize the first shell residues.



INTRODUCTION

Specific binding of phosphate groups in proteins has been widely studied because it is essential for a large number of functions and pathways within cells.1−3 The phosphate group is prominent in the phospholipid molecules constituting the lipid bilayer of cellular membranes and in the adenylate energy transporter molecules adenosine mono-, di-, and triphosphate (AMP, ADP, and ATP, respectively), and it constitutes ∼9% of the mass of the nucleic acids in deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) molecules.1 In fact, nearly 20% of all proteins found in the RCSB Protein Data Bank interact with phosphate, be it in its inorganic form or bound in an organic moiety such as a nucleotide.

In structural analyses of phosphate recognition, the focus has so far been on direct interactions between, for example, the phosphate anion and protein amino acids and their relation to associated specific binding motifs (e.g., Rossmann fold, glycine-rich sequences, and P-loops).3 However, the recognition and coordination of phosphorus compounds in proteins may involve additional interactions that reach beyond first shell interactions. The “shell” terminology originates from structural analyses of metalloproteins in which the metal ion is coordinated not only by direct interactions with first shell ligands (the term ligand referring to backbone moieties or specific amino acid side chains) but also indirectly, via protein−protein interactions, with second shell ligands.4,5 The importance of second shell interactions for protein−metal recognition and metalloprotein function is exemplified by their roles in protecting and shielding a binding site core,6 stabilizing the binding site complex,7,8 enhancing binding site affinity,4,9−11 and in other ways fine-tuning the inner environment of the binding site.12,13

We therefore asked whether second shell interactions could also play a role in the binding of phosphorus compounds in proteins, as variations in second shell residues or backbone structures may change the physicochemical properties of the binding site, in analogy with previous observations for metalloproteins.14 Here, we address this question by performing a structural survey of all structures that bind phosphate groups in the RCSB Protein Data Bank, looking at both first and second shell interactions. We define first shell interactions as direct interactions between the phosphate groups and the amino acid residues or the protein backbone. Second shell interactions refer to the interactions between the protein and first shell moieties (see Figure 1). We analyzed the entire data set en bloc as well as in subsets classified by protein function (i.e., enzyme versus nonenzyme) and type of phosphorus compound (e.g., inorganic phosphate, nucleotide, etc.). From these analyses, we identified the general tendencies seen in binding sites for phosphorus compounds, the influence of different types of phosphorus compounds on the binding characteristics of the first and second shell, the relevance of different residue interactions in the binding site, the importance of second shell interactions in recognition, and finally the influence of cobound cations on binding characteristics.

© 2014 American Chemical Society

Received: August 30, 2013 Revised: December 12, 2013 Published: January 9, 2014

dx.doi.org/10.1021/jp408689x | J. Phys. Chem. B 2014, 118, 1207−1215

The Journal of Physical Chemistry B

Article

MATERIALS AND METHODS

The Protein Data Bank was surveyed for structures containing phosphorus ligands with a resolution below 3.0 Å. Structures with more than 90% sequence identity were removed from the data set, and the remaining structures were divided by their type of phosphorus compound into six groups as follows: phosphate (2812 structures), pyrophosphate (162 structures), nucleoside monophosphates (NMP, 537 structures), nucleoside diphosphates (NDP, 1809 structures), nucleoside triphosphates (NTP, 1022 structures), and the coenzymes nicotinamide adenine dinucleotide and flavin adenine dinucleotide (FAD/FADH + NAD/NADH, 1965 structures). In the case of oligomeric proteins, the analysis was restricted to one of the homologous subunits, as these are assumed to have an identical mode of binding the phosphorus compounds.3 For PDB files containing an ensemble of structures as determined by NMR, only the first model present in the file was used. Separation of the data set into enzymes and nonenzymes was established by using the enzyme classification (EC) number information in the PDB files.

We defined a first solvation shell as all protein residues containing O, N, or S atoms within a 3.5 Å cutoff distance from the O atoms bound to the phosphorus atom. This definition was used in order to capture all H-bond donors (HO, HN, HS) and electrostatic interactions in one search.3,8 Besides amino acid residues, water molecules and metals were also included in the first shell because they often have important structural functions.3,8,15,16 The O, N, and S atoms of the first solvation shell were used as first shell centers for defining the second solvation shell. Amino acid residues with an O, N, or S atom within a 3.5 Å cutoff distance of a first shell center were counted as second shell residues. When counting second shell residues, backbone moieties in the second shell that were identified through interactions with backbone moieties in the first shell were included only if the residues in question were placed at least one amino acid apart in the protein. This was done to avoid counting noninteracting residues simply for being in proximity of each other. Water was not counted as a second shell residue, as it is expected that water may commonly be found in the second shell near protein surfaces, where it does not play any structural or catalytic roles.8

To obtain error estimates on the calculated residue distributions, we divided the data sets into blocks of 100 structures and calculated block averages and standard deviations. The choice of block size was based on the dependency of the error estimates on the block size for the different residues (see Supporting Information Figure S1). Error estimates were only calculated for data sets with more than 1000 structures and were not calculated for the smaller subsets that did not contain enough structures for proper statistics.

Magnesium was found to be the most predominant metal cation in the first shell; therefore, the analysis was extended to counting the number of O, N, and S atoms within 3.5 Å of Mg2+ ions.

Figure 1. Phosphate ion bound in a protein. Phosphate is bound to Arg in the first solvation shell, indicated by the black dashed line, and to Glu in the second shell, indicated by the pink dashed line. These lines represent only two interactions out of many other side chain and backbone interactions between the protein and phosphate (PDB ID: 3FWP).

RESULTS

First Shell Residue Interactions. Out of the 85 848 structures present in the PDB on November 5, 2012, 19 604 matched the structural features searched for in this study. Out of these structures, only structures binding to the six groups of phosphorus compounds listed in Table 1 were selected. The complete data set comprised 8307 structures in total, of which 8240 were determined by X-ray crystallography, 27 by NMR, and 40 by electron microscopy.

Table 1. How the Investigated Data Set Is Separated into Different Classes in Terms of Which Phosphorus Compounds the Individual Structures Bind

phosphorus compound   PO43−   NMP   NDP    NTP    pyrophosphate   NAD/FAD
PDB files             2812    537   1809   1022   162             1965
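As a quick consistency check on Table 1, the six class sizes can be summed and expressed as shares of the data set. The sketch below is illustrative only: the dictionary and function names are assumptions introduced here, and only the counts are taken from the table.

```python
# Class sizes as listed in Table 1 (number of PDB files per phosphorus compound).
TABLE_1 = {
    "PO4": 2812,
    "NMP": 537,
    "NDP": 1809,
    "NTP": 1022,
    "pyrophosphate": 162,
    "NAD/FAD": 1965,
}


def class_fractions(counts):
    """Return each class's share of the data set as a percentage."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


total = sum(TABLE_1.values())
fractions = class_fractions(TABLE_1)

print(f"total structures: {total}")
for name, pct in fractions.items():
    print(f"{name:>14s}: {pct:5.1f}%")
```

The total reproduces the 8307 structures quoted for the complete data set, with free phosphate forming the largest class at roughly one-third of the structures.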

We start out by analyzing the first shell amino acid occurrence pattern for the entire data set, which is shown in Figure 2. The bar graph shows the percentage of the various amino acids that participate in first shell recognition of phosphate groups in the data set. The high occurrence of Gly residues is in agreement with Gly-rich loops being important binding motifs for phosphate groups, such as the consensus sequence of the so-called P-loop, which is commonly found in ATP- and GTP-binding proteins.17,18 The polar residues Ser and Thr are also strongly represented, both through backbone and side chain interactions. The frequency of Tyr is found to be about 70% lower than that of Ser and Thr, despite the polar properties of its hydroxyl group. Steric factors likely lead to a preference for the smaller Ser and Thr over the bulkier Tyr residue.19 In cases where the binding site is close to the surface, the hydrophobicity of the phenyl group in Tyr would also make it unfavorable and thereby explain its lower frequency. The positively charged amino acids Lys and Arg both have a high occurrence, which is expected because of the anionic nature of the phosphate moieties. Finally, the side chains of Asp and Glu also seem to be of importance for binding. Their role can, on one hand, be attributed to ion pairing with cations cobound with the phosphate moiety, but it has also been argued that they act to form a negative environment in the binding site


Figure 2. Frequency distribution of first shell amino acid residues (one-letter codes) in binding sites for phosphorus compounds for the full data set of 8307 protein structures. Residues with side chains capable of interacting with the phosphate groups are shown twice, with the first histogram bar referring to the side-chain interaction and the second histogram bar referring to the backbone interaction. The residues are grouped and color-coded as follows: gray (apolar), green (polar), blue (basic), red (acidic) and yellow (water). Error bars are based on block averaging of the data set with a block size of 100 structures.
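The block-averaging error estimate referred to in the caption can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes one value per structure (for example, a hypothetical per-structure frequency of a given residue), that trailing structures not filling a complete block are dropped, and that the reported error is the standard deviation of the block means. The function name `block_average` is introduced here for illustration only.

```python
import statistics


def block_average(per_structure_values, block_size=100):
    """Return (mean, standard deviation) of the means of consecutive blocks.

    Assumption: structures that do not fill a complete block are ignored.
    """
    n_blocks = len(per_structure_values) // block_size
    if n_blocks < 2:
        raise ValueError("need at least two complete blocks for an error estimate")
    block_means = [
        statistics.fmean(per_structure_values[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]
    # Sample standard deviation across block means serves as the error bar.
    return statistics.fmean(block_means), statistics.stdev(block_means)


# Toy input: two blocks of 100 hypothetical per-structure values each.
values = [1.0] * 100 + [3.0] * 100
mean, err = block_average(values)
print(f"{mean:.2f} +/- {err:.2f}")
```

In line with the text, such an estimate is only meaningful when the data set is large enough to provide many blocks, which is why the smaller subsets carry no error bars.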

Figure 3. Frequency distribution of second shell amino acid residues (one-letter codes) in binding sites for phosphorus compounds for the full data set of 8307 protein structures. The residues are grouped and color-coded as follows: gray (apolar), green (polar), blue (basic) and red (acidic). Error bars are based on block averaging of the data set with a block size of 100 structures.
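The first and second shell assignments underlying Figures 2 and 3 follow the 3.5 Å cutoff rule given in Materials and Methods. A minimal sketch of that rule is shown below under several stated assumptions: atoms are supplied as hypothetical `(residue_id, element, (x, y, z))` tuples rather than parsed from PDB files, only the atoms actually within the cutoff are used as first shell centers, residues already in the first shell are not counted again in the second shell, and the sequence-separation rule for backbone pairs and the treatment of metals are omitted. Water (residue identifiers starting with `HOH`) is allowed in the first shell but not in the second, as in the text.

```python
import math

CUTOFF = 3.5             # Å, the cutoff distance quoted in the text
POLAR = {"O", "N", "S"}  # elements considered for shell membership


def find_shells(phosphate_oxygens, atoms, cutoff=CUTOFF):
    """Return (first_shell, second_shell) as sets of residue identifiers.

    atoms: iterable of (residue_id, element, (x, y, z)); hypothetical format.
    """
    polar_atoms = [a for a in atoms if a[1] in POLAR]
    # First shell centers: polar atoms within the cutoff of any phosphate oxygen.
    centers = [
        a for a in polar_atoms
        if any(math.dist(a[2], o) <= cutoff for o in phosphate_oxygens)
    ]
    first_shell = {res_id for res_id, _, _ in centers}

    second_shell = set()
    for res_id, _, xyz in polar_atoms:
        # Skip first shell residues (simplification) and water molecules.
        if res_id in first_shell or res_id.startswith("HOH"):
            continue
        if any(math.dist(xyz, c[2]) <= cutoff for c in centers):
            second_shell.add(res_id)
    return first_shell, second_shell


# Toy geometry with made-up coordinates (Å).
phosphate_oxygens = [(0.0, 0.0, 0.0)]
atoms = [
    ("ARG45", "N", (2.8, 0.0, 0.0)),   # contacts the phosphate oxygen
    ("GLU120", "O", (5.6, 0.0, 0.0)),  # contacts the Arg nitrogen only
    ("LEU77", "C", (3.0, 0.0, 0.0)),   # carbon atoms are ignored
    ("HOH301", "O", (5.0, 2.0, 0.0)),  # water near Arg, excluded from second shell
]
first, second = find_shells(phosphate_oxygens, atoms)
print(sorted(first), sorted(second))
```

In this toy case Arg ends up in the first shell and Glu in the second, mirroring the arrangement depicted in Figure 1.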

suitable for discriminating between different protonation states of phosphate groups.18,20 The bulky nonpolar residues were generally found to have low interaction frequencies, which agrees with earlier observations.21 Finally, we note that water molecules are present as a predominant component in the first shell for all phosphorus compounds. The high frequency of water molecules reflects the fact that desolvation of phosphate requires a substantial amount of energy (e.g., the Gibbs energy of formation of the phosphate ion in solution is 243 kcal/mol).22 The results presented in Figure 2 are in excellent agreement with what was found earlier by Hirsch et al. (2007) despite the fact that their data set was limited to phosphate groups bound to C atoms (i.e., they excluded structures binding free phosphate ions).3 Thus, the observed pattern seems to be a general feature of proteins that bind phosphorus compounds.

Second Shell Residue Interactions. We now analyze the complete distribution of second shell residues in the entire data set, which is shown in Figure 3. The number of residues in the second shell is generally lower than in the first shell, and the average number ratio of first shell to second shell residues of the entire set was found to be 1.64:1. This means that not all first shell residues have a second shell partner in the protein matrix, which is consistent with what has been found for metal binding sites.8 Similarly to the first shell, the distribution of amino acids in the second shell shows a high occurrence of Gly residues. The positively charged amino acids Arg and Lys are much less frequent in the second shell compared to the first shell, and instead, a higher frequency of the negatively charged amino acids Asp and Glu is observed. This suggests that Asp and Glu in the second shell play stabilizing roles, mainly by charge−charge and charge−dipole interactions with first shell residues, which will lower the conformational entropy of the first shell residues and aid the formation of a preorganized binding site. To obtain more information about the distribution seen in Figure 3, we turn to analyzing the distribution of second shell residues around some of the most predominant first shell residues in the presence and absence of cations (see Figure 4). Out of the 8307 structures in the data set, 1661 structures contain phosphate groups complexed directly with metal cations in the first shell, and an additional 486 of the structures contain metal cations only in the second shell of the phosphorus compound. This implies that approximately 26% of the structures in the data set contain one or more cations bound in close proximity to the phosphorus compound, which is consistent with what has previously been observed for phosphate-binding structures.3

Second Shell Interactions with First Shell Asp/Glu Residues. The most common second shell partner for Asp and Glu is the backbone amide group that hydrogen bonds with the carboxylate oxygen of the first shell Asp/Glu residues (see Figure 4a−b). The carboxylate oxygens of Asp/Glu are capable of forming salt bridges with Arg/Lys residues in the second shell, which explains the high frequency of these positive residues in Figure 4a−b. The high frequency of second shell Asp/Glu residues interacting with first shell Asp/Glu residues might come as a surprise because one would expect a tendency toward mutual repulsion between the negatively charged side chains. Given a large and diverse data set of proteins from different organisms, the variation in effective pKa values may, however, result in many of the residues being protonated, thus allowing first and second shell Glu and Asp residues to interact through hydrogen bonds or their interactions to be mediated through water molecules.


Figure 4. Frequency distribution of second shell residues found bound to different first shell residues: (a) Asp, (b) Glu, (c) Arg, (d) Lys, (e) water, and (f) backbone. The data set is separated into structures with a metal cation in either the first or second shell of the binding site (full color bars) and structures where no metal complexation takes place (dashed bars). All backbone interactions have been pooled together in “BKB”. Bars are color-coded as follows: gray (apolar), green (polar), blue (basic), and red (acidic).
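The split between structures with and without metal cations follows directly from the counts reported in the text (1661 structures with a cation in the first shell and a further 486 with a cation only in the second shell, out of 8307). The short sketch below reproduces that arithmetic; the variable and function names are assumptions introduced here for illustration.

```python
TOTAL_STRUCTURES = 8307          # complete data set
CATION_FIRST_SHELL = 1661        # metal cation directly complexed in the first shell
CATION_SECOND_SHELL_ONLY = 486   # metal cation found only in the second shell


def percentage(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


with_cation = CATION_FIRST_SHELL + CATION_SECOND_SHELL_ONLY
pct_with = percentage(with_cation, TOTAL_STRUCTURES)
pct_without = 100.0 - pct_with

print(f"with cation:    {with_cation} structures ({pct_with:.1f}%)")
print(f"without cation: {TOTAL_STRUCTURES - with_cation} structures ({pct_without:.1f}%)")
```

Rounded to whole percentages, this gives the approximately 26% of structures with cobound cations quoted in the text, and hence about 74% without.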

Additionally, the high frequency of Asp/Glu residues in Figure 4a−b can partly be explained by the presence of metal cations, such that both first shell and second shell Asp/Glu residues are stabilized by interaction with metal cations (see Supporting Information Figure S3); in such cases, both residues interact with the cation and not with each other. The present method of mining a large database of structures is inherently susceptible to noise from special cases such as this, where first shell and second shell partners are counted only because of their proximity to each other and not because an interaction is taking place. The high frequency of negatively charged residues is consistent with previous observations that phosphate-binding proteins use negatively charged binding cavities to discriminate between substrates.20 For example, in the case of phosphate-binding protein (PBP), the possibility for a hydrogen bond between the dibasic phosphate ion (HPO42−) and an Asp residue results in PBP being able to bind the phosphate ion five orders of magnitude more tightly than the sulfate ion (SO42−), which is repelled by the negative cavity.23

Second Shell Interactions with First Shell Arg/Lys Residues. Having looked at the second shell around the negative residues Asp and Glu, we now turn to the second shell around the positive amino acids Arg and Lys, which are often found in the first solvation shell of binding sites for phosphorus compounds. In Figure 4c−d, it is seen that Arg and Lys predominantly partner with backbone groups in the second shell. As expected, the mutual repulsion between positive residues in the first and second shell means that virtually no Arg or Lys is observed in the second


Figure 5. First shell (a) and second shell (b) analysis of different common phosphate compounds showing residues involved in binding. “BKB” refers to all backbone interactions. The moieties are: phosphate (PO4, light blue), nucleoside monophosphates (NMP, green), nucleoside diphosphates (NDP, light green), nucleoside triphosphates (NTP, yellow), pyrophosphate (PP, orange), and coenzymes (FAD and NAD, red).

shell. The positive residues of the first shell have a high tendency to interact with negative Glu and Asp residues in the second shell through salt bridges, and thus, along with the backbone, Glu and Asp are seen to constitute the main second shell support for the Arg and Lys residues.

Second Shell Residue Interactions with First Shell Water Molecules. Given that water was found to be a very common first shell ligand (Figure 2), the second shell residue distribution around first shell water molecules was also investigated, and the result is shown in Figure 4e. The main second shell partner for first shell water molecules is the protein backbone, both in the presence and absence of cations. The high frequency of water molecules in the first solvation shell highlights the importance of water in mediating interactions between phosphate groups and the binding site.24,25 The second most frequent second shell partners for water in the first shell are Asp and Glu. The frequencies of these negatively charged residues are largely dictated by the presence or absence of cations (∼30% vs ∼7%), which indicates that the main reason for the presence of Asp and Glu residues in Figure 4e is stabilization of metal cations and not indirect interactions through water molecules with phosphate groups. The frequencies of neutral and positive second shell amino acids around first shell water were found not to be influenced by the presence of cations in the binding site.

Second Shell Interactions with First Shell Backbone Amides. One of the most frequent first shell residues involved in binding of phosphate groups is the amide group (CONH) of the protein backbone. When examining the second shell residues for protein backbone groups in the first shell (see Figure 4f), we observe that these consist almost exclusively of protein backbone residues (i.e., there is a high occurrence of backbone−backbone interactions in the binding sites). For these interactions, the first shell backbone moieties interact predominantly (∼80% of the interactions) via their N−H group with the carbonyl oxygen of the second shell backbone partners (the first shell backbone moieties then interact via their nitrogen lone pair with the phosphorus compound). A similarly high frequency of backbone−backbone interactions is observed in the literature for metal binding sites.8 The cause for the high frequency of backbone moieties in both the first shell and the second shell is likely related to their universality (i.e., a backbone group can partner in principle with any type of residue acting either as a hydrogen-bond acceptor or donor), and the results show how it is the folding of the protein backbone that provides the main support for the coordination of first shell residues in the binding site.

First and Second Shell Distributions for Different Phosphorus Compounds. So far, our analysis has focused on the first and second shell interactions of the full data set. The data set contained seven different common phosphorus


compounds, and we now turn our attention to the differences between these individual compounds and show their first and second shell binding characteristics in Figure 5. Looking at the first shell residue distribution of structures binding the phosphate ion (PO43−) in Figure 5, it is seen that there is a preference for side-chain interactions over backbone interactions. The interactions with the positive residues Lys and Arg are especially important, but a relatively high frequency is also observed for Asp and Glu residues. The Asp and Glu residues are observed despite the relatively low occurrence of cations, indicating that their presence is actually related to phosphate interactions and not only to cation stabilization. Looking at the first shell for the monophosphates (NMP), their binding characteristics are highly similar to those of the phosphate ion, showing that the attached nucleoside moiety does not change the binding characteristics of the phosphate group. Addition of more phosphate groups, however, such as in NDP and NTP, greatly changes the binding characteristics: when going from NMP to NDP or NTP, an increased preference for Lys and a diminished preference for His and Arg are observed. It has previously been postulated that Lys is important for the stabilization of β- and γ-phosphates.3 It should furthermore be noted that Lys is highly conserved in the consensus P-loop sequence.26 For NDP and NTP, backbone interactions are more frequent than in NMP. Steric factors along with the increased frequency of cobound cations are presumed to be responsible for the shift in distribution of positive residues interacting with the nucleotides.

Looking at the second shell binding characteristics in Figure 5b, it is seen that the residue distributions of the nucleotides are more or less identical to that of the phosphate ion PO43− and that no large differences exist between the nucleotides NMP, NDP, and NTP. This is interesting considering the differences in the first shell for these phosphorus compounds observed in Figure 5a. The second shell distribution of the investigated nucleotide-binding structures does not appear to be important for any selectivity toward the number of phosphate groups in the bound nucleotide.

The first shell binding characteristics for pyrophosphate are found to be very similar to those of NMP and PO4, except that the frequency of backbone residues is smaller and there is an increased preference for the positive residue Lys. It is seen in Figure 5 that pyrophosphate binding by proteins is often assisted by metal cations, which is marked by an exceptionally high frequency of Mg2+ ions. Consequently, the pyrophosphate-binding structures are also found to contain larger amounts of Asp and Glu residues in the second shell compared to the nucleotide- and PO43−-binding structures. Another part of the explanation for the presence of Asp and Glu is that pyrophosphate has a pKa of 6.70,22 meaning that a substantial fraction of it will be protonated at pH 7 and thus capable of interacting with Asp and Glu residues.

The coenzymes NAD and FAD are both dinucleotides, meaning they contain two nucleosides linked by a pyrophosphate moiety. They are often involved in redox reactions (i.e., NAD+/NADH and FAD+/FADH). First shell binding characteristics for NAD and FAD are, however, found to differ from those of pyrophosphate; practically none of the charged amino acids, except for Arg, are found interacting with the pyrophosphate moiety of NAD/FAD, and the binding is rarely assisted by any cations in the first and second shell of the pyrophosphate moiety. These characteristics (i.e., high Arg frequency and absence of cations) correspond well with investigations of the protein−coenzyme interactions found in the literature.27,28 No noteworthy differences in the frequency of water in the first shell were observed for the different phosphorus compounds.

Looking at the overall tendencies of Figure 5, it seems that the distribution of second shell residues for different phosphorus compounds is more influenced by the tendency of the structures to cobind cations than by the structure and type of the phosphorus compound. Cations are usually found in the first solvation shell of the phosphorus compound binding sites and, as such, have a stronger influence than phosphate groups on the second shell residues because of their closer proximity. For the first shell residues, it is clear that each phosphorus compound has its own “fingerprint”, which may be more or less unique compared to the other classes, but such characteristic differences are much less apparent for the second shell residues. It is, however, evident from Figure 5b that there is a clear preference for certain residues in the second shell of binding sites. This indicates a general tendency for how the binding sites are preorganized in the proteins and how first shell residues are backed up by second shell residues. Solely on the basis of the number of first and second shell residues counted in the analysis, it is found that ∼60% of the first shell residues are backed up by second shell residues in the protein. Further investigations of individual first shell residues show how secondary residues in the protein back up these residues (see Table 2).

Table 2. Propensities for First Shell Residues To Be Backed Up by Second Shell Interactions(a)

first shell residue                  Arg    Lys    His    Asp    Glu    BKB    HOH
propensity for second shell backup   0.49   0.72   0.22   0.40   0.42   0.24   1.05

(a) The propensities are calculated as the total amount of second shell residues in the data set interacting with these first shell residues divided by the individual amount of first shell residues.
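The propensity defined in the footnote is a simple ratio, and the same ratio applied to the overall first-to-second shell count of 1.64:1 reported in the text gives the roughly 60% backup quoted for the full data set. The sketch below illustrates this; the function name and the example counts of 72 and 100 are hypothetical, whereas the tabulated propensities are copied from Table 2.

```python
def backup_propensity(n_second_shell_partners, n_first_shell_residues):
    """Second shell partners per first shell residue, as defined in the footnote."""
    if n_first_shell_residues <= 0:
        raise ValueError("number of first shell residues must be positive")
    return n_second_shell_partners / n_first_shell_residues


# Overall value implied by the reported 1.64:1 ratio of first to second shell residues.
overall = backup_propensity(1.0, 1.64)

# Hypothetical raw counts: 72 second shell partners found for 100 first shell residues.
example = backup_propensity(72, 100)

# Propensities as listed in Table 2, ranked from highest to lowest.
TABLE_2 = {"Arg": 0.49, "Lys": 0.72, "His": 0.22, "Asp": 0.40,
           "Glu": 0.42, "BKB": 0.24, "HOH": 1.05}
ranked = sorted(TABLE_2, key=TABLE_2.get, reverse=True)

print(f"overall backup: {overall:.2f}")
print("ranking:", ", ".join(ranked))
```

Because the numerator counts partners rather than backed-up residues, a value above 1 (as for HOH) is possible under this definition when a first shell moiety has, on average, more than one second shell partner.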



DISCUSSION

From the abundant structural information present in the PDB, we have analyzed the statistics of how proteins in nature bind to phosphorus compounds. A previous study by Hirsch et al. addressed interactions within the first solvation shell of phosphate group binding sites.3 In this study, we extended their approach by investigating how second shell structural features influence the binding characteristics. Despite our larger and less restrictive data set (the data set used by Hirsch et al. contained 3003 structures), the first shell binding characteristics were found to be nearly identical to those reported by Hirsch et al. In our analysis, we included all structures in the PDB that bind to specific phosphorus compounds, and the only inclusion criterion when building our database was the presence of a ligand with a phosphate group in the PDB file. A consequence of this liberal approach is that our data set may contain some degree of noise, for example, from the presence of free phosphates that may have cocrystallized in locations that do not represent binding sites simply because of the crystallization conditions used. Despite the fact that our data set included structures where phosphate from the crystallization


shell residues (see Figure 4e). This stands in contrast to metalloenzymes, where water is often found to be ionized or polarized by charged residues for subsequent hydrolytic reaction,12,13 which in statistical surveys is seen as a predominant frequency of negative amino acids in the second shell of various metal cations.8 In the case of enzymes that bind phosphorus compounds, the presence of charged residues that polarize water molecules for subsequent hydrolytic reaction has similarly been reported in certain enzymes (e.g., for pyrophosphatase hNUDT515 and phosphorylcholine phosphatase).16 Separating the data set into enzymes and nonenzymes, no significant difference in the frequency of negative or positive amino acids was seen for enzymes when compared to nonenzymes (see Supporting Information, Figure S7). In the enzyme data set, only 28% of the structures contained cobound cations, whereas in the nonenzyme data set, 43% of the structures had cobound cations. The high frequency of Asp and Glu residues in the enzyme data set must therefore be seen in the light of the fact that these are present despite a lower frequency of cations, for which one may otherwise have expected a decreased frequency of Asp and Glu residues. This example of separating the data set into enzymes and nonenzymes shows that one must be careful when extracting information from database surveys; some structural effects and features may simply be diluted or hidden by a large data set. To test whether certain structures were overrepresented in our data set, we performed our analyses on data sets where homologous structures were removed at 30, 50, 70, and 90% similarity cutoffs (see Supporting Information, Figure S8), confirming these data sets did not significantly change any of the observed trends. 
Given that the second shell residue distribution profiles for the binding sites of different phosphorus compounds look similar (see Figure 5) and that they are mainly affected by the presence or absence of cations, it does not seem like the second shell contributes to the selectivity of the binding site. There is, however, a characteristic distribution of residues in the second shell that is more or less conserved for different compounds. The selectivity for different phosphorus compounds can, thus, be attributed mainly to first shell interactions within the protein. The second shell layer, in general, serves the function of stabilizing or protecting the inner-core structure, and the second shell residues are of such nature that favorable interactions can occur with the respective first shell residues.

buffer has simply cocrystallized with the structure, the distribution of first and second shell ligands for the phosphate part of our data set is found to be similar to that of the other phosphorus compounds. All our results must, therefore, be reviewed in the light of the fact that they represent large data sets that have been defined only by the presence of certain phosphorus compounds. More accurate residue distribution tendencies are likely to be revealed in data sets constructed from applying more specific structural or functional properties. Another essential factor related to the data sets investigated here is the variation in the effective pKa-values of the individual amino acid residues and phosphorus compounds. The protonation state of these individual chemical groups will depend on the crystallization conditions used, and the experimental conditions thus affect the residue distributions. Differences in the distributions of first shell residues in the binding sites for different phosphorus compounds were observed. These differences represent the overall tendency of some phosphorus compounds to prefer certain residues to others. As such, the first shell residue distributions represent a more or less unique “fingerprint” that identifies the type of compound bound. It is noted that it has previously been shown that more structural classifications of the proteins (e.g., by helixtype and nonhelix-type binding sites) reveal even more pronounced differences in residue propensities of the binding site.21 Approximately 26% of the structures in this study contained positively charged cations in either the first or second solvation shell. The presence or absence of these cations was found to greatly influence the distribution of charged residues found in the binding sites. 
At the same time, the occurrence of cations was found to be strongly dependent on the phosphorus compound in question; for example, for the nucleotides, the frequency of all cations follows the trend NTP > NDP > NMP, reflecting the increase in total negative charge of the nucleotides, and virtually no cations (only ∼2% of the structures) were found cobound with the coenzymes NAD and FAD. The propensity to cobind cations thus seems to be highly dependent on the type of compound being bound, and the presence or absence of these cations in turn influences the first shell residue distribution.

Negatively charged amino acids in the second shell were found to play an important role in stabilizing the first solvation shell of the binding sites. The first solvation shell, on the other hand, contained more positively charged amino acids, consistent with the phosphate group substrates being anionic. Looking at both the first and second solvation shell, the overall observed high frequency of negative residues is consistent with previous observations for phosphate and sulfate binding proteins, where it is proposed to increase the discrimination between different negative substrates.29 This selectivity occurs because any substrate that does not fit perfectly into the binding cavity will be rejected by the negative environment.20,30 It should be kept in mind that the observed high frequency of negatively charged amino acids is, in part, also caused by cobound cations, which are stabilized through charge−charge interactions with the negative residues.

Throughout all of the second shell distribution profiles presented in Figures 3−5, backbone residues were found to be highly predominant. The explanation for this is likely the universal ability of these residues to interact with a large spectrum of first shell residues through either their carbonyl oxygens or amide protons.8 It was found that first shell water has no clear preference for negative, positive, or neutral second



CONCLUSIONS

In this study, we present a statistical analysis of binding sites for phosphorus compounds in 8307 protein structures selected by their ability to bind specific classes of phosphorus compounds. Of the structures investigated, a remarkably high 74% were found to bind the phosphorus compounds without the assistance of metal ions. Despite the relatively small number of structures in which the phosphorus compounds cobind with metal ions, these structures were found to strongly influence the overall binding characteristics of the entire data set, most profoundly by influencing the frequency and distribution of charged residues in both the first and second shell of the binding site. A very characteristic and conserved residue distribution is observed for the second shell, which hints at its importance for stabilizing the binding site. The distributions of first shell residues revealed that some more or less unique binding characteristics may apply to different phosphorus compounds. These characteristics are, however, influenced not only by the properties of the compound but also greatly by the tendency of the given class of protein to cobind with metal ions. The statistical data presented in this study add to the in-depth understanding of how phosphorus compounds are bound by proteins, which is of considerable interest if innovative ways of using biotechnology for phosphorus recovery are to be exploited.

ASSOCIATED CONTENT

Supporting Information

Bar graphs with first shell and second shell residue and interaction distance distributions, illustration of phosphate binding site with cobound cation, bar graphs showing first shell residue distributions in the presence and absence of cations and for enzymes and nonenzymes. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*C. Hélix-Nielsen. Tel: +45 60681081, E-mail: claus.helix. [email protected].

Author Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Funding

This work was supported by the Danish Agency for Science via a grant to the innovation consortium “Natural Ingredients and New Energy”.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

P.J.G. was supported by Carlsbergfondet and Familien Hede Nielsen Fond and Fru Vera Hansens Fond. M.F.G. and C.H.N. wish to acknowledge the support for this work through the Innovation Consortium Natural Ingredients and Green Energy (NIGE), with sustainable purification technologies financially supported by the Danish Agency for Science, Technology and Innovation.

REFERENCES

(1) Elser, J. J. Phosphorus: a Limiting Nutrient for Humanity? Curr. Opin. Biotechnol. 2012, 23, 833−838.
(2) Blank, L. M. The Cell and P: From Cellular Function to Biotechnological Application. Curr. Opin. Biotechnol. 2012, 23, 846−851.
(3) Hirsch, A. K. H.; Fischer, F. R.; Diederich, F. Phosphate Recognition in Structural Biology. Angew. Chem., Int. Ed. Engl. 2007, 46, 338−52.
(4) Vipond, I. B.; Moon, B. J.; Halford, S. E. An Isoleucine to Leucine Mutation That Switches the Cofactor Requirement of the EcoRV Restriction Endonuclease from Magnesium to Manganese. Biochemistry 1996, 35, 1712−21.
(5) Levy, R.; Sobolev, V.; Edelman, M. First and Second Shell Metal Binding Residues in Human Proteins Are Disproportionately Associated with Disease-Related SNPs. Hum. Mutat. 2011, 32, 1309−18.
(6) Maynard, A. T.; Covell, D. G. Reactivity of Zinc Finger Cores: Analysis of Protein Packing and Electrostatic Screening. J. Am. Chem. Soc. 2001, 123, 1047−58.
(7) Dudev, T.; Lim, C. Factors Governing the Protonation State of Cysteines in Proteins: An Ab initio/CDM Study. J. Am. Chem. Soc. 2002, 124, 6759−66.
(8) Dudev, T.; Lin, Y.; Dudev, M.; Lim, C. First-Second Shell Interactions in Metal Binding Sites in Proteins: a PDB Survey and DFT/CDM Calculations. J. Am. Chem. Soc. 2003, 125, 3168−80.
(9) He, Q.; Mason, A. B.; Woodworth, R. C.; Tam, B. M.; MacGillivray, R. T. A.; Grady, J. K.; Chasteen, N. D. Mutations at Nonliganding Residues Tyr-85 and Glu-83 in the N-Lobe of Human Serum Transferrin. J. Biol. Chem. 1998, 273, 17018−17024.
(10) Mertz, P.; Yu, L.; Sikkink, R.; Rusnak, F. Kinetic and Spectroscopic Analyses of Mutants of a Conserved Histidine in the Metallophosphatases Calcineurin and Lambda Protein Phosphatase. J. Biol. Chem. 1997, 272, 21296−302.
(11) DiTusa, C. A.; McCall, K. A.; Christensen, T.; Mahapatro, M.; Fierke, C. A.; Toone, E. J. Thermodynamics of Metal Ion Binding. 2. Metal Ion Binding by Carbonic Anhydrase Variants. Biochemistry 2001, 40, 5345−5351.
(12) Christianson, D. W.; Cox, J. D. Catalysis by Metal-Activated Hydroxide in Zinc and Manganese Metalloenzymes. Annu. Rev. Biochem. 1999, 68, 33−57.
(13) Lipscomb, W. N.; Sträter, N. Recent Advances in Zinc Enzymology. Chem. Rev. 1996, 96, 2375−2434.
(14) Ebert, J.; Altman, R. Robust Recognition of Zinc Binding Sites in Proteins. Protein Sci. 2008, 54−65.
(15) Zha, M.; Guo, Q.; Zhang, Y.; Yu, B.; Ou, Y.; Zhong, C.; Ding, J. Molecular Mechanism of ADP-Ribose Hydrolysis by Human NUDT5 from Structural and Kinetic Studies. J. Mol. Biol. 2008, 379, 568−78.
(16) Infantes, L.; Otero, L. H.; Beassoni, P. R.; Boetsch, C.; Lisa, A. T.; Domenech, C. E.; Albert, A. The Structural Domains of Pseudomonas Aeruginosa Phosphorylcholine Phosphatase Cooperate in Substrate Hydrolysis: 3D Structure and Enzymatic Mechanism. J. Mol. Biol. 2012, 423, 503−14.
(17) Bianchi, A.; Giorgi, C.; Ruzza, P.; Toniolo, C.; Milner-White, E. J. A Synthetic Hexapeptide Designed to Resemble a Proteinaceous P-Loop Nest Is Shown to Bind Inorganic Phosphate. Proteins 2012, 80, 1418−24.
(18) Guimarães, C. R. W.; Rai, B. K.; Munchhof, M. J.; Liu, S.; Wang, J.; Bhattacharya, S. K.; Buckbinder, L. Understanding the Impact of the P-Loop Conformation on Kinase Selectivity. J. Chem. Inf. Model. 2011, 51, 1199−204.
(19) Darby, N. J.; Creighton, T. E. Protein Structure; IRL Press at Oxford University Press: Oxford, U.K., 1993.
(20) Morales, R.; Berna, A.; Carpentier, P.; Contreras-Martel, C.; Renault, F.; Nicodeme, M.; Chesne-Seck, M.-L.; Bernier, F.; Dupuy, J.; Schaeffer, C.; et al. Serendipitous Discovery and X-Ray Structure of a Human Phosphate Binding Apolipoprotein. Structure 2006, 14, 601−9.
(21) Copley, R. R.; Barton, G. J. A Structural Analysis of Phosphate and Sulphate Binding Sites in Proteins - Estimation of Propensities for Binding and Conservation of Phosphate Binding Sites. J. Mol. Biol. 1994, 242, 321−329.
(22) CRC Handbook of Chemistry and Physics, 88th ed.; Lide, D. R., Ed.; CRC Press: Boca Raton, FL, 2007.
(23) Luecke, H.; Quiocho, F. A. High Specificity of a Phosphate Transport Protein Determined by Hydrogen Bonds. Nature 1990, 347, 402−406.
(24) Levy, Y.; Onuchic, J. N. Water Mediation in Protein Folding and Molecular Recognition. Annu. Rev. Biophys. Biomol. Struct. 2006, 35, 389−415.
(25) Baron, R.; McCammon, J. A. Molecular Recognition and Ligand Association. Annu. Rev. Phys. Chem. 2012, 64, 151−175.
(26) Leipe, D. D.; Koonin, E. V.; Aravind, L. Evolution and Classification of P-Loop Kinases and Related Proteins. J. Mol. Biol. 2003, 333, 781−815.
(27) Carugo, O.; Argos, P. NADP-Dependent Enzymes. I: Conserved Stereochemistry of Cofactor Binding. Proteins 1997, 28, 10−28.
(28) Giangreco, I.; Packer, M. J. Pharmacophore Binding Motifs for Nicotinamide Adenine Dinucleotide Analogues Across Multiple Protein Families: A Detailed Contact-Based Analysis of the Interaction Between Proteins and NAD(P) Cofactors. J. Med. Chem. 2013, 56, 6175−89.
(29) Ledvina, P. S.; Yao, N.; Choudhary, A.; Quiocho, F. A. Negative Electrostatic Surface Potential of Protein Sites Specific for Anionic Ligands. Proc. Natl. Acad. Sci. U. S. A. 1996, 93, 6786−91.
(30) Vyas, N. K.; Vyas, M. N.; Quiocho, F. A. Crystal Structure of M. Tuberculosis ABC Phosphate Transport Receptor. Structure 2003, 11, 765−774.


dx.doi.org/10.1021/jp408689x | J. Phys. Chem. B 2014, 118, 1207−1215