Protein Promoting Vibrations in Enzyme Catalysis: A Conserved Evolutionary Motif
Joshua S. Mincer† and Steven D. Schwartz*,†,‡,§
Department of Physiology and Biophysics, and Department of Biochemistry, and Seaver Foundation Center for Bioinformatics and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461
Received December 13, 2002
A computational method to identify residues important in creating a protein promoting vibration (PPV) in enzymes was previously developed and applied to horse liver alcohol dehydrogenase (HLADH), resulting in the identification of eight important residues. From these residues, we define a sequence motif, the PPV generating sequence, and find it to be unique and general to a larger group of alcohol dehydrogenases from diverse sources, demonstrating that nature has selected for the PPV generating sequence.
Keywords: alcohol dehydrogenase • enzyme catalysis • protein dynamics • quantum tunneling • promoting vibration • sequence homology • multiple alignment • computational algorithms • molecular dynamics • protein design
I. Introduction
For more than a decade, an ever-increasing body of evidence has suggested the importance of protein dynamics in enzyme catalysis. Departing from the traditional view, first espoused by Pauling, that binding of substrate by the enzyme lowers the activation barrier, more recent views have stressed the importance of protein motion. These views are not mutually exclusive, and no doubt an enzyme may utilize one or more mechanisms. Our group has been working on such a dynamical view of catalysis, termed the protein promoting vibration (PPV) theory, which recognizes that thermal motions of specific amino acid residues may drive reactants toward each other, creating an oscillation (the promoting vibration) which modulates not only the height of the activation barrier, but also its width.1 This view has proven especially useful in understanding data from studies of enzymatic reactions involving transfer of a hydrogen (atom, proton, or hydride). In a variety of enzymes, these reactions have been shown to occur by quantum mechanical tunneling.2-4 Because the tunneling rate increases exponentially as the distance between hydrogen donor and acceptor is decreased, a PPV, which modulates this distance, would be a very efficient mechanism of catalysis.

Recently, our group has developed computational algorithms to study the PPV in real enzyme systems, using classical molecular dynamics simulations of enzymes. The first algorithm5 identifies if a PPV exists and further characterizes its frequency (that is, the frequency of the donor-acceptor oscillation) and its strength of coupling to the reaction coordinate. The second algorithm,6 using the knowledge of the PPV frequency, identifies the protein residues whose motions are important in creating the PPV.

We have applied these algorithms to study a hydride transfer reaction in horse liver alcohol dehydrogenase (HLADH).6,7 Specifically, a hydride is transferred from the substrate, benzyl alcohol, to the nicotinamide adenine dinucleotide (NAD+) cofactor to give an aldehyde and NADH.8-13 The first algorithm identified a PPV which modulates the distance between the C7 carbon of the substrate benzyl alcohol (the hydride donor) and the C4N carbon of the nicotinamide ring of NAD+ (the hydride acceptor). The second algorithm identified 8 residues (out of a total of 374 in the A subunit) as being important to creating the PPV.

Experimental studies by Klinman and co-workers and by Plapp and co-workers17-21 have probed various residues in HLADH to ascertain whether these residues are important for catalysis. Specifically, residues were mutated and the effects on catalytic rate and hydride tunneling parameters were measured. We compared our results for HLADH using our residue identification algorithm with this experimental data and found good agreement.6 Residues which are important in creating the PPV are therefore important to the enzyme's catalytic function. As such, these residues, and their relative positions in the amino acid sequence, define a sequence pattern which one would expect to find conserved throughout biology in enzymes of similar function. In what follows, we uncover an example of this sequence conservation, starting from our results for HLADH.

† Department of Physiology and Biophysics, Albert Einstein College of Medicine.
‡ Department of Biochemistry, Albert Einstein College of Medicine.
§ Seaver Foundation Center for Bioinformatics and Computational Biology, Albert Einstein College of Medicine.
10.1021/pr025590+ CCC: $25.00 © 2003 American Chemical Society
Journal of Proteome Research 2003, 2, 437-439
Published on Web 05/08/2003
perspectives

II. Theoretical Background
Identification of Residues Important in Creating the PPV. The method developed to identify residues important in creating the PPV is detailed in ref 6. The following is a summary. Using the crystal structure of the enzyme and bound substrates/cofactors, one carries out a classical molecular dynamics simulation. From the resulting trajectory, the time series for various vector functions are generated for each residue within 10 Å of either donor or acceptor carbon (10 Å is chosen as an arbitrary cutoff). The vector functions, illustrated in Figure 1 (taken from ref 6), are as follows: $\vec{v}_{DA}$, the relative velocity between hydride donor and acceptor; $\hat{u}_{DA}$, the unit vector connecting donor and acceptor (note that the PPV is the projection of $\vec{v}_{DA}$ onto $\hat{u}_{DA}$); $\vec{v}_{R}$, the residue center of mass velocity (the center of mass is chosen to represent the motion of the residue); $\hat{u}_{RD}$, the unit vector connecting the residue center of mass and the donor (note that if the residue lies on the acceptor side of the active site, then $\hat{u}_{RA}$, the unit vector connecting the residue center of mass and the acceptor, is chosen).

Figure 1. Vector functions (motions) of interest, defined in the text. R: residue center of mass; D: donor; A: acceptor.

Table 1. Residues Important in Creating a Protein Promoting Vibration in HLADH
Ser144 Gly181 Val203 Gly204

From these time series, one calculates the functions A(t) and B(t):

$$A(t) = \vec{v}_{DA} \cdot \hat{u}_{DA}$$

$$B(t) = (\vec{v}_{R} \cdot \hat{u}_{RD})(\hat{u}_{RD} \cdot \hat{u}_{DA})$$

A(t) represents the PPV, whereas B(t) addresses two criteria a residue must satisfy to possibly drive the PPV motion. The first vector projection in B(t) determines to what extent the residue could drive the donor (or acceptor) motion. The second projection determines whether this driving is in the right direction to create the PPV.

Note that the promoting vibration is found in the donor-acceptor relative motion. Specifically, component(s) of this motion couple to the reaction coordinate to modulate the height and width of the activation barrier. Protein residues, due to their thermal motion, drive this promoting vibration (and that is why we refer to the promoting vibration as a protein promoting vibration).

If a residue drives the PPV, then its motion should be correlated with the PPV motion, i.e., A(t) and B(t) should be correlated. To determine whether they are, we employ a time correlation function and its associated spectral decomposition.14 The time correlation function is defined as follows:

$$C_{AB}(\tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} A(t + \tau)\, B(t)\, dt$$
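The projections A(t) and B(t) and a finite-trajectory estimate of the time correlation function can be sketched as follows. This is only an illustrative sketch, not the implementation of ref 6: the function names, the array layout (one row per trajectory frame), the sign conventions for the donor-acceptor direction, and the replacement of the infinite-time limit by an average over the available frames are all assumptions made here.

```python
import numpy as np

def unit(vectors):
    """Normalize each row of an (n_frames, 3) array to unit length."""
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

def row_dot(a, b):
    """Frame-by-frame dot product of two (n_frames, 3) arrays."""
    return np.einsum("ij,ij->i", a, b)

def ppv_signals(r_D, r_A, r_R, v_D, v_A, v_R, acceptor_side=False):
    """Return A(t) and B(t) for one residue.

    r_* are positions and v_* velocities of donor (D), acceptor (A) and
    residue center of mass (R), each of shape (n_frames, 3).
    """
    u_DA = unit(r_A - r_D)   # assumed direction: donor -> acceptor
    v_DA = v_A - v_D         # assumed sign: acceptor relative to donor
    A = row_dot(v_DA, u_DA)  # projection of relative velocity onto u_DA
    # Use the donor unless the residue sits on the acceptor side.
    target = r_A if acceptor_side else r_D
    u_RX = unit(target - r_R)  # residue -> donor (or acceptor)
    B = row_dot(v_R, u_RX) * row_dot(u_RX, u_DA)
    return A, B

def time_correlation(A, B, max_lag):
    """Estimate C_AB(tau) = <A(t + tau) B(t)> for lags 0..max_lag-1 frames."""
    n = len(A)
    return np.array([np.mean(A[k:] * B[:n - k]) for k in range(max_lag)])
```

In this sketch a residue whose center of mass moves toward the donor in phase with the donor-acceptor oscillation gives a B(t) that tracks A(t), and hence a large correlation at zero lag.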
Assuming the system is ergodic, this is equivalent to the ensemble average:14

$$C_{AB}(\tau) = \langle A(t + \tau)\, B(t) \rangle$$

Journal of Proteome Research • Vol. 2, No. 4, 2003
Table 1 (continued). Val207 Glu267 Ile269 Val292

Previously, we found that the PPV, represented by the function A(t), is a function with one dominant frequency.7 Namely, the donor-acceptor oscillation is characterized by a dominant frequency; because it is not purely harmonic, in actuality it is a superposition of other frequencies as well. If a given residue drives the PPV, then B(t) should feature a similar dominant frequency. The Fourier transform of $C_{AB}(\tau)$ gives its spectral decomposition,14 G(ω):

$$G(\omega) = \int_{-\infty}^{\infty} C_{AB}(\tau)\, e^{-i\omega\tau}\, d\tau$$
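A discrete version of this spectral decomposition can be sketched with a fast Fourier transform. The sketch below is illustrative only: the function names, the restriction to non-negative lags (the integral above runs over both signs of τ), and the use of the magnitude of the transform as the peak amplitude are assumptions made here, not details taken from ref 6.

```python
import numpy as np

def correlation_spectrum(A, B, dt, max_lag):
    """Return (freqs, G): magnitude spectrum of the estimated C_AB(tau).

    A, B    : equally sampled time series of the same length
    dt      : sampling interval between frames
    max_lag : number of non-negative lags (in frames) to include
    """
    n = len(A)
    # Finite-trajectory estimate of C_AB(tau) = <A(t + tau) B(t)>.
    C = np.array([np.mean(A[k:] * B[:n - k]) for k in range(max_lag)])
    # Discrete Fourier transform over the available lags only.
    G = np.abs(np.fft.rfft(C)) * dt
    freqs = np.fft.rfftfreq(max_lag, d=dt)
    return freqs, G

def dominant_frequency(freqs, G):
    """Frequency of the largest peak, ignoring the zero-frequency bin."""
    k = 1 + np.argmax(G[1:])
    return freqs[k]
```

Under these assumptions, a residue whose B(t) shares the dominant frequency of A(t) produces a peak at that frequency, while frequency components present in only one of the two signals average out of the correlation function.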
The strength of correlation between A(t) and B(t) at each frequency ω is given by the amplitude of the peak. Residues which are important in creating the PPV demonstrate a strong peak in the spectral decomposition at the dominant frequency of the PPV. A note about the 10 Å cutoff is in order. We chose to examine residues that lie at least partially within 10 Å of donor or acceptor. Our intention was to look at residues in the vicinity of the active site. It is possible that residues lying farther away may contribute to creating the PPV. The algorithm may be easily modified to investigate this possibility. Conceivably, such residues may also be conserved among enzymes of similar function. Defining the PPV Generating Sequence. The residues important in creating the PPV in HLADH are found in Table 1, based on ref 6. From these and their relative positions, one may write down the following sequence, which will be called the PPV generating sequence: S-X(36)-G-X(21)-V-G-X(2)-V-X(59)-E-X-I-X(22)-V X(n) represents a spacing of n amino acids, each of which may be any amino acid.
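The step from the residue positions in Table 1 to the PPV generating sequence, and from there to a searchable pattern, can be sketched as follows. The helper names and the translation into a regular expression are assumptions for illustration; the text does not describe the software used for the database search.

```python
import re

# Residues from Table 1 as (one-letter code, position in HLADH).
PPV_RESIDUES = [("S", 144), ("G", 181), ("V", 203), ("G", 204),
                ("V", 207), ("E", 267), ("I", 269), ("V", 292)]

def motif_from_residues(residues):
    """Write residues and the spacings between them in X(n) notation."""
    tokens = [residues[0][0]]
    for (_, prev), (aa, pos) in zip(residues, residues[1:]):
        gap = pos - prev - 1          # number of intervening amino acids
        if gap == 1:
            tokens.append("X")
        elif gap > 1:
            tokens.append(f"X({gap})")
        tokens.append(aa)
    return "-".join(tokens)

def motif_to_regex(motif):
    """Compile an X(n)-style motif into a regular expression."""
    parts = []
    for token in motif.split("-"):
        spacer = re.fullmatch(r"X(?:\((\d+)\))?", token)
        if spacer:
            # X matches any one amino acid, X(n) any n amino acids.
            count = spacer.group(1)
            parts.append("." if count is None else ".{%s}" % count)
        else:
            parts.append(re.escape(token))
    return re.compile("".join(parts))
```

Applied to the Table 1 residues, `motif_from_residues(PPV_RESIDUES)` reproduces the sequence written above, and `motif_to_regex(...).search(sequence)` reports whether and where a protein sequence contains it.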
III. Results and Discussion
The PPV generating sequence, defined above, was used to search the PIR-NREF sequence database,15 which at the time of the search contained 1 042 859 protein sequences. Forty-four proteins were found to contain this pattern embedded within their amino acid sequences. They derived from the following organisms: Gallus gallus (chicken), Uromastyx hardwickii (Indian spiny-tailed lizard), Homo sapiens (human), Equus caballus (domestic horse), Oryctolagus cuniculus (European rabbit), Coturnix japonica (Japanese quail), Oryza sativa (rice), Zea mays (maize), Octopus vulgaris (common octopus), Rana perezi (Perez's frog), Struthio camelus (ostrich), Peromyscus maniculatus (deer mouse), Mus caroli (Ryukyu mouse), Mus musculus (house mouse), Rattus norvegicus (Norway rat), and Arabidopsis thaliana (mouse-ear cress). The interesting result is that, despite their diverse origins, these proteins all share the same function: each is an alcohol dehydrogenase.

To what extent nature has conserved the PPV generating sequence is better appreciated by considering how much of the sequence conservation in this set of proteins is accounted for by it alone. By aligning the 44 sequences, using the CLUSTALW algorithm,16 we find that 68 residues are unchanged between these proteins (see Figure 1S for CLUSTALW results). The average protein length in this set is 373 amino acids; thus, these
68 residues translate roughly to a total amino acid conservation of 18%. The 8 residues of the PPV generating sequence, which are perfectly conserved in all 44 proteins, account for 8 of these 68 positions (about 12%), a significant share of the total sequence homology. Across such a wide range of sources, nature has applied strong selection pressure to maintain the PPV generating sequence.

This is further demonstrated by slightly changing the PPV generating sequence. If, for example, one changes S-X(36)-G-X(21)-V to S-X(36)-G-X(20)-V or to S-X(36)-G-X(22)-V, then no matches are found in the PIR-NREF database. We have not even changed one of the eight residues; we have simply altered the spacing between two of them (G and V) by one amino acid. The spacers in the PPV generating sequence are apparently very important, even as the identities of the residues filling that space seem not to be. This makes sense, considering that the eight important residues must be positioned correctly to drive the substrate alcohol and the NAD+ toward one another. It further illustrates that nature has conserved the PPV generating sequence, including the length of the spacer regions.
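The spacer test described above can be illustrated on a toy sequence. This is a hedged sketch only: the helper names, the regular-expression matching, and the synthetic sequence (alanine filler with arbitrary flanks) are assumptions made for illustration and do not reproduce the PIR-NREF search.

```python
import re

MOTIF = "S-X(36)-G-X(21)-V-G-X(2)-V-X(59)-E-X-I-X(22)-V"

def expand(motif, wildcard):
    """Replace every X / X(n) token by n copies of `wildcard`."""
    out = []
    for token in motif.split("-"):
        spacer = re.fullmatch(r"X(?:\((\d+)\))?", token)
        if spacer:
            out.append(wildcard * int(spacer.group(1) or 1))
        else:
            out.append(token)   # a fixed residue, kept as is
    return "".join(out)

def contains(motif, sequence):
    """True if `sequence` contains the motif ('.' matches any residue)."""
    return re.search(expand(motif, "."), sequence) is not None

# Toy sequence: the motif with alanine in every spacer position,
# flanked by a few residues that do not occur in the motif.
toy = "M" * 5 + expand(MOTIF, "A") + "K" * 5

shorter = MOTIF.replace("X(21)", "X(20)")   # G-V spacer shortened by one
longer = MOTIF.replace("X(21)", "X(22)")    # G-V spacer lengthened by one

for name, motif in [("original", MOTIF), ("X(20)", shorter), ("X(22)", longer)]:
    print(name, contains(motif, toy))
```

In this toy case only the original spacing matches, which mirrors the behavior reported for the database search: shifting a single spacer by one position moves every downstream residue out of register.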
IV. Conclusions
Using computational methods recently developed, we discovered a sequence pattern important for catalysis in horse liver alcohol dehydrogenase. In the present work, we have found this pattern to be unique and general to a larger class of alcohol dehydrogenases. These enzymes, deriving from a variety of organisms, differ greatly in their amino acid sequences. Despite this, they all feature the sequence pattern found in HLADH. Apparently, nature has selected for this sequence because of its importance for catalysis by creating a protein promoting vibration. This result constitutes the first evolutionary evidence for the theory of protein promoting vibrations.

An important caveat to this result must be stated. Although we have found the PPV generating sequence in a family of alcohol dehydrogenases, we have not been able to verify using our previous algorithm that these proteins couple the vibration to the reaction coordinate in the same fashion as in HLADH. To do this, we would need a crystal structure of one or more of these enzymes, and unfortunately, the only crystal structures that exist are for highly homologous proteins. When these structures become available, these computations will be completed.

Our results do suggest that it is possible to identify PPV generating sequences for enzymes of various functions, using the same methods we have applied to HLADH. These sequences may in turn serve as templates for the design of functional de novo proteins.
Acknowledgment. We gratefully acknowledge the support of the Office of Naval Research and the National Science Foundation through Grant Nos. CHE-9972864 and CHE-0139752. J.S.M. acknowledges fellowship support from the Medical Scientist Training Program (MSTP).
Supporting Information Available: The CLUSTALW output of the 44 aligned sequences as well as their NREF identifiers are found in Figure 1S. This material is available free of charge via the Internet at http://pubs.acs.org.

References
(1) Antoniou, D.; Caratzoulas, S.; Kalyanaraman, C.; Mincer, J. S.; Schwartz, S. D. Eur. J. Biochem. 2002, 269, 3103-3112.
(2) Sutcliffe, M. J.; Scrutton, N. S. Eur. J. Biochem. 2002, 269, 3096-3102.
(3) Knapp, M. J.; Klinman, J. P. Eur. J. Biochem. 2002, 269, 3113-3121.
(4) A recent theoretical study of tunneling in methylamine dehydrogenase: Faulder, P. F.; Tresadern, G.; Chohan, K. K.; Scrutton, N. S.; Sutcliffe, M. J.; Hillier, I. H.; Burton, N. A. J. Am. Chem. Soc. 2001, 123, 8604-8605.
(5) Caratzoulas, S.; Schwartz, S. D. J. Chem. Phys. 2001, 114, 2910-2918.
(6) Mincer, J. S.; Schwartz, S. D. J. Phys. Chem. B 2003, 107, 366-371.
(7) Caratzoulas, S.; Mincer, J. S.; Schwartz, S. D. J. Am. Chem. Soc. 2002, 124, 3270-3276.
(8) Agarwal, P. K.; Webb, S. P.; Hammes-Schiffer, S. J. Am. Chem. Soc. 2000, 122, 4803-4812.
(9) Webb, S. P.; Agarwal, P. K.; Hammes-Schiffer, S. J. Phys. Chem. B 2000, 104, 8884-8894.
(10) Billeter, S. R.; Webb, S. P.; Agarwal, P. K.; Iordanov, T.; Hammes-Schiffer, S. J. Am. Chem. Soc. 2001, 123, 11262-11272.
(11) Alhambra, C.; Corchado, J. C.; Sanchez, M. L.; Gao, J.; Truhlar, D. G. J. Am. Chem. Soc. 2000, 122, 8197-8203.
(12) Alhambra, C.; Corchado, J.; Sanchez, M. L.; Garcia-Viloca, M.; Gao, J.; Truhlar, D. G. J. Phys. Chem. B 2001, 105, 11326-11340.
(13) Truhlar, D. G.; Gao, J.; Alhambra, C.; Garcia-Viloca, M.; Corchado, J.; Sanchez, M. L.; Villa, J. Acc. Chem. Res. 2002, 35, 341-349.
(14) McQuarrie, D. A. Statistical Mechanics; University Science Books: Sausalito, CA, 2000.
(15) Wu, C. H.; Huang, H.; Arminski, L.; Castro-Alvear, J.; Chen, Y.; Hu, Z. Z.; Ledley, R. S.; Lewis, K. C.; Mewes, H. H.; Orcutt, B. C.; Suzek, B. E.; Tsugita, A.; Vinayaka, C. R.; Yeh, L. S.; Zhang, J.; Barker, W. C. Nucleic Acids Res. 2002, 30, 35-37. The URL for the PIR website is http://pir.georgetown.edu.
(16) Thompson, J. D.; Higgins, D. G.; Gibson, T. J. Nucleic Acids Res. 1994, 22, 4673-4680. The CLUSTALW alignment was carried out through the PIR website.
(17) Bahnson, B. J.; Park, D.-H.; Kim, K.; Plapp, B. V.; Klinman, J. P. Biochemistry 1993, 32, 5503-5507.
(18) Bahnson, B. J.; Colby, T. D.; Chin, J. K.; Goldstein, B. M.; Klinman, J. P. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 12797-12802.
(19) Chin, J. K.; Klinman, J. P. Biochemistry 2000, 39, 1278-1284.
(20) Rubach, J. K.; Ramaswamy, S.; Plapp, B. V. Biochemistry 2001, 40, 12686-12694.
(21) Rubach, J. K.; Plapp, B. V. Biochemistry 2003, 42(10), 2907-2915.