Introduction to Protein Structure through Genetic Diseases

In the Classroom

Tanya L. Schneider and Brian R. Linton*,†
Department of Chemistry, Bowdoin College, Brunswick, ME 04011; *[email protected]

A molecular understanding of protein biochemistry is integral to an undergraduate chemical education. Protein sequence determination and analysis of mutations illustrate which components of the protein are required for native function. Thermodynamic stability and kinetic efficiency indicate how well a protein folds and functions. Measurement of the strength of interactions with substrates and other proteins elucidates partners in activity. Determination of high-resolution protein structure is another piece of the puzzle, allowing us to use a computer interface to visualize protein folding patterns as well as specific atomic interactions with other molecules. High-resolution structures themselves are not sufficient to explain protein activity, but when students can envision the structural elements and chemically active residues comprising a protein, they gain a more complete understanding of the molecular events responsible for protein function. With increased availability of protein structures and free online software, it is now straightforward to include such a study in the classroom or laboratory. While there are many ways to engage students in the structural analysis of proteins, one option is to focus on an exploration of genetic diseases. Any protein could be used in theory, but studying genetic diseases has multiple benefits over previously employed examples. Many students are quite interested in general topics relating to medicine, but they are less comfortable initially addressing the molecules involved. Linking student interest in well-known genetic conditions with a molecular understanding of protein structure provides an assignment that maximizes student engagement, while strengthening a molecular understanding of protein biochemistry. 
Additional benefits of this approach include the choice of high-resolution structures that illustrate higher-order structural motifs such as the coiled-coil, Greek-key, and TIM barrel, and the selection of genetic conditions that will promote a review of the corresponding metabolic transformations.

†Current address: Department of Chemistry, College of the Holy Cross, Worcester, MA 01610.

The Assignment

The primary goal of this approach was to empower the students to use the Protein Data Bank and to manipulate high-resolution protein structures using DeepView (1). This was done early in the semester, during an advanced seminar on medicinal chemistry, where a variety of medical events are explored but each effect must be addressed at the molecular level. A week before the genetic diseases were discussed, students were given a handout that contained the necessary details for the discussion. This included a list of genetic diseases from which students could choose a disease of interest, as well as descriptions of both the Protein Data Bank (2) and DeepView. There was a short, in-class discussion and demonstration of the RCSB Web site and a sample protein in DeepView. In the ensuing week, each student was responsible for exploring the protein responsible for an individual condition, preparing a short handout for the class and a longer written submission for evaluation, and ultimately making an oral presentation of roughly ten minutes to the entire class, typically using PowerPoint and incorporating DeepView.

Each presentation began with a brief introduction to the genetic disease and the relevant metabolic transformation or cellular interaction that is affected. Students were asked to point out structural features of their protein, such as interesting secondary, tertiary, or quaternary structures or the presence of cofactors, metals, or substrates in their structure. Then they were asked to focus on the particular aspects of the structure that helped to explain or clarify the observed phenotype or disease state: What were the common mutations and what biochemical techniques were used to elucidate these results? Did these mutations disrupt active site interactions with substrates, protein folding, or interactions between proteins? Were there therapies that could be used to combat a genetic disease, and how did they relate to the protein structure? All of these topics could be addressed from the single literature article the students were given and the protein structure they were exploring. Students could easily generate pictures of proteins that indicated where the common mutations occur, and many articles contained analyses of the effect that mutations might have upon protein activity. This invariably led to discussions between the presenter and the audience as to how mutations change protein structure and ultimately cause disease. In all cases, real-time manipulation of DeepView files was required in the presentations in addition to any static pictures.
It is quite easy for presenters to find static pictures that exist on various Web sites, but they learn more about the proteins by delving into the three-dimensional representations. The audience was given handouts that contained static pictures, but the availability of three-dimensional representations made it easier to consider the entire protein and to answer questions by locating the exact residue in question and using the software to zoom in. Students were encouraged to find additional articles to provide background for medical conditions or additional mutations, and many drew upon the literature to support their explorations. After the students had become comfortable with high-resolution protein structures at the beginning of the course, they were prepared to be more independent in their exploration of protein structures that include interactions of drugs and their protein targets. While this approach has proved successful in this introductory scope, an instructor with deeper expectations could assign students only a specific genetic condition, leaving the students responsible for gathering articles from the literature and choosing structural files that best illustrate the connection between protein structure and genetic disease.

Journal of Chemical Education  •  Vol. 85  No. 5  May 2008  •  www.JCE.DivCHED.org  •  © Division of Chemical Education 


Accessing High-Resolution Structures

Any exploration must begin with an understanding of the Protein Data Bank structural files and the requisite software. High-resolution protein structures are easily accessed at a Web site maintained by the Research Collaboratory for Structural Bioinformatics (2) and can be downloaded to any computer platform. The database can be searched by protein name or by keyword, providing access to specific enzymes of interest, or through the names of diseases. Each protein structure is given a unique four-character code, and the Web site also contains helpful references to the articles that document how each structure was determined.

There are many platforms from which to explore protein structure, ranging from free, widely available software to costly commercial packages. For this assignment, students were directed to use DeepView (also known as Swiss PDB Viewer), which can be downloaded free of charge over the Internet (1). A discussion of the DeepView software can be found in a previous article (3), and if more information is desired, an online tutorial exists that students have found quite useful (4). In addition to its low cost and ease of use, DeepView has a control panel window that can be used to control the presentation of each amino acid, including ribbon and space-filling representations, permitting the student to explore visually the role of single amino acids in the larger context of the protein. This awareness of how individual amino acids may be responsible for a variety of medical conditions is crucial to understanding genetic diseases. Typically it is useful to display the complete protein in a ribbon format to illustrate the overall protein structure most clearly. Individual amino acids of interest or bound substrates can be highlighted using a wireframe or space-filling format in user-defined colors.
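For instructors who would rather hand students direct download links than have them search the Web site, the short Python sketch below checks that a code has the Protein Data Bank's four-character form (codes such as 2PAH mix digits and letters) and builds a download address from it. The helper names are hypothetical, and the URL pattern is an assumption about the RCSB file service rather than something taken from this article, so it should be checked against the current site before use.

```python
import re

# A PDB identifier has four characters: a digit 1-9 followed by three
# letters or digits (for example 2PAH or 1R0X).
_PDB_ID = re.compile(r"[1-9][A-Za-z0-9]{3}")

# Assumed URL pattern for the RCSB file-download service (not from the article).
_DOWNLOAD_TEMPLATE = "https://files.rcsb.org/download/{pdb_id}.pdb"


def normalize_pdb_id(text: str) -> str:
    """Return the upper-case PDB ID, or raise ValueError if it is malformed."""
    candidate = text.strip()
    if not _PDB_ID.fullmatch(candidate):
        raise ValueError(f"not a four-character PDB ID: {text!r}")
    return candidate.upper()


def pdb_download_url(text: str) -> str:
    """Build the (assumed) download address for a coordinate file."""
    return _DOWNLOAD_TEMPLATE.format(pdb_id=normalize_pdb_id(text))


if __name__ == "__main__":
    # Codes taken from Table 1 of this article.
    for code in ["2pah", "1R0X", "1N18"]:
        print(code, "->", pdb_download_url(code))
```

The validation step is deliberately strict so that a mistyped code fails immediately instead of producing a link to a file that does not exist.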
Genetic Diseases

Among the multitude of diseases that result from genetic mutations, only certain examples lend themselves to structural interpretation. Such analysis requires both a detailed understanding of the protein responsible for a condition and a high-resolution protein structure obtained through X-ray crystallography or NMR methods. Unfortunately, even when the protein of interest is known, crystallization or adequate crystal resolution may not be possible, or complexity may preclude successful use of NMR. Fourteen examples are listed in Table 1 for which both high-resolution protein structures and analyses of mutants are available. They have been chosen to be a sampling of medically relevant conditions and also to highlight various aspects of protein structure. Residues implicated in disease are noted in each cited article even when mutated protein structures are not reported. While a complete discussion of the genetic conditions is beyond the scope of this article, features and important molecular lessons of a single case study, phenylketonuria, are discussed below, while the others are addressed briefly in the online supplement.

Table 1. Genetic Diseases Explored in this Study Along with Their PDB Reference Codes

Genetic Disease                      PDB Code      Reference
Cystic fibrosis                      1R0X          5
Fructose intolerance                 1XDL          6
Galactosemia                         1I3K          7
Gaucher’s disease                    1Y7V          8
Glutathione synthetase deficiency    2HGS          9
Hemophilia                           1D7P          10
Hereditary elliptocytosis            2SPC          11
Lou Gehrig’s disease                 1N18, 1N19    12
Maple syrup urine disease            1DTW          13
Methylmalonic aciduria               1NOG          14
Phenylketonuria                      2PAH          15
Porphyria                            1URO          16
Sickle-cell anemia                   2HBS          17
Tay–Sachs disease                    1NOU          18

Phenylketonuria (PKU) is an autosomal recessive metabolic disorder affecting amino acid processing owing to mutations in the enzyme phenylalanine hydroxylase (PAH). PAH catalyzes the conversion of L-phenylalanine to L-tyrosine and is the main catabolic mechanism for phenylalanine removal. Mutations in PAH result in reduced or complete inability to process phenylalanine, leading to a toxic buildup of phenylalanine and ultimately varying degrees of mental retardation in untreated patients.

The example chosen for phenylketonuria (15) uses a truncated wild-type structure of phenylalanine hydroxylase to map the location of known mutations, particularly those implicated in the disruption of oligomerization domains. Crystallization of the full-length protein has been difficult, but diffraction-quality crystals were obtained by proteolytic removal of the N-terminal regulatory domain, leaving both an active catalytic domain and the tetramerization domain. At this point a complete picture of the full-length protein is only possible through a composite model that takes the regulatory/catalytic structure along with the catalytic/tetramerization structure and superimposes their catalytic domains (19). It is important for students to realize that there are limitations to our ability to obtain high-resolution structures, and any differences from the wild-type protein must be considered as part of their analysis.

Figure 1 shows a sample analysis of a truncated phenylalanine hydroxylase, highlighting the active site, the tetramerization interface, and relevant mutations that lead to disease. The crystal structure determination revealed a tetramer, which is thought to be the active form of the enzyme, but this structural file shows two protein subunits. Two of these symmetry-related dimers can be combined to form the complete tetramer (dimer of dimers). This is a good example of the importance of correlating structural coordinates with other biochemical data. It is always possible that oligomerization states found in the crystal structure are an artifact of crystal packing rather than solution behavior or, as in this case, that the structure is reported as the simplest unique unit despite known oligomerization.

Figure 1. The dimeric crystal unit of phenylalanine hydroxylase (15). Coordination of iron (black) by H285, H290, and E330 (green) in the enzyme catalytic domain is highlighted. The splicing mutation most common among Caucasians results in deletion of 52 C-terminal amino acids involved in oligomerization (light blue). Highly conserved R408 (dark blue) is important for connecting the oligomerization domain with the enzyme core via multiple hydrogen bonds. Additional amino acids implicated in common mutations resulting in severe PKU owing to low levels of enzyme are shown in red (R252W, A259L/T, F299C, and L311P). R413P, a frequent and severe mutation in the Asian population, is identified in pink. (This figure is available in color in the table of contents on page 597 and in the online PDF version of this article.)

Visual representation of the protein began by using the control panel window to turn on the ribbon display and color it gray for the entire protein, while turning off the entire wireframe of backbone and sidechains. To produce a more polished representation, the “Render in solid 3D” option found in the “View” menu was used. Residues 118–408 comprise the catalytic domain, which contains thirteen alpha helices and eight beta strands. To indicate the active site, the non-heme irons present in each catalytic center were displayed as van der Waals (vdw) surfaces and colored black. The iron-coordinating histidines and glutamate (H285, H290, and E330) were displayed as green wireframes. The much smaller tetramerization domain continues to the C-terminus and contains two beta strands and one alpha helix. This helix forms an easily recognized coiled-coil quaternary structure with other protein subunits.

There are over 500 known mutations that lead to phenylketonuria (19, 20). While a more detailed inquiry could focus on the complete depth of this knowledge, this example contains only the common mutations affecting protein folding that were considered in the original article. This reflects the limited scope of the assignment, in which the main emphasis is for the student to learn how to use and interpret protein structures; the aim is to highlight a few common mutations rather than to provide a complete review of phenylketonuria.

Many mutations disrupt the interface between the catalytic and tetramerization domains and lead to a severe form of phenylketonuria. The most prevalent splicing mutation in Caucasians results in expression of a protein lacking the C-terminal 52 amino acids (blue ribbon). This deletion eliminates PAH activity in the cell owing to protein instability. Another indicator of the importance of the tetramerization domain is the R408W mutant, which also causes severe PKU. To highlight this residue, the wireframe of both backbone and sidechain was turned on and the residue colored dark blue. This residue is located in the hinge that connects the tetramerization and catalytic domains, and this mutation also results in reduced protein stability in vivo. The arginine sidechain is buried and contacts the catalytic domain, suggesting that it plays a role in stabilizing domain swapping or in anchoring the tetramerization domain in the correct orientation. The similar R408Q mutation permits protein tetramerization and leads to the less severe, non-PKU hyperphenylalaninemia. Mutations in the residues that interact with arginine-408 have also been found to lead to severe PKU. Mutations R252W, A259L/T, F299C, and L311P result in extremely reduced levels of PAH and are shown in red. In the Asian population, R413P (pink) is a common mutation, again resulting in severe PKU. In this case, it is not the loss of the sidechain but rather the introduction of a proline backbone that likely disrupts the orientation of the tetramerization domain. A nearby mutation, Y414C, results in a protein with approximately 50% activity and causes a mild form of PKU. Presumably a change at this position does not drastically alter the protein folding.

Existing molecular treatment of phenylketonuria relies on dietary manipulation. Adherence to an L-phenylalanine-restricted diet is one possible treatment, limiting the buildup of phenylalanine to sub-toxic levels. A recent report (21) suggests that increased intake of the tetrahydrobiopterin cofactor can lead to a normalization of phenylalanine concentrations that may aid patients with certain PKU genotypes.
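The mutation labels used above (for example R408W, or A259L/T for two alternative replacements at one position) follow a compact wild-type/position/mutant convention that students sometimes find opaque. As a minimal sketch, the Python below splits such labels into their parts and assigns each position to a PAH domain. The function names are hypothetical; the 118–408 catalytic range is the one quoted in the text, while treating everything after residue 408 as the tetramerization domain and everything before 118 as the regulatory domain is a simplifying assumption made here for illustration.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the original residue
    position: int   # residue number in the protein sequence
    mutant: str     # one-letter code of the replacement residue


# Matches labels such as "R408W" or "A259L/T" (alternatives separated by "/").
_LABEL = re.compile(r"([A-Z])(\d+)([A-Z](?:/[A-Z])*)")


def parse_mutation(label: str) -> List[Substitution]:
    """Split a label into one Substitution per alternative mutant residue."""
    match = _LABEL.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    wild_type, position, mutants = match.groups()
    return [Substitution(wild_type, int(position), m) for m in mutants.split("/")]


def pah_domain(position: int) -> str:
    """Assign a residue number to a PAH domain (boundaries partly assumed)."""
    if position < 1:
        raise ValueError("residue numbers start at 1")
    if position < 118:
        return "regulatory"        # assumption: N-terminal region before 118
    if position <= 408:
        return "catalytic"         # range 118-408 as stated in the text
    return "tetramerization"       # assumption: everything after 408


if __name__ == "__main__":
    labels = ["R252W", "A259L/T", "F299C", "L311P", "R408W", "R408Q", "R413P", "Y414C"]
    for label in labels:
        for sub in parse_mutation(label):
            print(f"{label:8s} {sub.wild_type}->{sub.mutant} at {sub.position}: "
                  f"{pah_domain(sub.position)}")
```

Note that a hard numeric boundary places R408 in the catalytic domain, whereas the discussion above describes it as part of the hinge between the two domains; the simple lookup is a starting point for class discussion, not a substitute for inspecting the structure.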


Conclusions

Investigating genetic diseases is an ideal venue for studying the role of protein structure in complex biological processes. Students learn how to use freely available software to manipulate high-resolution protein structures in real time, rather than relying on static representations. The downloading and use of both protein coordinate (pdb) files and visualization software is straightforward, and students can quickly begin to investigate their proteins. Once they are comfortable with the process, students are often self-motivated to explore other structures of interest in the Protein Data Bank. In comparison with previous assignments regarding protein structures, the application to genetic diseases resulted in an increased level of student interest and therefore a greater immersion in the material. The ability to examine high-resolution structures permits exploration of the individual mutations known to cause diseases. Mutations in active site residues can render an enzyme inactive, but surrounding residues can also affect the organization of the active site. Some mutations lead to protein misfolding, which can reduce enzyme activity, alter cellular trafficking of the protein on the way to its destination, or lead to protein degradation. Several mutations affect protein–protein interactions, reducing enzyme activity or weakening structures needed for cellular function. It is also important for students to realize that there are various levels of disease states caused by differing severity of the mutation. Some changes lead to only a slight diminution of enzyme efficiency and may be devoid of phenotype. Others may be so severe that they preclude prenatal development. The mutations that lead to well-characterized diseases often result in intermediate effects.
All of these aspects make this exercise useful for illustrating how our growing knowledge of molecular structure enhances our understanding of the nature of various diseases and how best to diagnose and treat those diseases.

Acknowledgments

The authors would like to thank the Bowdoin College students who have been a part of Chemistry 360, Molecular Medicine, for their input regarding this assignment.

Literature Cited

1. (a) DeepView–Swiss PDB Viewer Home Page. http://www.expasy.org/spdbv (accessed Jan 2008). (b) Guex, N.; Peitsch, M. C. Electrophoresis 1997, 18, 2714–2723.
2. (a) RCSB Protein Data Bank. http://www.rcsb.org (accessed Jan 2008). (b) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. Nucleic Acids Res. 2000, 28, 235–242.
3. Ship, N. J.; Zamble, D. B. J. Chem. Educ. 2005, 82, 1805–1808.
4. DeepView Tutorial. http://www.usm.maine.edu/~rhodes/SPVTut/ (accessed Jan 2008).
5. Lewis, H. A.; Buchanan, S. G.; Burley, S. K.; Conners, K.; Dickey, M.; Dorwart, M.; Fowler, R.; Gao, X.; Guggino, W. B.; Hendrickson, W. A.; Hunt, J. F.; Kearins, M. C.; Lorimer, D.; Maloney, P. C.; Post, K. W.; Rajashankar, K. R.; Rutter, M. E.; Sauder, J. M.; Shriver, S.; Thibodeau, P. H.; Thomas, P. J.; Zhang, M.; Zhao, X.; Emtage, S. EMBO J. 2004, 23, 282–293.
6. Malay, A. D.; Allen, K. N.; Tolan, D. R. J. Mol. Biol. 2005, 347, 135–144.
7. Thoden, J. B.; Wohlers, T. M.; Fridovich-Keil, J. L.; Holden, H. M. J. Biol. Chem. 2001, 276, 20617–20623.
8. Dvir, H.; Harel, M.; McCarthy, A. A.; Toker, L.; Silman, I.; Futerman, A. H.; Sussman, J. L. EMBO Rep. 2003, 4, 704–709.
9. Polekhina, G.; Board, P. G.; Gali, R. R.; Rossjohn, J.; Parker, M. W. EMBO J. 1999, 18, 3204–3213.
10. Pratt, K. P.; Shen, B. W.; Takeshima, K.; Davie, E. W.; Fujikawa, K.; Stoddard, B. L. Nature 1999, 402, 439–442.
11. Yan, Y.; Winograd, E.; Viel, A.; Cronin, T.; Harrison, S. C.; Branton, D. Science 1993, 262, 2027–2030.
12. Cardoso, R. M. F.; Thayer, M. M.; DiDonato, M.; Lo, T. P.; Bruns, C. K.; Getzoff, E. D.; Tainer, J. A. J. Mol. Biol. 2002, 324, 247–256.
13. Ævarsson, A.; Chuang, J. L.; Wynn, R. M.; Turley, S.; Chuang, D. T.; Hol, W. G. J. Structure 2000, 8, 277–291.
14. Saridakis, V.; Yakunin, A.; Xu, X.; Anadakumar, P.; Pennycooke, M.; Gu, J.; Cheung, F.; Lew, J. M.; Sanishvili, R.; Joachimiak, A.; Arrowsmith, C. H.; Christendat, D.; Edwards, A. M. J. Biol. Chem. 2004, 279, 23646–23653.
15. Fusetti, F.; Erlandsen, H.; Flatmark, T.; Stevens, R. C. J. Biol. Chem. 1998, 273, 16962–16967.
16. Whitby, F. G.; Phillips, J. D.; Kushner, J. P.; Hill, C. P. EMBO J. 1998, 17, 2463–2471.
17. Harrington, D. J.; Adachi, K.; Royer, W. E., Jr. J. Mol. Biol. 1997, 272, 398–407.
18. Mark, B. L.; Mahuran, D. J.; Cherney, M. M.; Zhao, D.; Knapp, S.; James, M. N. G. J. Mol. Biol. 2003, 327, 1093–1109.
19. Erlandsen, H.; Stevens, R. C. Mol. Genet. Metab. 1999, 68, 103–125.
20. An ever-expanding database of phenylketonuria mutations can be found at the Phenylalanine Hydroxylase Locus Knowledgebase at http://www.pahdb.mcgill.ca (accessed Jan 2008).
21. Erlandsen, H.; Pey, A. L.; Gámez, A.; Pérez, B.; Desviat, L. R.; Aguado, C.; Kock, R.; Surendran, S.; Tyring, S.; Matalon, R.; Scriver, C. R.; Ugarte, M.; Martínez, A.; Stevens, R. C. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 16903–16908.

Supporting JCE Online Material

http://www.jce.divched.org/Journal/Issues/2008/May/abs662.html
Abstract and keywords
Full text (PDF)
Links to cited URLs and JCE articles
Color figures
Supplement: A brief description of each disease is given to highlight the relevant aspects of protein structure
