Information • Textbooks • Media • Resources
Computer Bulletin Board
A Computer-Assisted Tutorial on Protein Structure
C. Stan Tsai
Department of Chemistry and Institute of Biochemistry, Carleton University, Ottawa, ON K1S 5B6, Canada; [email protected]
The three-dimensional structure of a protein is critical to its function in biological systems. The availability of an increasing number of protein structures has greatly facilitated the teaching of protein chemistry. All biochemistry textbooks display selected 3-D illustrations of protein structures. The structural data of proteins and other biomacromolecules are maintained by the Protein Data Bank (PDB) (1), which can be accessed from http://www.rcsb.org/pdb or other mirror sites such as Entrez, http://www.ncbi.nlm.nih.gov/Entrez. This report describes classroom applications of a freeware program, WPDB (2, 3), that compresses the structure files of the PDB into a set of indexed files that can be retrieved, manipulated, and analyzed locally. WPDB is a portable nonproprietary system that can be downloaded from http://www.sdsc.edu/pb/wpdb. Its CD (version 2.2) containing 6097 structural entries can be requested by sending email to [email protected].
WPDB is useful for preparing tutorials, slide shows, class material (lecture notes and Web displays), and assignments. Diskettes containing subsets of the protein structural database, grouped according to lecture topics, can be given to students for use as self-paced tutorials, which makes WPDB an effective way to teach students the structure and function of proteins.
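The idea of a locally searchable set of structure files can be illustrated with a minimal Python sketch. It is not WPDB code and does not reproduce WPDB's compressed index format; it only assumes that each PDB-format file carries its compound name in COMPND records, with the free text starting at column 11. The function names (`compound_name`, `build_index`, `query`) and the sample file contents are hypothetical.

```python
def compound_name(pdb_text):
    """Return the compound name from the COMPND records of a PDB-format file."""
    parts = []
    for line in pdb_text.splitlines():
        if line.startswith("COMPND"):
            # Columns 1-6 hold the record name and columns 8-10 an optional
            # continuation number, so the free text starts at column 11.
            parts.append(line[10:].strip())
    return " ".join(parts)


def build_index(files):
    """Map each entry ID to its upper-cased compound name."""
    return {entry_id: compound_name(text).upper() for entry_id, text in files.items()}


def query(index, term):
    """Return the sorted entry IDs whose compound name contains the term."""
    term = term.upper()
    return sorted(entry_id for entry_id, name in index.items() if term in name)


# Hypothetical file contents keyed by entry ID. The COMPND text is made up
# for illustration and is not copied from the real PDB entries.
SAMPLE_FILES = {
    "5ADH": (
        "HEADER    OXIDOREDUCTASE\n"
        "COMPND    ALCOHOL DEHYDROGENASE\n"
        "COMPND   2 COMPLEX WITH ADP-RIBOSE\n"
    ),
    "6ADH": "COMPND    ALCOHOL DEHYDROGENASE COMPLEX WITH NAD\n",
    "1LZ1": "COMPND    HUMAN LYSOZYME\n",
}

index = build_index(SAMPLE_FILES)
print(query(index, "alcohol dehydrogenase"))
print(query(index, "lysozyme"))
```

A case-insensitive substring match is used here only because it is the simplest way to mimic a search by compound name; how WPDB matches names internally is not described in this report.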
How WPDB Works
WPDB is a Microsoft Windows–based program to interpret the 3-D structures of biomacromolecules found in the Protein Data Bank. The program provides query ability to find structures and for data management and analysis. It performs sequence alignment, property-profile plot, secondary structure assignment, contact analysis, geometric calculation, 3-D rendering, and structure superposition. The 3-D structures can be displayed within the program or by invoking RasMol (4). The accompanying WPDB loader (WPDBL) permits the user to build a database from selected structural files in the PDB format for use by WPDB.
With WPDB, users have the convenience of accessing the entire contents of the PDB database locally. The query menu enables users to search the entire database according to file I.D. or compound name from the query dialog box. The search returns a list of structure files from which the target protein molecules are selected for viewing (Fig. 1). The tools pull-down menu provides options for manipulation and analysis of the selected molecules. The compound option provides the database information, amino acid sequences, and secondary structure assignments (Fig. 1). The profile option illustrates profiles of various residual properties such as volume, polarity, hydrophobicity, and mean exposure (Fig. 2). The align option performs sequence alignment of the selected entities (Fig. 2). The 3D option displays their 3-D structures for identification, manipulation, and superposition (Fig. 3). The contact map option examines residual or atomic contacts between two entities and is very useful for tracing contact residues in protein–ligand interactions. The geometry option analyzes bond length, bond angle, and dihedral angle, which yields the conformational map (Fig. 3).
Figure 1. Query, selection of view structure, and secondary structure assignment. The query dialog box is invoked by choosing the New Query option from the Query menu of the main window. A search of the database for alcohol dehydrogenase returns 25 hits, from which 5ADH is selected as the view structure. Choosing the Compound option from the Tools menu activates the compound dialog box. The amino acid sequence and its secondary structure assignment are given by highlighting 5ADH under Entity and clicking Info in the compound dialog box.
Building Database with WPDBL
The WPDBL provides capability to build a database from a specific subset of structures found in the WPDB or structures from other sources (in the PDB format) for use by
JChemEd.chem.wisc.edu • Vol. 78 No. 6 June 2001 • Journal of Chemical Education
WPDB. To facilitate a query by protein name, the file (pdb*.ent) should include a COMPND line giving the compound name, which is used to search the database. The program produces a set of compressed indexed files from the original PDB files. Thus the protein structure files can be grouped together according to lecture topics into a set of indexed files to be used for tutorials.
Figure 2. Sequence alignment and residual property profile. For the homology analysis of lactate dehydrogenase isozymes, 3LDH (M4 isozyme) and 5LDH (H4 isozyme) are selected for viewing in the main window. The align dialog box is activated by choosing the Align option from the Tools menu. A preliminary alignment with 43% homology is shown by selecting both 3LDH and 5LDH from the Entity entry of the Edit submenu in the align dialog box. Clicking the Align entry of the Edit submenu records 62% homology in the final sequence alignment. The profile of residual properties is visualized by choosing the Profile option from the Tools menu. The residual property is assigned from the Property entry of the Mode submenu, and the number of profiles is determined by choosing either the Show entity entry (one profile of the highlighted structure) or the Show alignment entry (comparison of profiles of two highlighted structures).
Tutorials
Table 1 lists the tutorial topics and protein databases stored on the four diskettes. These tutorials are available online. Each diskette contains readme.txt and one tutorial directory consisting of 20 indexed files (B00.bps to B19.bps), wpdb.exe and raswin.exe. In addition to reading the general handout on “How WPDB Works”, the student is urged to read the text file, readme?.txt, which gives specific instructions for the tutorial on that diskette.
The structural hierarchy of proteins is depicted in the first tutorial, which familiarizes the student with molecular graphics via applications of the 3D option and RasMol. The conformations and structural domains of proteins are illustrated by various receptors (receptor fragments). The structural domains (i.e., all-α, all-β, α/β, and α+β) are exemplified. The student is reminded to compare conformational maps (Phi/Psi plots) from the geometry option for the receptors with varied structural domains. The subunit structure is represented with oligomeric insulin molecules. An interesting example of an all-α human growth hormone complex with its all-β receptor is also included. Structural changes accompanying the activation of chymotrypsinogen to α-chymotrypsin and γ-chymotrypsin are presented.
Figure 3. Three-dimensional illustration of the enzyme–substrate complex and its conformational map. The 3-D view window is invoked by choosing the 3D option from the Tools menu. Each entry from the Edit submenu of the view window provides a dialog box. The 3-D rendering of the structure is selected from the entity dialog box by choosing the Entity entry. To enhance the superposition of two structures in different colors, click the color code (bar) of the view window before highlighting the structure in the entity dialog box. Similarly, click the color code before selecting the substructures/residues in the select residues dialog box in order to illustrate specific substructures/residues in different colors. The example (1LZS) shows the cleavage of chitohexaose substrate into chitotetraose and chitobiose by human lysozyme at the active site. The conformation map of the view structure can be obtained by clicking the options Tools, Geometry, Mode, Dihedral, and Phi/Psi in sequence.
In the second tutorial, the high sequence homology of human alcohol dehydrogenase (ADH) β-isozymes is demonstrated. By contrast, the structural diversities of homologous enzymes such as the lysozymes from different biological origins catalyzing the identical hydrolytic reaction are compared. The sequence homology (align option) and the secondary structure assignment (compound option) are major themes of this tutorial. The formation of enzyme complexes is exemplified by the ADH–coenzyme complexes, which can be viewed by the 3D option and studied with the contact map option.
In the third tutorial, the mechanistic information for γ-chymotrypsin, human lysozyme, and papain can be inferred from enzyme complexes with substrate analogs, products, and
active-site modifications. This tutorial emphasizes the application of the contact map option to identify active site residues and contact residues in protein–ligand interactions. Carbonic anhydrase is selected to show a metal coordination site and the structural effect of metal–ion replacement in the metalloenzyme.
The fourth tutorial illustrates structural attributes with regard to enzyme stereospecificity and enzyme allosterism. The former is represented by D- versus L-lactate dehydrogenase and the latter by the R versus T states of aspartate transcarbamylase. In this tutorial, the molecular superposition of paired structures in the 3D view window is complemented by the identification of contact residues. These allow the drawing of conclusions with respect to structural differences between the D- and L-dehydrogenase complexes, or the structural changes that accompany the allosteric transitions of the two states of aspartate transcarbamylase.
Students are able to sign out the tutorial diskette that contains one or more groups of protein structures corresponding to lecture (tutorial) topics for 3-D viewing, manipulation, and analysis with WPDB. This self-paced tutorial using WPDB is a very effective way for students to comprehend protein structures. An earlier version of the tutorial was used in the general biochemistry class and the current version is used in the protein chemistry and bioinformatics classes with favorable responses. Students report that the tutorial is most helpful in improving their perception and understanding of protein structures.
Table 1. Tutorial Topics, Proteins, and Databases
Topic                          Protein         PDB File     Description
Tutorial No. 1
Structural hierarchy, domains  Receptor        1GDC         Glucocorticoid receptor
                                               1GLU         Glucocorticoid receptor DNA
                                               1IRK         Insulin receptor
                                               1HRA         Retinoic acid receptor
                                               1LUT         Lutropin receptor fragment
                                               1XUM         Thyrotropin receptor
                                               1XUN         Follitropin receptor
                                               2ASR         Aspartate receptor
                               Hormone         1HGU         Human growth hormone
                                               3HHR         Growth hormone receptor
Polymerization, subunits       Insulin         9INS         Insulin dimer
                                               4INS         Insulin tetramer
                                               1WAV         Insulin dodecamer
Zymogen activation             Chymotrypsin    1CHG         Chymotrypsinogen A
                                               2GCH         γ-Chymotrypsin A
                                               4CHA         α-Chymotrypsin A
Tutorial No. 2
Homologous enzymes             Lysozymes       1GBS         Australian swan EW lysozyme
                                               1LYD         T4 phage lysozyme
                                               1LYZ         Hen EW lysozyme
                                               1LZY         Turkey EW lysozyme
                                               2IHL         Japanese quail EW lysozyme
Isozymes                       Alcohol DH      1HDX         Human β1 isozyme
                                               1HDY         Human β2 isozyme
                                               1HTB         Human β3 isozyme
Enzyme complexes               Alcohol DH      5ADH         Horse ADH–ADP–ribose
                                               6ADH         Horse ADH–NAD
                                               8ADH         Apo-horse ADH
                                               2OHX         Horse ADH–NADH
Tutorial No. 3
Reaction mechanism inferred    γ-Chymotrypsin  1GMC         γ-Chymotrypsin
                                               2GMT         Alkylated with AAF-Et-ketone
                                               8GCH         Complex with GAY
                               Lysozyme        1LZ1         Human lysozyme
                                               1LZR         Complex with (NAG)4
                                               1LZS         Complex with oligo-NAG
                               Papain          1PAD         AAFMeA–Cys25 derivative
                                               1PIP         QVVA–anilide complex
                                               1POP         LLR complex
                                               9PAP         Cys25 oxidized enzyme
Metalloenzyme                  Carbonic AH     1CA2         Carbonic anhydrase, native Zn
                                               1RZA         Carbonic anhydrase, Co
                                               1RZC         Carbonic anhydrase, Cu
                                               1RZD         Carbonic anhydrase, Mn
                                               1RZE         Carbonic anhydrase, Ni
Tutorial No. 4
Stereospecificity              Lactate DH      1LDB, 1LDX,  D-Lactate dehydrogenase; L-Lactate dehydrogenase;
                                               1DLD, 1LDN   D-LDH–NAD–pyruvate; L-LDH–NAD–oxamate–FDP
Allosterism                    Lactate DH      1LTH         1:1 mixture, R- and T-LDH
                               AspTC           2ATC         Apo-Asp. transcarbamylase
                                               1AT1         R state with PAA, MA
                                               3AT1         T state with PAA
                                               4AT1         R state with ATP
                                               7AT1         T state with ATP, PAA, MA
Abbreviations: ADH, alcohol dehydrogenase; AH, anhydrase; DH, dehydrogenase; EW, egg white; FDP, D-fructose-1,6-bisphosphate; LDH, lactate dehydrogenase; NAG, N-acetylglucosamine; MA, malonate; PAA, phosphonoacetamide; TC, transcarbamylase; and single letters for amino acids in the peptides.
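The conformational maps (Phi/Psi plots) that students compare in the tutorials rest on the dihedral angle of four consecutive backbone atoms: phi is defined by C(i-1), N(i), CA(i), C(i) and psi by N(i), CA(i), C(i), N(i+1). The following Python sketch shows one common way to compute these angles from atomic coordinates. It is an illustration of the standard geometry, not WPDB's own routine, and it assumes the residues are consecutive in a single unbroken chain. The function names and the sample coordinates are hypothetical; the coordinates were chosen to give simple angles and do not represent real protein geometry.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the two outer bonds onto the plane perpendicular to the
    # central bond, then measure the signed angle between the projections.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi_psi(residues):
    """Return one (phi, psi) pair per residue; chain ends get None."""
    angles = []
    for i, res in enumerate(residues):
        phi = psi = None
        if i > 0:
            phi = dihedral(residues[i - 1]["C"], res["N"], res["CA"], res["C"])
        if i < len(residues) - 1:
            psi = dihedral(res["N"], res["CA"], res["C"], residues[i + 1]["N"])
        angles.append((phi, psi))
    return angles


# Made-up backbone coordinates for three residues (illustration only).
backbone = [
    {"N": (2, 1, 0), "CA": (1, 1, 0), "C": (1, 0, 0)},
    {"N": (0, 0, 0), "CA": (0, 0, 1), "C": (0, 1, 1)},
    {"N": (1, 1, 1), "CA": (1, 2, 1), "C": (1, 2, 2)},
]

for number, (phi, psi) in enumerate(phi_psi(backbone), start=1):
    print(number, phi, psi)
```

Plotting psi against phi for every residue of a structure gives the kind of map that the geometry option displays.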
Acknowledgments
I thank I. N. Shindyalov and P. E. Bourne for developing WPDB and making the program available for free distribution. The administrative office of the San Diego Supercomputer Center is acknowledged for providing a CD of the program. The contributions of numerous investigators who elucidated the 3-D structures of proteins and deposited their PDBs used in the preparation of these tutorials are greatly appreciated.
Supplemental Material
The tutorials are available in this issue of JCE Online.
Literature Cited
1. Bernstein, F. C.; Koetzle, T. F.; Williams, G. J. B.; Meyer, E. F., Jr.; Brice, M. D.; Rodgers, J. R.; Kennard, O.; Shimanouchi, T.; Tasumi, M. J. Mol. Biol. 1977, 112, 535.
2. Shindyalov, I. N.; Bourne, P. E. J. Appl. Crystallogr. 1995, 28, 847.
3. Shindyalov, I. N.; Bourne, P. E. CABIOS 1997, 13, 487.
4. Sayle, R. A.; Milner-White, E. J. Trends Biochem. Sci. 1995, 20, 374.