Symposium on Teaching Crystallography

Introduction to the Crystallographic Literature: A Course for the Nonspecialist

Barry M. Goldstein
Department of Biophysics, University of Rochester Medical Center, Rochester, NY 14642

Molecular structures derived from X-ray diffraction experiments have become a ubiquitous part of the information presented to contemporary scientists. Biochemists, pharmacologists, medicinal chemists, and other workers in a wide variety of fields now make routine use of structural results. Unfortunately, interest in crystal structures may not be accompanied by an understanding of the methods by which such structures are obtained, nor of their limitations.

Crystallographic results are most commonly presented in the literature by drawings of molecules derived from atomic coordinates. These illustrations provide concise and accessible information to the nonspecialist confronted with a structure paper. Nevertheless, such figures provide little information as to how the structures they represent were obtained. For example, an illustration alone may not allow the casual reader to distinguish between structures derived from crystallographic experiments, modeling efforts, or a combination of techniques.

Modern computer graphics have filled the literature with sophisticated molecular representations that are compelling to any reader. This is coupled with a perception among noncrystallographers that the details of a crystallographic experiment are technically inaccessible. The combination of these factors tempts the uninitiated to approach a structure paper with the attitude that "seeing is believing". A lack of familiarity with the experimental methodology may lead to fanciful interpretations of the structure that go far beyond any intended by the manuscript's authors. An example comes to mind of a student's interest in α-carbon bond-length variations in a series of unrefined protein structures. This attempt to extract nonexistent structural information arose from a lack of understanding of crystallographic methodology and its limitations.

The attitudes described above are those encountered in a basic science department of an academic medical center. However, they are probably not limited to this setting. An attempt to address these problems has motivated the development of a course entitled "Introduction to the Crystallographic Literature".

Course Goals and Organization

The purpose of this course is to allow the nonspecialist to access the structure literature intelligently. The student is trained to evaluate critically the same issues one would address in any published scientific material, specifically:

I. The data
   A. How was it obtained?
   B. How reliable is it?
   C. How is it presented? i.e.
      1. What do I know that I did not before?
      2. What is missing or deposited elsewhere?
II. The interpretation of the data
   A. Distinguishing the data from the model (particularly in macromolecular structures)
   B. Assessment of the model
      1. Is it a reasonable development from the data?
      2. Is it testable? If so, how? If not, why am I reading this paper?

Journal of Chemical Education

The two-credit semester-length course meets for one 2-hour session per week. Both conventional and seminar formats are used. Approximately half of the sessions (about 11 hours of class time) consist of regular didactic lectures by the instructor. Following this, papers from the crystallographic literature are presented by each student in turn and discussed in terms of the criteria listed above. The papers are in general chosen by the instructor to illustrate specific methodologies and to provide a broad survey of the literature. However, these are tailored to specific interests of the group. Examples will be given below.

Requirements for the course are one semester of basic biochemistry and two semesters of calculus. This makes the course accessible to most graduate students in pharmacology, biochemistry, and biophysics in their second semester and above, the intended audience.

Space limitations preclude a complete description of the material covered in the didactic sessions. An outline of this material is included in the appendix. It should be emphasized that this is not a course in crystallography per se. Many of the subjects that are only touched upon could serve by themselves as topics for full-semester courses. Mathematics is kept to a minimum. With some exceptions, the meanings of key equations are explained with little attempt at derivation. As the title implies, the material is introductory, roughly at or below the level of Cantor and Schimmel's Biophysical Chemistry.¹ This text, as well as standard crystallographic texts, is kept on reserve. However, teaching is primarily from the instructor's notes, making liberal use of handouts and colored overhead slides. A sample slide (see figure) illustrates both the level and spirit of the didactic material.

There is no laboratory associated with the course, although an occasional precession photograph or Fourier map is interpreted. The course at present is graded on a pass-fail basis. These are shameless stratagems designed to lure a population of students who would otherwise never consent to formal exposure to this material. Nevertheless, a certain foundation of crystallographic material must be laid if the stated goals are to be achieved.

Seminar Session Examples

No attempt is made to provide a complete list of papers covered during the course. Any such list is bound to depend upon the interests of the group, the prejudices of the instructor, and progress in the field. Examples of manuscripts used in three recent seminar sessions are given below. The references for each session are listed, followed by a description of the contents. Papers (and occasionally other media) of both historical and contemporary interest are used. A rationale for each set of choices is described.

Based on a talk given at the Symposium on Crystallographic Education, The American Crystallographic Association Annual Meeting, Austin, Texas, March 17, 1987.
¹ Cantor, C. R.; Schimmel, P. R. Biophysical Chemistry; W. H. Freeman: San Francisco, 1980; Vol. 2, Chapter 13.

[Figure: sample slide titled "To Get The Answer, You Need The Answer", with the "initial guess" marked as the entry point to the cycle.]

Sample slide employed during the didactic section of the course. The slide introduces the two-dimensional Fourier synthesis, ρ(x, y), and illustrates the phase problem in this context. The "answer" refers to an interpretable electron density map, from which atomic coordinates are derived. However, in order to obtain this map, one must know something of the phases, hence of the coordinates themselves. The problem is solved via an initial "guess", which initiates a "bootstrap" cycle. Subsequent sessions deal with methods of obtaining this guess, i.e., with structure solution. Highlighted symbols represent unknowns. X_j and Y_j are coordinates of the jth atom. F_hk is the structure factor associated with the (hk) reflection. |F_hk| is the structure factor amplitude and α_hk its phase.
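The circularity described in the slide caption can be made concrete with a small numerical sketch. The code below is not taken from the course material. It assumes a hypothetical two-atom, two-dimensional "structure" with made-up scattering factors and fractional coordinates, computes structure factors F_hk from those coordinates, and then evaluates the Fourier synthesis twice: once with the true phases, and once with the amplitudes alone (all phases set to zero), which is all the diffraction experiment measures directly. The function names, the resolution cutoff `H_MAX`, and the grid size are illustrative choices, and the 1/area scale factor of the synthesis is omitted.

```python
import cmath
import math

def structure_factor(h, k, atoms):
    """F_hk = sum_j f_j exp[2*pi*i(h*X_j + k*Y_j)], atoms given as (f_j, X_j, Y_j)."""
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y)) for f, x, y in atoms)

def density(x, y, coefficients):
    """Fourier synthesis rho(x, y) = sum_hk F_hk exp[-2*pi*i(h*x + k*y)] (unscaled)."""
    total = sum(F * cmath.exp(-2j * math.pi * (h * x + k * y))
                for (h, k), F in coefficients.items())
    return total.real

def peak(coefficients, n=20):
    """Grid point (fractional coordinates) with the highest density on an n x n grid."""
    grid = [(i / n, j / n) for i in range(n) for j in range(n)]
    return max(grid, key=lambda p: density(p[0], p[1], coefficients))

# Hypothetical structure: (scattering factor, X, Y); values are illustrative only.
ATOMS = [(8.0, 0.20, 0.30), (6.0, 0.65, 0.70)]
H_MAX = 6  # assumed resolution cutoff on h and k

indices = [(h, k) for h in range(-H_MAX, H_MAX + 1) for k in range(-H_MAX, H_MAX + 1)]

# "Knowing the answer": amplitudes and phases both follow from the coordinates.
true_F = {(h, k): structure_factor(h, k, ATOMS) for h, k in indices}

# What is measured: amplitudes only. Setting every phase to zero is the
# crudest possible stand-in for the missing phase information.
amplitudes_only = {hk: abs(F) for hk, F in true_F.items()}

peak_true = peak(true_F)           # lands on the heavier atom
peak_zero = peak(amplitudes_only)  # lands on the origin, not on an atom

print("peak with true phases:", peak_true)
print("peak with phases set to zero:", peak_zero)
```

With the true phases the highest peak falls on the heavier atom; with the phases discarded the map peaks at the origin, which corresponds to no atom. That difference is the sense in which one needs the answer to get the answer, and it is why a partial structure is used as the initial guess for the phases.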

Example 1: The Reciprocal Lattice and the Double Helix

1. Crystallization and Preliminary X-ray Study of a Complex between d(ATGCAT) and Actinomycin D. Takusagawa, F.; Goldstein, B. M.; Youngster, S.; Jones, R. A.; Berman, H. M. J. Biol. Chem. 1984, 259, 4714-4715.
2. Watson, J. D. The Double Helix: A Personal Account of the Discovery of the Structure of DNA; Atheneum: New York, 1968; Chapters 23 and 24.
3. A Structure for Deoxyribose Nucleic Acid. Watson, J. D.; Crick, F. H. C. Nature 1953, 171, 737-738.
4. Crystal Structure Analysis of a Complete Turn of B-DNA. Wing, R.; Drew, H.; Takano, T.; Broka, C.; Tanaka, S.; Itakura, K.; Dickerson, R. E. Nature 1980, 287, 755-758.

This set of papers can be covered during a seminar session following the didactic lectures on the reciprocal lattice. The very short paper by Takusagawa et al. is covered first. The authors demonstrate that the antitumor drug Actinomycin D intercalates into a DNA hexamer on the basis of simple packing considerations, the presence of a strong meridional reflection at 3.4 Å, and a very low resolution Patterson map. The paper provides a good illustration of how maximal information can be extracted from primarily geometric data. In particular, detailed discussion of the origins of the 3.4 Å "base-stacking" reflection provides an easily conceptualized example of the link between the reciprocal lattice and the structure.

These concepts lead naturally to those covered in the classic paper by Watson and Crick and the related chapters in The Double Helix. Following a brief discussion of fiber diffraction, the photograph of B-DNA by Rosalind Franklin² is indexed and interpreted in light of Watson's statement:

   Especially important was my insistence that the meridional reflection at 3.4 Å was much stronger than any other reflection. This could only mean that the 3.4 Å-thick purine and pyrimidine bases were stacked on top of each other in a direction perpendicular to the helical axis.³

At this point the students begin to wonder at the ease with which Nobel prizes are awarded to crystallographers. Thus, the final paper by Wing et al. is presented to illustrate the 27-year gap between the proposal of a structure model for B-DNA and its confirmation at atomic resolution.

² Watson, J. D. The Double Helix: A Personal Account of the Discovery of the Structure of DNA; Atheneum: New York, 1968; p 168.
³ Watson, J. D., footnote 2, p 175.

Volume 65 Number 6 June 1988

Example 2: Resolution in Macromolecular Structures: You Can't Always Get What You Want

1. Cytoplasmic Malate Dehydrogenase-Heavy Atom Derivatives and Low Resolution Structure. Tsernoglou, D.; Hill, E.; Banaszak, L. J. J. Mol. Biol. 1972, 69, 75-87.
2. Polypeptide Conformation of Cytoplasmic Malate Dehydrogenase from an Electron Density Map at 3.0 Å Resolution. Hill, E.; Tsernoglou, D.; Webb, L.; Banaszak, L. J. J. Mol. Biol. 1972, 72, 517-591.
3. Conformation of Nicotinamide Adenine Dinucleotide Bound to Cytoplasmic Malate Dehydrogenase. Webb, L. E.; Hill, E. J.; Banaszak, L. J. Biochemistry 1973, 12, 5101-5109.
4. The Presence of a Histidine-Aspartic Acid Pair in the Active Site of 2-Hydroxyacid Dehydrogenases. Birktoft, J. J.; Banaszak, L. J. J. Biol. Chem. 1983, 258, 472-482.

This series of papers provides an excellent illustration of the relationship between resolution and information content in macromolecular crystallography. Practical aspects of data collection and structure solution are also covered.

The first manuscript presents the 5-Å structure of cytoplasmic malate dehydrogenase (MDH). This paper provides extensive details on data collection and the preparation and screening of heavy atom derivatives. The use of difference Patterson maps in heavy atom location is illustrated, as is the application of phase refinement and the various indicators of its success. The 5-Å structure itself provides only the overall dimensions of the enzyme, as well as indicating the presence of two subunits related by approximate two-fold symmetry.

In the paper by Hill et al. the backbone conformation of MDH is traced from a 3-Å Fourier map. A section of this map is reproduced, clearly showing secondary structural features. The paper by Webb et al. describes results from a 2.5-Å map, focusing on the conformation of bound cofactor. This paper illustrates map fitting at the level of the functional group. Bond rotations are performed in model structures with otherwise rigid geometries.

The final paper, by Birktoft and Banaszak, presents results from the partially refined 2.5-Å structure. The discussion focuses on the assignment and geometry of side chains forming the active site. A catalytic mechanism is proposed, based upon the atomic-resolution structure of this region.

The four manuscripts described thus illustrate the escalation of structure interpretation with increasing resolution.
They begin with a description of gross enzyme conformation and finish with the inference of a dynamic process (the catalytic mechanism) from a static structure. These papers form the basis for a class discussion on the level of interpretation appropriate to a given structure. For each manuscript, the student is encouraged to address "three R's": (1) What is the resolution?, (2) Was the structure refined?, and (3) What are the relevant residuals? The purpose of this session is to emphasize what information can, and more importantly, cannot be obtained from a structure at a particular resolution and stage of refinement.

Example 3: Superoxide Dismutase: Seeing Is Not Believing

1. Terms of Entrapment; distributed by Arthur J. Olsen, the Research Institute of Scripps Clinic, La Jolla, CA 92037.
2. Determination and Analysis of the 2 Å Structure of Copper, Zinc Superoxide Dismutase. Tainer, J. A.; Getzoff, E. D.; Beem, K. M.; Richardson, J. S.; Richardson, D. C. J. Mol. Biol. 1982, 160, 181-217.
3. Structure and Mechanism of Copper, Zinc Superoxide Dismutase. Tainer, J. A.; Getzoff, E. D.; Richardson, J. S.; Richardson, D. C. Nature 1983, 306, 284-287.
4. Electrostatic Recognition between Superoxide and Copper, Zinc Superoxide Dismutase. Getzoff, E. D.; Tainer, J. A.; Weiner, P. K.; Kollman, P. A.; Richardson, J. S.; Richardson, D. C. Nature 1983, 306, 287-290.

The purpose of the final session is to emphasize the distinction between data and hypothesis. The session begins with the screening of the computer-animated film "Terms of Entrapment", distributed by Arthur Olsen of the Research Institute of Scripps Clinic. This entertaining and informative film describes the crystal structure of copper, zinc superoxide dismutase, based on the 2-Å structure by Tainer et al. The film also presents a compelling animated interpretation of the putative enzyme mechanism, based upon the electrostatic recognition between the enzyme and the superoxide radical proposed by Getzoff et al. The guidance of the superoxide radical along electrostatic field vectors to the catalytic site is dramatically illustrated.

Following the screening, the papers by Tainer et al. describing the high-resolution crystal structure are discussed. Specifically, the known structural features of the active site channel and the catalytic site itself are enumerated. The manuscript by Getzoff et al. is then presented. This paper details the electrostatic field calculations leading to the proposal that electrostatic forces orient the approach of the superoxide radical to the catalytic site, thus enhancing the rate of catalysis. Approximations involved in performing these calculations are evaluated. These include the absence of solvent, as well as the methods used to obtain partial charges, determine a dielectric constant, and calculate the field. Finally, noncrystallographic methods available to test the model are solicited. In this way, a clear distinction is made between the structural data obtained from the diffraction experiment and the ancillary techniques used to interpret the structure and develop the model. The session ends with a rescreening of the film to emphasize these distinctions.

Summary

"Introduction to the Crystallographic Literature" is a graduate course designed to teach the noncrystallographer how to read a structure paper critically. The course combines formal didactic instruction with student seminar presentations of specific papers from the crystallographic literature. The course is directed at individuals who are interested in the published results of crystallographic experiments but who have little knowledge of the methods employed or their limitations. The specific topics and papers covered are a function of the interests of the class and the background of the instructor. Although teaching methods and course material may vary, the goal remains the same: to train students to access and interpret the literature intelligently.

Acknowledgment

The author thanks the students of RBB 565 for their enthusiasm and patience. Work supported in part by a grant from the James P. Wilmot Foundation.

Appendix: Course Outline

An outline of the course material is given below. This represents the most optimistic list of topics one hopes to cover during both didactic and seminar sessions. Clearly each subject is not given equal time. The choice of topics and the ratio of didactic material to literature coverage are tailored to the background and interests of the class. For example, the material on isomorphous replacement (below) has been covered by a more mathematically oriented student during a seminar session devoted to papers by Blow and Crick⁴ and Dickerson, Kendrew and Strandberg.⁵ In general, material through small molecule refinement is covered in the didactic sessions.


⁴ Blow, D. M.; Crick, F. H. C. Acta Crystallogr. 1959, 12, 794-802.
⁵ Dickerson, R. E.; Kendrew, J. C.; Strandberg, B. E. Acta Crystallogr. 1961, 14, 1188-1195.

I. Introduction
   A. "Why would we want to know the structure of molecules?" A brief potpourri slide show illustrating results from chemical bonding to macromolecular conformation studies.
   B. The goal of the diffraction experiment: coordinates from an electron-density map.
II. Crystals
   A. Interatomic Forces: the glue that binds.
   B. Crystal Composition: small molecule vs. macromolecular
      1. packing
      2. solvent content
   C. Basic Concepts
      1. Unit Cell
      2. Asymmetric Unit
      3. Symmetry Operations
      4. The Bravais Lattices
      5. Relationships between V, Z, and density
   D. Space Groups
      1. Nomenclature
      2. Examples from International Tables, Vol. A.
III. The Reciprocal Lattice (R.L.)
   A. Bragg's Law and Miller Indices
   B. Geometric Construction of the R.L.: planes to points
   C. Properties
      1. Geometric and symmetry relationships between direct and reciprocal lattices
      2. The Intensity-weighted R.L.:
         a. Qualitative relationship between intensity and electron density in a plane
         b. Systematic absences
   D. The Ewald Sphere
      1. Geometric link between Bragg's law and the R.L.
      2. "Resolution," θ and d*.
   E. Geometric Data: Worked example of the determination of lattice constants and space group from precession photographs.
IV. The Diffraction Experiment
   A. Crystal growth: the rate-limiting step
      1. Sources of Material
      2. Techniques: evaporation, vapor diffusion, microdialysis.
   B. X-rays
      1. Sources: conventional, rotating anode, and synchrotron
      2. Mo vs. Cu: resolution, scattering and absorption
      3. Variable wavelength
   C. Data Collection
      1. Detection Methods: film, counter, and array detectors
      2. Practical Considerations:
         a. Least-squares refinement of cell parameters
         b. Space group ambiguities
         c. Unique data sets
         d. Intensity statistics
         e. Time scale of the experiment: small molecule vs. macromolecular
   D. Data Reduction
      1. Cutoffs: "unobserveds"
      2. Averaging
      3. Corrections: LP, decay, absorption
   E. Tour of the facilities
V. The Phase Problem
   A. The Structure Factor (F_hkl)
      1. Mathematical background
         a. Wave properties: amplitude and phase
         b. Wave summation as vector addition in the complex plane
         c. Euler's relation
      2. Geometric Derivation of F_hkl
         a. Scattering from atoms: the scattering factor
         b. Generalization of Bragg's law: phase differences for atoms not at the origin
         c. Scattering from molecules: F_hkl as the sum of waves scattered from cell contents.
      3. Components of F_hkl
         a. Amplitude: relation between |F_hkl| and I_hkl
         b. Phase: relation between α_hkl and atomic coordinates (X_j, Y_j, Z_j)
   B. The Electron Density Equation (Fourier Synthesis)
      1. The "Answer": (X_j, Y_j, Z_j) from the Fourier map
      2. Variables on the right-hand side of the equation
         a. Known: amplitudes, |F_hkl|
         b. Unknown: phases, α_hkl(X_j, Y_j, Z_j)
      3. The Phase Problem: you can get the answer only if you know the answer.
VI. Structure Solution: Small Molecule Style
   A. "Bootstrapping"
      1. The Fourier map computed with measured amplitudes and phases obtained from a partial structure (the "initial guess").
      2. "How good does the guess have to be?": the heavy atom advantage.
   B. Obtaining an Initial Guess
      1. Patterson Methods
         a. Patterson function explained
         b. Harker sections
         c. Worked example of heavy atom location.
      2. Direct Methods
         a. Normalized structure factors
         b. Sign relationships
         c. MULTAN
   C. The Difference Fourier Synthesis
      1. The equation explained
      2. Applications
         a. Location of missing atoms: fragments, solvent, and hydrogens
         b. Criteria for correctness
   D. Thermal Parameters and Disorder
VII. Structure Refinement
   A. The Function Minimized
   B. The Normal Equations
      1. Overdetermination of the problem: data/variables
      2. Solution: full matrix vs. block approximations
      3. Derived quantities: estimated standard deviations (e.s.d.'s)
   C. Weighting schemes
   D. Criteria for convergence
      1. Shifts/e.s.d.'s
      2. R and Rw: The bottom line
VIII. Structure Solution: Macromolecular Style (A Whole Different Ballgame)
   A. Isomorphous Replacement
      1. Heavy atom derivatives: how isomorphous are they?
      2. Multiple dataset collection: practical considerations
      3. The difference Patterson synthesis and heavy atom location
      4. Phase determination
         a. Method of Blow and Crick
         b. Figures of merit, merges, and residuals
   B. Other Techniques
      1. Anomalous dispersion
      2. Molecular replacement
   C. Fourier Map Interpretation
      1. Resolution: the bottom line
         a. Series termination in the Fourier synthesis: the effects of limited data
         b. Model fitting: the value of the sequence
         c. Distinguishing data from the model: interpretations of 4-, 3-, and 2-Å maps
      2. The difference Fourier map in macromolecular crystallography: substrate and analogue binding
   D. Macromolecular Refinement
      1. Resolution and refinement: the data/variable problem
      2. Constraints and restraints: the method of Konnert and Hendrickson
      3. Distinguishing data from the model: what's refined, what's not
      4. The three "R's"
         a. What is the resolution?
         b. Was the structure refined?
         c. What are the residuals?
IX. Structure Results
   A. Finding Structures
      1. The Journals
         a. From Acta to Science
         b. Using Chem Abstracts and Index Medicus
      2. The Databases
         a. Cambridge Structural Database
         b. Protein Databank
   B. Presenting Structures
      1. Tabulated results
         a. Coordinates
         b. Bond lengths, angles, torsion angles
         c. Thermal parameters
         d. Other geometric parameters
      2. Graphics: A Picture's Worth a Thousand Numbers
         a. Software: ORTEP to the Richardson representation
         b. Hardware: vector and raster graphics; state-of-the-art tools
   C. Interpreting Structures: Ancillary Techniques
      1. Computational Techniques
         a. Alternative displays: distance matrix plots, solvent accessible surfaces, etc.
         b. Energy calculations: ab initio, semi-empirical, and molecular mechanics minimizers
         c. Modeling dynamic processes: docking, Monte Carlo, and molecular dynamics methods
      2. Experimental Techniques: No Structure is an Island
         a. Integration of structure results with data from other sources: spectroscopic, biochemical, pharmacological
         b. Complementary studies: examples
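
Two quantities named in the outline, Bragg's law and the residuals R and Rw, lend themselves to a short numerical illustration. The sketch below is not part of the course material. It assumes Cu Kα radiation (mean wavelength 1.5418 Å) to locate the 3.4 Å base-stacking reflection discussed in Example 1, and it uses made-up structure factor amplitudes to show how the conventional residual is formed. The weighted residual follows one common definition based on amplitudes; refinement programs differ in the details, so the function names and the weighting convention here should be read as assumptions.

```python
import math

CU_K_ALPHA = 1.5418  # Angstrom; mean Cu K-alpha wavelength (assumed radiation)

def bragg_angle_deg(d_spacing, wavelength=CU_K_ALPHA):
    """Solve Bragg's law, lambda = 2 d sin(theta), for theta in degrees."""
    s = wavelength / (2.0 * d_spacing)
    if not 0.0 < s <= 1.0:
        raise ValueError("this spacing cannot be observed at this wavelength")
    return math.degrees(math.asin(s))

def r_factor(f_obs, f_calc):
    """Conventional residual R = sum| |Fo| - |Fc| | / sum |Fo|."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / sum(abs(o) for o in f_obs)

def weighted_r_factor(f_obs, f_calc, weights):
    """One common weighted residual: Rw = sqrt(sum w(|Fo|-|Fc|)^2 / sum w|Fo|^2)."""
    numerator = sum(w * (abs(o) - abs(c)) ** 2
                    for o, c, w in zip(f_obs, f_calc, weights))
    denominator = sum(w * abs(o) ** 2 for o, w in zip(f_obs, weights))
    return math.sqrt(numerator / denominator)

# Bragg angle of the 3.4 Angstrom meridional reflection with Cu K-alpha.
theta = bragg_angle_deg(3.4)
print(f"theta for d = 3.4 A: {theta:.2f} degrees")

# Made-up observed and calculated amplitudes, for illustration only.
F_OBS = [10.0, 20.0, 30.0]
F_CALC = [11.0, 18.0, 30.0]
print(f"R  = {r_factor(F_OBS, F_CALC):.3f}")
print(f"Rw = {weighted_r_factor(F_OBS, F_CALC, [1.0, 1.0, 1.0]):.3f}")
```

The same `bragg_angle_deg` function relates resolution to scattering angle more generally: smaller spacings require larger angles, which is one way to read the "Resolution, θ and d*" entry of the outline.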