An introduction to X-ray structure determination

Jon A. Kopecki, Eastman Kodak Company, Rochester, New York

An Introduction to X-Ray Structure Determination

With the advent of modern high-speed digital computers, X-ray crystallography has become an easily accessible structural tool for organic and inorganic chemists, even for those with limited mathematical background or interests. Certainly structural crystallography, with the possible exception of mass spectroscopy, gives more information per sample size than any other physical method. In addition to primary structure, diffraction analysis yields information on non-bonded interactions, conformational effects, thermal vibrations, and, in favorable cases, the absolute configuration of optically active molecules. The use of computer-controlled diffractometers has not only increased the accuracy of the structural parameters, but has also reduced the time needed for data collection so drastically that complete structures have been determined in less than 48 hr.

Regrettably, most courses in physical methods for organic or inorganic chemists mention crystallography only in passing, probably because many chemists are unfamiliar with modern methods and because of the mathematical formalism which surrounds the subject. There is, however, no reason why a one- or two-lecture introduction, directed at showing when and how the method can be applied, should not be a part of all physical methods courses. There is no need to turn the organic chemist into a crystallographer, only to remove his apprehensions at talking to one.

What follows is an outline of what this author feels can and need be taught in such an introductory, basically non-mathematical, first exposure. Some traditional topics, such as Miller indices, space group symmetry, and reciprocal space, are not explicitly considered here, although they could easily be incorporated as student background and time permit. The recent surge of interest in crystallography by non-crystallographers has brought with it a large number of excellent introductory texts (1-7) which can lead the interested chemist to a further understanding.
Computers and Crystallography

Despite early developments in the theory of X-ray diffraction (8, 9), the technique remained a somewhat specialized tool of physicists, metallurgists, and physical chemists until the mid-1950's, when modern digital computers became commonplace. The mathematical techniques necessary to solve diffraction problems had themselves been developed before X-rays were known, but the sheer number of such calculations made structural work unbelievably formidable. As an illustration, assume that we wished to examine the three-dimensional structure of a crystal with a unit cell (vide infra) of size 6 × 10 × 10 Å and that we needed to "sample" the cell every 0.25 Å. If we measured 1000 diffraction intensities (a modest number for a small organic structure), the number of calculations required to produce a single electron density map would amount to

$$[(10)(1/0.25) \times (10)(1/0.25) \times (6)(1/0.25)](1000)$$

i.e., roughly 38 million calculations. It is important to note that not just one, but several such sets of calculations are usually necessary to solve a structure. Consequently, most of the earlier work was done as two-dimensional projections, reducing the number of calculations in the example above to a more manageable but nonetheless tedious one million.

Method

Structural studies are nearly always done on a single crystal. A crystal, according to one definition, is a body of homogeneous matter composed of identical small units repeating indefinitely in all directions. These smallest repeating units are called unit cells and can contain one or more molecular aggregations related to each other by simple symmetry elements (mirror plane, screw axis, etc.). Once the position of any one of these molecular aggregations, or asymmetric units, is located, the others can be easily determined by application of the symmetry elements.

The material to be studied should be crystalline at a "reasonable" temperature (low-temperature equipment cooled by liquid nitrogen, for example, is available) and pure at the molecular, ionic, or atomic level.¹ Although there is considerable latitude in crystal size, too small a crystal will not diffract strongly enough and too large a crystal may suffer from absorption or extinction problems. A range of 0.1-0.3 mm on a side (or in diameter) is about right, and larger crystals can often be cut or shaped to this size.

The crystal is mounted on a glass fiber and aligned in a beam of nearly monochromatic X-radiation. Most of the X-rays pass through the crystal unaffected and are therefore uninteresting for structure determination. Others, however, are diffracted by the electrons of the atoms in the crystal and can be detected as "spots" of radiation by either a sheet of film or a scintillation counter. Each spot may be thought of as arising from the diffraction of X-rays from a set of planes in the crystal labeled by the ordered triplet (hkl). The distance between these spots is related to the distance between the planes, which in turn can be related to the size of the unit cell. The intensity of the diffracted wave and the relative time of its arrival at the detecting device (its "phase") are determined by the distribution of atoms within the crystal and the resulting constructive-destructive interferences within each set of diffracted waves.

¹ Actually this restriction of purity is, in practice, too severe. The structures of co-crystallites, crystals containing solvent molecules, and other such "impure" species are routinely solved.

Volume 49, Number 4, April 1972 / 231

Figure 1. A photograph of the X-ray diffraction from the hk0 planes of an organic compound. An X-ray impermeable screen prevents diffraction from other hkl nets from reaching the film. The intensity of the diffraction spots can be measured with a calibrated film strip.

Figure 2. A commercial automated computer-controlled diffractometer system. The crystal is mounted in the X-ray beam in the diffractometer (on table, upper left). The motion of the crystal and counter and the accumulation of data are controlled by the computer (below chart recorder, upper right). Data output is via the teletype unit (right) on both paper and paper tape. (Photo courtesy Picker Corp.)

It is not possible to record the entire three-dimensional hkl array on a single sheet of film in a form suitable for analysis, as many of the spots would overlap. Instead, a screen impermeable to X-rays is placed between the crystal and film to isolate a single two-dimensional net (as hk1). In this way a set of photographs (Fig. 1) comprising the entire (or nearly entire) hkl set can be obtained (hk0, hk1, hk2, etc.). The relative intensities of each spot (there can be as many as several thousand) can be measured by comparison with a calibrated film strip (a tedious job) or by using a photodensitometer. The entire process of taking the photographs, measuring the intensities, and transforming the data to a form suitable for computer processing can take from several days to, more likely, several weeks.

A diffractometer, on the other hand, measures the diffraction intensities by moving a scintillation counter to a point on that line in space where a particular hkl diffracted beam is expected to lie and recording the radiation as a function of time. The counter (and/or the crystal) is then moved to a new position and the intensity associated with another set of hkl planes is recorded. Diffractometers (Fig. 2) range in sophistication from the simplest manual instruments, in which the diffracting positions must first be calculated, the counter moved manually, and the intensity recorded by hand, to fully automated equipment (10).
The latter usually has its own digital computer which calculates the positions for diffraction, sends the commands necessary to move the crystal or counter appropriately, and records the data in a form suitable for computer processing (or even does the processing itself). Data collection on such an instrument may be as short as 24 hr, though usually it amounts to several days, and operator intervention is rarely required. Data from a diffractometer are also usually more accurate than those obtained by film methods. Such convenience, speed, and accuracy have their price, of course. While the equipment necessary to do film work, an X-ray generator and two cameras, may cost about $10,000-12,000, a fully automated diffractometer falls more in the range of $70,000-100,000, plus several thousand dollars in yearly maintenance costs for some fairly sophisticated equipment (10).

232 / Journal of Chemical Education

Solving the Structure

With the data now in hand, a function relating these observables to the relative value of the electron density (ρ) at any point x, y, z within the unit cell would be most convenient. Happily, such a function almost exists

$$\rho(x,y,z) = \frac{1}{V} \sum_{h=-\infty}^{+\infty} \sum_{k=-\infty}^{+\infty} \sum_{l=-\infty}^{+\infty} F_{hkl} \exp[-2\pi i(hx + ky + lz)]$$

where V is the volume of the unit cell, i is $\sqrt{-1}$, and $F_{hkl}$ is a complex number called the structure factor, which can be related to the observed diffraction intensity, $I_{hkl}$:

$$I_{hkl} \propto |F_{hkl}|^2$$

This expression for the electron density is nothing more than a simple summation (11), and thus it would appear that to determine the electron density at any point in the unit cell, and hence the structure, we need only plug in x, y, z and sum over all values of $F_{hkl}$. X-ray crystallography would then be reduced to a trivial exercise in arithmetic.

All is not that simple, however, though theoretical crystallographers continue to strive for the day when it will be. Both film and scintillation counters are "square law detectors"; that is, though they can record the amplitude of the diffracted wave, they cannot measure the relative time of its arrival at the detector. This phase information, the complex part of $F_{hkl}$, is what is lost in the conversion to its squared form, the observed intensity, $I_{hkl}$. In general, any diffracted wave can be from 0° to ±180° out of phase with any other. But in those cases where the unit cell possesses a center of symmetry as one of its symmetry elements (a "centrosymmetric" unit cell),² the phase angle reduces to 0° or 180°, or more simply + or −.³ The square root of $I_{hkl}$, of course, admits either possibility. It is the regeneration of this lost phase information (the so-called "phase problem") towards which most of the crystallographers' effort is expended. Though there are a number of techniques, both historical and current, applicable to solving the phase problem, we will look, and only briefly, at the two most commonly used today in actually solving structures.⁴
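The loss of sign information can be made concrete with a small numerical sketch. The example below is not from the article: it assumes a one-dimensional, centrosymmetric "unit cell" of unit length, two hypothetical point atoms (with constant scattering powers of 35 and 6, chosen to resemble bromine and carbon), and a cutoff `H_MAX` on the summation. It evaluates the one-dimensional analogue of the electron density sum twice, once with correctly signed structure factors and once with the magnitudes alone, as a square law detector would supply them.

```python
import math

# Hypothetical toy structure (an assumption, not data from the article):
# (scattering power, fractional coordinate x); each atom has a mate at -x,
# so the cell is centrosymmetric with the origin on the center of symmetry.
ATOMS = [(35.0, 0.20), (6.0, 0.40)]
H_MAX = 15  # assumed cutoff for the summation index h


def structure_factor(h, atoms=ATOMS):
    """Real-valued F_h of the toy cell: the sine terms of +x and -x cancel."""
    return sum(2.0 * f * math.cos(2.0 * math.pi * h * x) for f, x in atoms)


def density(x, factors, volume=1.0):
    """rho(x) = (1/V) * sum over h = -H_MAX..H_MAX of F_h exp(-2 pi i h x).

    factors[h] holds F_h for h >= 0. Because F_h = F_(-h) is real here,
    the complex exponentials pair up into cosines.
    """
    total = factors[0]
    for h in range(1, len(factors)):
        total += 2.0 * factors[h] * math.cos(2.0 * math.pi * h * x)
    return total / volume


def peak_position(factors, n_grid=200):
    """Grid point in [0, 0.5] at which the computed density is highest."""
    grid = [0.5 * i / n_grid for i in range(n_grid + 1)]
    return max(grid, key=lambda x: density(x, factors))


signed = [structure_factor(h) for h in range(H_MAX + 1)]
unsigned = [abs(F) for F in signed]  # what the intensities alone provide

if __name__ == "__main__":
    print("highest peak with correct signs:", peak_position(signed))
    print("highest peak with signs discarded:", peak_position(unsigned))
```

With the correct signs the highest peak falls on the heavy atom; with every sign taken as positive all the cosine terms add up at the origin, and the map no longer shows the structure.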


Figure 3. One slice or section of a contoured Patterson vector map of a bromine-containing organic compound. The bromine-bromine vector readily stands out from the less intense carbon-bromine and carbon-carbon vectors. From the position of this and other bromine-bromine vectors, the location of the bromine atom in the unit cell can be calculated.

Patterson or Heavy Atom Technique

There is, as the reciprocity of nature would have it, a simple function called the structure factor (SF) equation which would allow us to calculate the signed value of $F_{hkl}$ (we will consider, for the moment, only the centrosymmetric case)

$$F_{hkl} = \sum_{n}^{\text{all atoms}} f_n \exp[2\pi i(hx_n + ky_n + lz_n)]$$

where $f_n$ is the tabulated scattering factor of atom n, a measure of the atom's ability to scatter X-rays at a given scattering angle and related to its number of electrons. The structure factor equation, however, involves a sum over the positions of all atoms ($x_n$, $y_n$, $z_n$), information which we are not very likely to possess if we are really trying to solve the structure! Fortunately, the SF equation is not just a sum, but a weighted sum. That is, the heaviest atoms have the largest scattering factors and hence make the greatest contributions to the value of F. In addition, we do not need to calculate the precise value of F from the SF equation (we already have this as an observable) but only the sign of F. Thus, for a bromine-containing organic compound, we would assume that the bromine atom would be the major contributor to the SF sum, and if we knew its location, we could calculate signed F's from a truncated version of the SF equation in which only the bromine contribution is retained.

Certainly the magnitude of $|F_{hkl}|$ will be wrong, but we may assume that the sign of F will more often than not be correct. We can then combine the calculated signs of F with the correct observed magnitude of F in the electron density equation to get an approximate electron density map. From this map we should be able to pick out the locations of additional atoms, and use these to calculate better signs for $F_{hkl}$. This process can be continued in an iterative manner until all atoms have been located. As we increase the number of terms in the SF summation, we should expect the magnitudes of F calculated to approach those of F observed, and indeed such a comparison, called the R-Factor,⁵ can serve as a measure of our progress.

How then do we locate the heavy atom, the key to this entire process? In 1935, the New Zealand-born physicist and crystallographer, A. L. Patterson, published a classic paper (13) demonstrating that the observed intensities, as $|F|^2$, could be used in an equation much like the electron density equation to give a map not of electron density, but of the interatomic vectors between all atoms in the unit cell. Such a map would be a featureless mess of many⁶ overlapping similar vectors were it not for one other factor: the magnitudes of the atom-atom vectors are proportional to the product of the atomic numbers involved. Thus a vector between two bromines in the unit cell would be (35 × 35)/(6 × 6) or ~34 times more intense than a carbon-carbon vector and hence easily discernible (Fig. 3). From the heavy atom vectors, it is usually a simple matter to calculate the location of the heavy atom itself. While we have considered only the centrosymmetric case, the heavy atom method is equally applicable to unit cells without a center of symmetry.

² In actual practice, the probability is somewhat better than 3 to 1 that the unit cell will contain a center of symmetry (12). This probability is reduced for natural products, where optically active compounds are prominent, as it is impossible for a unit cell to contain but a single enantiomer and a center of symmetry. Conversely, for synthetic materials, the probability of a centrosymmetric cell is greater than 3 to 1.

³ The complex number $F_{hkl}$ can be viewed as a vector with a real and an imaginary component ($A_{hkl} + iB_{hkl}$). The phase angle (α) then is simply the angle between this vector and the real axis or, by trigonometry, α = arctan (B/A). For a centrosymmetric unit cell with the origin at the center of symmetry, B is always zero. In either case, the observed intensity is proportional to

$$(A_{hkl} + iB_{hkl})(A_{hkl} - iB_{hkl}) = A_{hkl}^2 + B_{hkl}^2 = |F_{hkl}|^2$$

For a qualitative explanation of the relationship between the centrosymmetric and non-centrosymmetric cases, see reference (11).

⁴ These other methods include trial and error, modified Monte Carlo, packing analysis, isomorphous replacement, anomalous dispersion, Fourier transform, and superposition techniques. A lucid yet brief discussion of most of these methods can be found in reference (1).

⁵ The R-Factor is often defined as the percent difference

$$R = \frac{\sum \big|\, |F_{obs}| - |F_{calc}| \,\big|}{\sum |F_{obs}|} \times 100$$

with the summations over all reflections.

⁶ For a 10-atom structure with 4 molecules per unit cell, there would be (40)² = 1600 such vectors, of which 40 would be piled up at the origin (vectors from the atom to itself), leaving us 1560 with which to contend.

Direct Methods

Crystallographers have always looked for ways to free themselves from conditions imposed by heavy atom methods. Although it is often a simple task to convert an organic compound into a heavy atom derivative or salt, occasionally it may prove difficult or indeed obviate the entire need for the analysis. Furthermore, since in theory at least all the information needed to solve the structure is hidden somewhere in the data, more direct methods of analysis have always remained an intriguing challenge.

One of the more widely used of these so-called direct methods has involved the probability equations of Karle, Hauptman, and Sayre (14, 15). Although recent work has extended these computations to noncentrosymmetric crystals (16), it is still more generally applicable to the centrosymmetric case, and this discussion will be so limited. The probability equations give the probable sign of one reflection in terms of the signs of other, related reflections. For example, if we let positively phased reflections be designated by +1 and negatively phased ones by -1, then the probable phase of an unknown reflection $F_{hkl}$ is related to the product of the phases of the reflections $F_{h'k'l'}$ and $F_{h-h',\,k-k',\,l-l'}$. The table shows how the sign of reflection 612 would be determined by the signs of two related reflections.⁷
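The sign relationship just described lends itself to a short sketch. The code below is only an illustration under stated assumptions: the symbolic signs A, B, C, and D and their pairing into two sets that each predict the sign of reflection 612 are taken as a hypothetical setup, the names `RELATIONS`, `predicted_signs`, and `rank_assignments` are invented here, and trial assignments are ranked by a plain count of contradictions rather than by the actual probability equations.

```python
from itertools import product

# Hypothetical bookkeeping (an assumption for illustration): each unknown
# reflection is listed with the pairs of symbolically signed known reflections
# whose product of signs predicts its sign.
RELATIONS = {
    (6, 1, 2): [("A", "B"), ("C", "D")],
}


def predicted_signs(assignment, pairs):
    """Sign predicted for one reflection by each pair: s(a) * s(b)."""
    return [assignment[a] * assignment[b] for a, b in pairs]


def count_inconsistencies(assignment, relations=RELATIONS):
    """Number of reflections whose pairs predict contradictory signs."""
    bad = 0
    for pairs in relations.values():
        if len(set(predicted_signs(assignment, pairs))) > 1:
            bad += 1
    return bad


def rank_assignments(symbols=("A", "B", "C", "D"), relations=RELATIONS):
    """Try every +1/-1 assignment, fewest contradictions first."""
    trials = []
    for values in product((+1, -1), repeat=len(symbols)):
        assignment = dict(zip(symbols, values))
        trials.append((count_inconsistencies(assignment, relations), assignment))
    trials.sort(key=lambda trial: trial[0])
    return trials


if __name__ == "__main__":
    for count, assignment in rank_assignments():
        print(count, assignment)
```

A real calculation would involve many reflections and many relations per reflection, and would weight each relation by its probability; the simple count above only shows the kind of cross-checking involved.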

least number of internal inconsistencies is considered to be the most probable assignment. In the example above, for instance, the assignment A = +1, B = -1, C = -1, D = -1 would give opposite phases for $F_{612}$ depending on how it was determined. Numerous other combinations, say A = +1, B = -1, C = -1, D = +1, would generate the same sign for 612 for both sets of reflections. Which of these alternative assignments is most probable depends on the number of inconsistencies generated with other reflections. This sort of tedious cross-checking can fortunately be handled with ease by a computer, which can be directed to output a list of starting phase assignments in order of decreasing probability (17).

Structure Refinement

Once the approximate positions of all the atoms have been gleaned from an electron density map (Fig. 4), we can go about the task of obtaining the best fit between the molecular parameters and the observables.

Determination of the Sign of an Unknown Reflection

Known Phases

Fm

Unknown Phase

Fm

Fa.

Table shows how the sign of an unknown reflection 612 (hkl) would be determined by