New Opportunities for Structure Determination of ... - ACS Publications

Jump to Combined Use of Powder Diffraction Data and Energy in Structure ... - We note that other approaches, differing from the concept of the guiding...
0 downloads 9 Views 366KB Size
New Opportunities for Structure Determination of Molecular Materials Directly from Powder Diffraction Data Kenneth D. M. Harris*

CRYSTAL GROWTH & DESIGN 2003 VOL. 3, NO. 6 887-895

School of Chemistry, University of Birmingham, Edgbaston, Birmingham B15 2TT, United Kingdom Received April 21, 2003

ABSTRACT: Although single crystal X-ray diffraction is a powerful technique for the determination of crystal and molecular structures, many solids can be prepared only as microcrystalline powders and therefore cannot be studied by single crystal diffraction techniques. For such materials, it is necessary to tackle structure determination using powder diffraction data. This article highlights recent developments in the opportunities for determining the crystal structures of molecular solids directly from powder diffraction data, focusing on the challenging structure solution stage of the structure determination process. In particular, the direct-space strategy for structure solution is highlighted, as this approach has led to significant recent advances in the structure determination of molecular solids. In the direct-space approach, a hypersurface defined by an appropriate powder diffraction R-factor is explored using global optimization techniques, and we focus on our development and application of Monte Carlo and genetic algorithm techniques within this field. Fundamental aspects are described, and examples are given to illustrate the application of the direct-space strategy to determine crystal structures of molecular materials. 1. Introduction Determination of crystal structures from single crystal X-ray diffraction data is without question the most powerful approach for elucidating structural information in chemistry, and many important scientific advances during the 20th Century relied upon the use of this technique. However, the requirement for single crystals of appropriate size and quality imposes a natural limitation on the scope of this technique, as many materials of interest can be prepared only as microcrystalline powders. How do we progress toward understanding the structural properties of such materials? The most direct route is to use powder diffraction data. Although single crystal and powder diffraction patterns contain the same intrinsic information, in the former case this information is distributed in threedimensional space, whereas in the latter case the threedimensional diffraction data are “compressed” into one dimension, which generally leads to considerable overlap of peaks in the powder diffraction pattern. Such peak overlap obscures information on the intensities of individual diffraction maxima, and constitutes the main reason for difficulties in determining crystal structures directly from powder diffraction data. These difficulties are often particularly severe in the case of molecular solids, which typically have large unit cells and low symmetry, leading to a high density of peaks in the powder diffraction pattern and hence substantial peak overlap. In view of the fact that many important materials can be prepared only as microcrystalline powders, the availability of new and increasingly powerful procedures for determining crystal structures directly from powder diffraction data1-6 has the poten* To whom correspondence should be addressed. Telephone: +44121-414-7474.FAX: +44-121-414-7473.E-mail: [email protected].

tial to make considerable impact in structural sciences. For these reasons, we are focusing on the development, implementation, and optimization of new techniques for structure solution from powder diffraction data, with emphasis on tackling the specific challenges encountered for molecular solids. Our methods are based on the direct-space strategy,7 in which a hypersurface defined by the powder profile R-factor (Rwp) is searched using Monte Carlo (MC)7 or genetic algorithm (GA)8-10 techniques. This article gives an overview of the problems and challenges associated with structure determination from powder diffraction data, with some focus on our own recent contributions in this field. More detailed reviews covering all aspects of structure determination from powder diffraction data may be found in refs 1-6. First, it is important to distinguish the different stages involved in crystal structure determination from diffraction data (either single crystal or powder), which are as follows: (i) unit cell determination and space group assignment, (ii) structure solution, and (iii) structure refinement. The aim of structure solution is to obtain an initial approximation of the structure from analysis of the experimental diffraction data, using the unit cell and space group determined in stage (i), but starting with no knowledge of the actual arrangement of atoms or molecules within the unit cell. If the structure solution is a sufficiently good approximation to the true structure, a good quality structure can then be obtained by structure refinement. For powder diffraction data, structure refinement is now carried out fairly routinely using the Rietveld profile refinement technique,11,12 and unit cell determination is carried out using standard indexing procedures (see, for example, refs 13-17). In this paper, we focus on structure solution, which is generally the most challenging stage of the structure determination process.

10.1021/cg030018w CCC: $25.00 © 2003 American Chemical Society Published on Web 07/04/2003

888

Crystal Growth & Design, Vol. 3, No. 6, 2003

The techniques currently available for structure solution from powder diffraction data can be subdivided into “traditional” and “direct-space” approaches. The traditional approach follows a close analogy to conventional procedures for analysis of single-crystal diffraction data, whereas the direct-space approach follows a close analogy to global optimization procedures, which find applications in many areas of science. Indeed, our initial work on the development of the direct-space strategy7 originated from identifying the opportunity to combine our existing experience in computer simulation of solids (involving global optimization based on consideration of energy)18 together with our experience in the application of traditional techniques for powder structure solution.19 In the traditional approach, the intensities I(hkl) of individual reflections are extracted directly from the powder diffraction pattern, and the structure is then solved using these I(hkl) data in the types of structure solution calculation that are used for single crystal diffraction data (e.g., direct methods or Patterson methods). However, as there is usually extensive peak overlap in the powder diffraction pattern, extracting reliable values of the intensities I(hkl) of the individual diffraction maxima can be problematic, and may lead to difficulties in subsequent attempts to solve the structure using these “single-crystal-like” approaches. Overcoming this problem requires either improved techniques for extracting reliable integrated peak intensities I(hkl) (an area of active interest in several research groups), or the use of new structure solution strategies (see below) that allow the experimental powder diffraction profile to be used directly in its “raw” digitized form, without the requirement to extract the intensities I(hkl) of individual diffraction maxima. In the direct-space approach, trial structures are generated in direct space, independently of the experimental powder diffraction data, with the suitability of each trial structure assessed by directly comparing the powder diffraction pattern calculated for the trial structure and the experimental powder diffraction pattern. This comparison is quantified using an appropriate R-factor. Our implementations of the direct-space strategy have used the weighted powder profile R-factor Rwp (the R-factor normally employed in Rietveld refinement). Importantly, Rwp considers the entire digitized intensity profile point-by-point, rather than the integrated intensities of individual diffraction maxima, and Rwp thus takes peak overlap implicitly into consideration. Furthermore, Rwp uses the digitized powder diffraction data directly as measured, without further manipulation of the type required when using figures-of-merit that involve the extraction of individual peak intensities from the powder diffraction pattern. Nevertheless, we note that some other implementations of direct-space approaches have instead used figures-of-merit based on extracted peak intensities. The basis of the direct-space strategy for structure solution is to find the trial crystal structure corresponding to lowest R-factor, and is equivalent to exploring a hypersurface R(Γ) to find the global minimum, where Γ represents the set of variables that define the structure. In principle, any technique for global optimization may be used to find the lowest point on the R(Γ)

Review

hypersurface, and much success has been achieved in using the MC,7,20-26 simulated annealing,27-35 and GA8-10,36-44 methods in this field. In addition, grid search45-49 and differential evolution50 methods have also been employed. This article is focused on fundamental and applied aspects of our implementations of MC and GA techniques within direct-space structure solution from powder diffraction data, with particular emphasis on the application of these techniques to elucidate structural properties of molecular solids. The first previously unknown organic molecular crystal structure to be solved using the traditional approach (employing direct methods for structure solution) was formylurea,19 although the successful structure solution of the known structure of cimetidine had been demonstrated previously (also using direct methods).51 The first material of unknown crystal structure to be solved by a direct-space approach was p-BrC6H4CH2CO2H,7 using the MC method. Following this early work, the structures of a wide range of other molecular solids have now been determined directly from powder diffraction data, and have been surveyed in a recent review article.6 2. Direct-Space Strategy for Structure Solution In the direct-space strategy for structure solution from powder diffraction data, the structure is defined by a “structural fragment” representing the atoms (or a subset of the atoms) in the asymmetric unit. The structural variables (i.e., the set Γ discussed above) represent the position, orientation, and intramolecular geometry of each molecule in the asymmetric unit. The position is defined by the coordinates {x, y, z} of the center of mass or a selected atom, and the orientation is defined by rotation angles {θ, φ, ψ} around a set of orthogonal axes. In general, the bond lengths and bond angles are fixed (either using standard values for the type of molecule under study or using the known geometry of a similar molecule), and the intramolecular geometry is specified by a set of variable torsion angles {τ1, τ2, ..., τn} that define the molecular conformation. Thus, in general, there are 6+n variables, Γ ) {x, y, z, θ, φ, ψ, τ1, τ2, ..., τn}, for each molecule in the asymmetric unit. As discussed above, the direct-space strategy involves exploring the R(Γ) hypersurface to locate the set of variables Γ that give the best fit (lowest R-factor) to the experimental powder diffraction pattern. Given the stochastic nature of most direct-space techniques, it is good practice to repeat the structure solution calculation several times from different random starting points. Obtaining the same structure solution from these independent calculations provides strong support that the true global minimum on the R(Γ) hypersurface (i.e., the correct structure solution) has been found. The use of Rwp to assess the correctness of a structural model, as in our implementations of the direct-space approach, requires that the parameters defining the peak positions (unit cell parameters and zero-point offset), the peak shape (peak width, peak shape function and peak mixing parameters), and the background intensity in the powder diffraction pattern calculated for each trial structure accurately reflect the experimental powder diffraction pattern. In general, reliable

Review

Crystal Growth & Design, Vol. 3, No. 6, 2003 889

values of these parameters can be determined, prior to structure solution, by fitting the powder diffraction pattern with the use of arbitrary peak intensities (i.e., without using any structural model to determine the peak intensities), for example, using the Pawley fitting procedure.52 Careful prior analysis of the experimental data in this way ensures that the profile parameters used subsequently in structure solution calculations provide a reliable description of the experimental powder diffraction pattern. 3. Monte Carlo/Simulated Annealing Technique In the MC/simulated annealing technique,7 a sequence of structures (denoted Γi for i ) 1, 2, ..., N) is generated for consideration as potential structure solutions. Each structure is derived from the previous structure by a random displacement of the structural fragment within the unit cell (involving small displacements in the values of each of the variables in Γi). The procedure for generating structure Γj+1 from structure Γj is summarized as follows. Starting from structure Γj, a trial structure Γj,trial is generated by making small random displacements to each of the structural variables in Γj. The agreement between the powder diffraction pattern calculated for the trial structure and the experimental powder diffraction pattern is then assessed by calculating an appropriate R-factor, such as Rwp. The trial structure is then accepted or rejected by considering the difference [Z ) R(Γj,trial) - R(Γj)] between the values of R-factor for structures Γj,trial and Γj and invoking the Metropolis importance sampling algorithm.53 Thus, if Z e 0, the trial structure is automatically accepted, whereas if Z > 0, the trial structure is accepted with probability exp(-Z/S) and rejected with probability [1 - exp(-Z/ S)], where S is an appropriate scaling factor. If the trial structure is accepted, structure Γj+1 is taken to be the same as Γj,trial. If the trial structure is rejected, structure Γj+1 is taken to be the same as Γj. The parameter S may either be fixed or varied in a controlled manner during the calculation. The higher the value of S, the higher the probability of accepting trial structures for which Z > 0. This procedure is repeated to generate a large number of structures, with each structure derived from the previous structure through small random displacements in the values of the variables in the set Γ. After a sufficient number of structures has been generated, representing a sufficiently extensive sampling of the R(Γ) hypersurface, the best structure solution (corresponding to lowest R-factor) is identified and is considered as the starting model for structure refinement. It is important to emphasize that the MC/simulated annealing method does not represent minimization of R-factor (except if S ) 0), but explores the R(Γ) hypersurface in a manner that gives emphasis to regions with low R-factor, but with the ability to escape from local minima in R-factor. The essential distinction between the MC and simulated annealing techniques is the way in which the parameter S is used to control the sampling algorithm. In the MC method, S is either fixed or varied manually, whereas in simulated annealing, S is decreased systematically according to a well-defined annealing sched-

Figure 1. Flowchart representing the procedure for evolution of the population from one generation (population Pj) to the next generation (population Pj+1) in the GA technique for powder structure solution.

ule or temperature reduction procedure.54 Different implementations of simulated annealing methods in this field27-35 illustrate a range of different ways of handling the annealing procedure. 4. Genetic Algorithm Technique The GA technique10,55 is based on the principles of evolution and involves familiar evolutionary operations such as mating, mutation, and natural selection. An important feature of the GA technique is that it operates in a parallel manner, with many different regions of the R(Γ) hypersurface investigated simultaneously. Furthermore, information concerning these different regions of the R(Γ) hypersurface is passed actively between different members of the population by the mating procedure. Clearly, the parallel nature of the GA technique confers additional efficiency in searching the R(Γ) hypersurface. Our GA technique8-10,36-42 for structure solution from powder diffraction data has been implemented in the program EAGER,56 and a flowchart describing the operation of this program is shown in Figure 1. The GA structure solution strategy investigates the evolution of a population of trial structures, with each member of the population defined by a set of variables Γ, as discussed above. As each member of the population is uniquely characterized by the values of these variables, the set Γ can be regarded to define its “genetic code”. The initial population Po comprises Np randomly generated structures. The population is then allowed to evolve through subsequent generations by applying the evo-

890

Crystal Growth & Design, Vol. 3, No. 6, 2003

lutionary operations of mating, mutation, and natural selection. Through these operations, a given generation (population Pj) is converted to the next generation (population Pj+1). The number Np of structures in the population is constant for all generations, and Nm mating operations and Nx mutation operations are performed during the evolution from population Pj to population Pj+1. The quality (“fitness”) of each structure depends on its value of R-factor (lower R-factor represents higher fitness), and it is advantageous to define fitness as an appropriate decreasing function of Rfactor.10 In the mating procedure, a given number (Nm) of pairs of structures (“parents”) are selected from the population. The probability of selecting a given structure as a parent is proportional to its fitness. For each pair of parents, two new structures (“offspring”) are generated by distributing parts of the genetic codes of the two parents among the two offspring. As a simple example, for the case of a rigid molecule defined by the structural variables {x, y, z, θ, φ, ψ}, one method for carrying out mating is to exchange the positional {x, y, z} and orientational {θ, φ, ψ} variables between the two parents. Thus, the two selected parents {xa, ya, za, θa, φa, ψa} and {xb, yb, zb, θb, φb, ψb} would give rise to the two offspring {xa, ya, za, θb, φb, ψb} and {xb, yb, zb, θa, φa, ψa}. For systems involving a greater number of variables, more complex rules may be adopted for the mating procedure. It is important to recognize that the mating operation generates new structures by redistributing the existing genetic information in different ways, but does not actually create any new values of the individual genetic variables. New values of the genetic variables are instead introduced into the population by the mutation procedure, in which a given number (Nx) of structures are selected at random from the population and random changes are made to parts of their genetic code to create mutant structures. The changes that are made to selected variables in generating the mutants may either be new random values (static mutation) or small random displacements from the existing values (dynamic mutation). The original structures from which the mutants are derived are still retained within the population. In the natural selection procedure, only the structures of highest fitness (lowest R-factor) are allowed to pass from one generation to the next generation in the GA calculation. After the population has evolved for a sufficiently large number of generations, the structure with lowest R-factor should be close to the correct structure. In a recent implementation of the GA method,36 each new structure generated during the GA calculation is subjected to local minimization of Rwp with respect to the structural variables in the set Γ, and only these minimized structures are used subsequently in the GA calculation. Introduction of local minimization in this way has been found to improve the efficiency of finding the correct structure solution, and the reliability and reproducibility in terms of finding the correct structure solution (for example, in repeated runs from different random initial populations) is also substantially improved. These advantages may be attributed to a

Review

favorable combination of stochastic (i.e., the GA) and deterministic (i.e., the minimization) components within this global optimization strategy. As the minimization procedure modifies the genetic characteristics of each structure sampled on the Rwp hypersurface, depending on the nature of its local environment, the GA method incorporating local minimization represents Lamarckian (rather than Darwinian) evolution. In developing strategies for implementing GA techniques in structure solution, there are many different ways in which each of the evolutionary operations could be carried out and many different ways in which the sequence of evolutionary events (as in the flowchart in Figure 1) could be controlled. There is therefore much scope for developing an optimized implementation of the GA methodology. Indeed, our current implementation of the GA technique includes several features that extend beyond the simple description of the GA method summarized above. 5. Combined Use of Powder Diffraction Data and Energy in Structure Solution As energy (E) and R-factor hypersurfaces for a molecular crystal are based on the same parameter space Γ, but have different characteristics,57 there is a direct opportunity to combine information on both E(Γ) and Rwp(Γ) to generate hybrid hypersurfaces G(Γ). Importantly, if designed appropriately, the characteristics of such hybrid hypersurfaces may be more desirable than either of the individual E(Γ) or Rwp(Γ) hypersurfaces, with regard to searching and locating the global minimum within the context of direct-space structure solution techniques. With this aim, we have proposed and implemented58 a specific definition of G(Γ) that combines desirable characteristics from both the E(Γ) and Rwp(Γ) hypersurfaces. Specifically, our hybrid function G(Γ) is designed to behave as E(Γ) when the value of E(Γ) is high and to give increasing importance (ultimately absolute importance) to Rwp(Γ) as lower values of E(Γ) are approached. In practice, this behavior is achieved by use of a sliding weighting parameter that is a decreasing function of E(Γ). This hybrid figure-of-merit is defined as

G(Γ) ) [1 - w(Γ)]EN(Γ) + w(Γ)RN(Γ) where w(Γ) is the weighting function, and EN(Γ) and RN(Γ) denote normalized energy and R-factor respectively, with 0 e w(Γ) e 1, 0 e EN(Γ) e 1 and 0 e RN(Γ) e 1. The use of normalized functions allows energy and R-factor to be combined in a straightforward and rational manner. As w(Γ) is a decreasing function of energy, we note that w(Γ) f 0 when energy is high and w(Γ) f 1 when energy is low. We use the term “guiding function” to refer to such figures-of-merit in which one property (here E) is used to guide another property (here Rwp) toward its optimal value. The above definition is such that G(Γ) behaves as energy when the value of energy is high and behaves increasingly like Rwp as lower values of energy are approached. Thus, E(Γ) effectively guides the calculation toward regions of structural space Γ corresponding to energetically plausible structures, with Rwp(Γ) then becoming progressively more important as the criterion

Review

Crystal Growth & Design, Vol. 3, No. 6, 2003 891

Figure 2. (a) Molecular structure of L-glutamic acid in the zwitterionic form. (b) Structural fragment used in the GA structure solution calculations for L-glutamic acid, showing the variable torsion angles.

for discriminating the correct structure solution. In general, we find that the progress of GA structure solution calculations using G(Γ) represents a more systematic and controlled evolution of the population than that typically observed in corresponding calculations using Rwp(Γ) alone, and this behavior can be advantageous in avoiding potential problems due to stagnation of the population in GA calculations. We note that other approaches, differing from the concept of the guiding function described above, have also been used for incorporating energy information within the process of structure determination from diffraction data.59,60 6. Demonstration of the Application of the GA Structure Solution Method for Known Structures of a Polymorphic System: The r and β Phases of L-Glutamic Acid L-Glutamic acid exists in two different polymorphic forms, denoted the R and β phases, the crystal structures of which have been determined previously.61,62 In both cases, the L-glutamic acid molecule is in the zwitterionic form (Figure 2a). Both structures have space group P212121 with one molecule in the asymmetric unit, and the unit cell parameters are a ) 10.28 Å, b ) 8.78 Å, c ) 7.07 Å for the R phase and a ) 5.16 Å, b ) 17.30 Å, c ) 6.95 Å for the β phase. The powder X-ray diffraction patterns for the R and β phases of L-glutamic acid were recorded at ambient temperature in transmission mode on a Siemens D5000 diffractometer, using Ge-monochromated CuKR1 radiation and a linear position-sensitive detector covering 8° in 2θ. In our GA structure solution calculations for the R and β polymorphs, the structural fragment comprised all non-hydrogen atoms of the L-glutamic acid molecule. Standard molecular geometry (bond lengths and bond angles) were assumed, with all C-O bond lengths taken to be equal (the C-O single and CdO double bonds are readily assigned subsequently during Rietveld refinement). The four torsion angles {τ1, ..., τ4} defining the molecular conformation are indicated in Figure 2b. Specific details of the GA implementation used in these structure solution calculations is given elsewhere.63 The progress of a GA structure solution calculation can be assessed from the evolutionary progress plot, which shows the best (Rmin) and average (Rave) values of Rwp for the population as a function of the generation number. Figure 3 shows the evolutionary progress plots

Figure 3. Evolutionary progress plots showing the evolution of Rave (filled circles) and Rmin (open circles), as a function of generation number, in the GA structure solution calculations for (a) the R phase and (b) the β phase of L-glutamic acid.

for typical GA structure solution calculations for the R and β phases of L-glutamic acid, demonstrating clearly that, in each case, the GA structure solution calculation converges rapidly on a structure with low Rwp. The best structure solution obtained for the R phase is shown in Figure 4a, and the best structure solution obtained for the β phase is shown in Figure 4b. In each case, the known crystal structure61,62 is also shown for comparison. For both the R and β phases, the structure solution generated by the GA calculation is in excellent agreement with the known structure (the maximum distance between an atom in the structure solution and the corresponding atom in the known crystal structure is less than 0.5 Å), and the structure solution refines readily to the known structure upon Rietveld refinement. We note that the conformation of the L-glutamic acid molecule is significantly different in the R and β phases, and the GA structure solution calculations have successfully found the correct conformation in each case. 7. Examples of Structure Determination of Molecular Solids of Unknown Structure from Powder X-ray Diffraction Data: Structural Rationalization of Oligopeptides While our techniques for structure determination from powder diffraction data have been applied to a wide range of different types of molecular solids (see, for example, the references cited in section 1), we illustrate the application of these methods by focusing on three examples of structure determination of oligopeptides. Much of the motivation for understanding the structural properties of such materials derives from the fact that knowledge of the conformational properties and interactions in oligopeptides can provide important insights concerning structural properties of polypeptide

892

Crystal Growth & Design, Vol. 3, No. 6, 2003

Review

Figure 5. Molecular structure of Phe-Gly-Gly-Phe, showing the variable torsion angles in the GA structure solution calculation.

Figure 4. Comparison between the position of the structural fragment in the best structure solution obtained in the GA structure solution calculation (light shading) and the positions of the corresponding atoms in the known crystal structure (dark shading) for (a) the R phase and (b) the β phase of L-glutamic acid.

sequences in proteins. However, in many cases the target materials do not form single crystals appropriate for single crystal X-ray diffraction studies, such that structure determination from powder diffraction data represents the only viable route toward structural understanding and rationalization. In each of the examples discussed below, we used powder X-ray diffraction data recorded at ambient temperature on the conventional laboratory diffractometer described in section 6. In each case, the unit cell was determined directly from the powder diffraction data by standard indexing procedures, and space group assignment was carried out by consideration of systematic absences. In each case, density considerations indicated that there is one molecule in the asymmetric unit. The structure solution calculations were carried out using the GA method implemented in our program EAGER,56 and Rietveld refinement was carried out using the GSAS program.64 The first example39 concerns the tetrapeptide PheGly-Gly-Phe (Figure 5). The GA structure solution calculation involved 11 variable torsion angles, with the peptide groups constrained to be planar units with the O-C-N-H torsion angle fixed at 180°. The structure (Figure 6; space group P41) comprises ribbons that run along the c-axis, with adjacent molecules in these ribbons interacting through three N-H‚‚‚OdC hydrogen bonds (Figure 6b) in a manner directly analogous to an antiparallel β-sheet. Intermolecular N-H‚‚‚OdC hydrogen bonds involving the end-groups of the oligopeptide chains give rise to two intertwined helical chains

Figure 6. (a) Crystal structure of Phe-Gly-Gly-Phe viewed along the 41 screw axis (c-axis). (b) Interactions between adjacent molecules in the crystal structure of Phe-Gly-GlyPhe (viewed perpendicular to the 41 screw axis) illustrating the formation of an antiparallel β-sheet arrangement (hydrogen atoms are omitted for clarity).

running along the 41 screw axis. Thus, all aspects of the structure are completely plausible on both chemical and structural considerations, and the structure gives very good agreement with the experimental powder X-ray diffraction data (see Figure 7).

Review

Crystal Growth & Design, Vol. 3, No. 6, 2003 893

Figure 7. Comparison of calculated and experimental powder X-ray diffraction patterns in the final Rietveld refinement calculation for Phe-Gly-Gly-Phe (experimental pattern, + marks; calculated pattern, solid line; difference between calculated and experimental patterns, lower line). Tick marks represent reflection positions.

Figure 8. (a) Molecular structure of Piv-LPro-Gly-NHMe, showing the variable torsion angles in the GA structure solution calculation. (b) Molecular structure of Piv-LPro-γAbu-NHMe, showing the variable torsion angles in the GA structure solution calculation.

The next two examples40,42 concern the characterization of β-turns, which are a type of structural element that allows polypeptide chain reversal in proteins. As part of our studies into this issue, we have applied our GA technique for structure solution of the peptides PivLPro-Gly-NHMe (Figure 8a) and Piv-LPro-γ-AbuNHMe (Figure 8b) from powder X-ray diffraction data. In the GA structure solution calculation40 for PivLPro-Gly-NHMe (Figure 8a), the genetic code comprised nine variables {θ, φ, ψ, τ1, τ2, ..., τ6} (in space group P1, the position {x, y, z} of the molecule is fixed arbitrarily). The six variable torsion angles were allowed to take any value, except τ5, which was allowed to take only the values 0° or 180°. The O-C-N-H torsion angle between τ3 and τ4 was fixed at 180°. The population comprised 100 structures, and in each generation 100 offspring (50 pairs of parents) and 20 mutants were produced. Figure 9 shows the final refined crystal structure of Piv-LPro-Gly-NHMe, in which the mol-

Figure 9. Molecular structure of Piv-LPro-Gly-NHMe in the crystal structure, with the intramolecular N-H‚‚‚OdC hydrogen bond shown as the dashed line.

ecule adopts a type II β-turn conformation stabilized by an intramolecular 4 f 1 hydrogen bond between the CdO group of the Piv residue and the methylamide N-H group (N‚‚‚O, 2.99 Å; N‚‚‚O-C, 140.6°). As shown in Figure 9, adjacent molecules in the crystal structure of Piv-LPro-Gly-NHMe are linked along the c-axis by intermolecular N-H‚‚‚OdC hydrogen bonds (N‚‚‚O, 2.87 Å; N‚‚‚O-C, 135.3°). Given the formation of a classical type II β-turn in the structure of Piv-LPro-Gly-NHMe, we were interested to explore the effect of additional CH2 units within the peptide chain, as in Piv-LPro-γ-Abu-NHMe (Figure 8b). The structure of this material has also been determined42 from powder X-ray diffraction data using our GA technique for structure solution, with each structure in the GA calculation defined by 13 variables (seven variable torsion angles). The torsion angle of the peptide bond of the LPro residue was restricted to be either 0° or 180°, and the other two amide linkages were maintained as planar units with the O-C-N-H torsion angle fixed at 180°. All other torsion angles were

894

Crystal Growth & Design, Vol. 3, No. 6, 2003

Figure 10. Molecular structure of Piv-LPro-γ-Abu-NHMe in the crystal structure, with the intramolecular C-H‚‚‚OdC interaction shown as the dashed line.

varied freely. The GA structure solution calculation involved the evolution of a population of 100 structures, with 50 mating operations (to produce 100 offspring) and 20 mutation operations carried out in each generation. In the final refined crystal structure of Piv-LProγ-Abu-NHMe, the molecule adopts a folded conformation (Figure 10) that is reminiscent of the chain reversals found in R-peptide structures. In particular, a short C-H‚‚‚OdC interaction [H‚‚‚O, 2.51 Å; C‚‚‚O, 3.59 Å; C-H‚‚‚O, 172.4°; H atom position normalized according to standard geometry from neutron diffraction] is observed between one of the methylene hydrogen atoms of the γ-Abu residue and the CdO group of the Piv residue. This C-H‚‚‚OdC interaction defines a 10-atom intramolecular cyclic motif, similar to that observed in the classical β-turn. Indeed, the observed motif directly mimics the conformation in the parent material PivLPro-Gly-NHMe discussed above, but with the formation of an intramolecular C-H‚‚‚OdC hydrogen bond rather than an intramolecular N-H‚‚‚OdC hydrogen bond. In the cyclic conformation adopted by Piv-LProγ-Abu-NHMe, all CdO groups point outward from one face of the molecule and all N-H groups point outward from the other face, leading to the formation of columns of molecules along the c-axis, with adjacent molecules linked by two intermolecular N-H‚‚‚OdC hydrogen bonds. 8. Synchrotron versus Laboratory Powder Diffraction Data We now consider the relative merits of using synchrotron X-ray powder diffraction data65-67 versus conventional laboratory powder X-ray diffraction data in the field of structure determination from powder diffraction data, recognizing that the use of synchrotron radiation generally gives powder diffraction data of higher resolution and improved signal/noise ratio. With high resolution, problems due to peak overlap can be alleviated, at least to some extent, allowing more reliability in determining accurate peak positions (which is advantageous in unit cell determination) and more reliability in the extraction of the intensities of individual diffraction maxima from the powder diffraction pattern. In this

Review

regard, synchrotron radiation can be advantageous when traditional techniques (or a direct-space technique using a figure-of-merit based on extracted peak intensities) are to be used for structure solution. Thus, the success of traditional techniques for structure solution is generally enhanced by using data recorded on an instrument with as high resolution as possible. However, for direct-space techniques employing a figure-ofmerit based on a profile R-factor (such as Rwp), the important requirement is not high resolution itself, but rather that the peak profiles are well-defined and accurately described by the peak shape and peak width functions used in the structure solution calculation. In such cases, the use of laboratory data can be just as effective as the use of synchrotron data, and the examples discussed in sections 6 and 7 demonstrate that the use of a good quality, well-optimized laboratory powder X-ray diffractometer is usually perfectly adequate for research in this field. 9. Concluding Remarks The successful application of direct-space techniques for structure determination of molecular solids from powder diffraction data has been clearly demonstrated by several examples in recent years, including those highlighted in this article. Through these techniques, the opportunity to determine molecular crystal structures directly from powder diffraction data is now a very real capability. Nevertheless, there remains considerable scope for the future development and optimization of the strategies for implementing the direct-space strategy, both in terms of the development of new and optimized procedures for searching R(Γ) hypersurfaces, and in terms of the development of new ways of defining the hypersurface such that global optimization may be achieved more efficiently. While direct-space approaches for powder structure solution are particularly appropriate in the case of molecular solids, the opportunities for applying these methods extend far beyond the molecular solid state, and the future application of these techniques promises to reveal new and important insights into structural properties of a wide range of different types of materials, for which structural characterization by single-crystal diffraction techniques is not possible. Given that the current scope of powder diffraction techniques for complete structure determination of molecular solids is comparable to the state-of-the-art in single-crystal diffraction around the 1950s or 1960s,68 it is realistic to predict that future developments in the capabilities of powder diffraction techniques for structure determination will mirror the developments that took place in the capabilities of single-crystal diffraction techniques during the latter half of the 20th Century. Acknowledgment. I am indebted to several colleagues, particularly, Dr. Roy Johnston, Dr. Eugene Cheung, Dr. Emilio Tedesco, Dr. Benson Kariuki, Dr. Maryjane Tremayne, Dr. Giles Turner, Mr. Scott Habershon, and others mentioned in the references, for their involvement in the research described in this paper. Our research in this field has been supported by EPSRC, University of Birmingham, Purdue Pharma, Ciba Specialty Chemicals, Wyeth, Procter and Gamble, and Accelrys.

Review

Crystal Growth & Design, Vol. 3, No. 6, 2003 895

References (1) Cheetham, A. K.; Wilkinson, A. P. Angew. Chem. Int. Ed. Engl. 1992, 31, 1557. (2) Harris, K. D. M.; Tremayne, M. Chem. Mater. 1996, 8, 2554. (3) Langford, J. I.; Loue¨r, D. Rep. Prog. Phys. 1996, 59, 131. (4) Poojary, D. M.; Clearfield, A. Acc. Chem. Res. 1997, 30, 414. (5) Meden, A. Croat. Chem. Acta 1998, 71, 615. (6) Harris, K. D. M.; Tremayne, M.; Kariuki, B. M. Angew. Chem. Int. Ed. 2001, 40, 1626. (7) Harris, K. D. M.; Tremayne, M.; Lightfoot, P.; Bruce, P. G. J. Am. Chem. Soc. 1994, 116, 3543. (8) Kariuki, B. M.; Serrano-Gonza´lez, H.; Johnston, R. L.; Harris, K. D. M. Chem. Phys. Lett. 1997, 280, 189. (9) Harris, K. D. M.; Johnston, R. L.; Kariuki, B. M.; Tremayne, M. J. Chem. Res. (S) 1998, 390. (10) Harris, K. D. M.; Johnston, R. L.; Kariuki, B. M. Acta Crystallogr. 1998, A54, 632. (11) Rietveld, H. M. J. Appl. Crystallogr. 1969, 2, 65. (12) Young, R. A., Ed. The Rietveld Method; International Union of Crystallography and Oxford University Press: Oxford, 1993. (13) Visser, J. W. J. Appl. Crystallogr. 1969, 2, 89. (14) Werner, P.-E.; Eriksson, L.; Westdahl, M. J. Appl. Crystallogr. 1985, 18, 367. (15) Boultif, A.; Loue¨r, D. J. Appl. Crystallogr. 1991, 24, 987. (16) Shirley, R. A. CRYSFIRE Suite of Programs for Indexing Powder Diffraction Patterns, University of Surrey. (17) Kariuki, B. M.; Belmonte, S. A.; McMahon, M. I.; Johnston, R. L.; Harris, K. D. M.; Nelmes, R. J. J. Synchrotron Radiat. 1999, 6, 87. (18) Harris, K. D. M. J. Phys. Chem. Solids 1992, 53, 529. (19) Lightfoot, P.; Tremayne, M.; Harris, K. D. M.; Bruce, P. G. J. Chem. Soc. Chem. Commun. 1992, 1012. (20) Kariuki, B. M.; Zin, D. M. S.; Tremayne, M.; Harris, K. D. M. Chem. Mater. 1996, 8, 565. (21) Tremayne, M.; Kariuki, B. M.; Harris, K. D. M. J. Mater. Chem. 1996, 6, 1601. (22) Tremayne, M.; Kariuki, B. M.; Harris, K. D. M. Angew. Chem. Int. Ed. Engl. 1997, 36, 770. (23) Tremayne, M.; MacLean, E. J.; Tang, C. C.; Glidewell, C. Acta Crystallogr. 1999, B55, 1068. (24) MacLean, E. J.; Tremayne, M.; Kariuki, B. M.; Harris, K. D. M.; Iqbal, A. F. M.; Hao, Z. J. Chem. Soc. Perkin Trans. 2 2000, 1513. (25) Miao, P.; Robinson, A. W.; Palmer, R. E.; Kariuki, B. M.; Harris, K. D. M. J. Phys. Chem. 2000, 104, 1285. (26) Tanahashi, Y.; Nakamura, H.; Yamazaki, S.; Kojima, Y.; Saito, H.; Ida, T.; Toraya, H. Acta Crystallogr. 2001, B57, 184. (27) Newsam, J. M.; Deem, M. W.; Freeman, C. M. Accuracy in Powder Diffraction II: NIST Special Publ. No. 846; NIST: Gaithersburg, MD, 1992; pp 80-91. (28) Ramprasad, D.; Pez, G. B.; Toby, B. H.; Markley, T. J.; Pearlstein, R. M. J. Am. Chem. Soc. 1995, 117, 10694. (29) Andreev, Y. G.; Lightfoot, P.; Bruce, P. G. Chem. Commun. 1996, 2169. (30) Freeman, C. M.; Gorman, A. M.; Newsam, J. M. In Computer Modelling in Inorganic Crystallography; C. R. A. Catlow, Ed., Academic Press: San Diego, 1997. (31) Andreev, Y. G.; Lightfoot, P.; Bruce, P. G. J. Appl. Crystallogr. 1997, 30, 294. (32) David, W. I. F.; Shankland, K.; Shankland, N.; Chem. Commun. 1998, 931. (33) Engel, G. E.; Wilke, S.; Ko¨nig, O.; Harris, K. D. M.; Leusen, F. J. J. J. Appl. Crystallogr. 1999, 32, 1169. (34) Pagola, S.; Stephens, P. W.; Bohle, D. S.; Kosar, A. D.; Madsen, S. K. Nature 2000, 404, 307. (35) Shankland, K.; McBride, L.; David, W. I. F.; Shankland, N.; Steele, G. J. Appl. Crystallogr. 2002, 35, 443. (36) Turner, G. W.; Tedesco, E.; Harris, K. D. M.; Johnston, R. L.; Kariuki, B. M. Chem. Phys. Lett. 2000, 321, 183. (37) Kariuki, B. M.; Calcagno, P.; Harris, K. D. M.; Philp, D.; Johnston, R. L. Angew. Chem. Int. Ed. 1999, 38, 831.

(38) Kariuki, B. M.; Psallidas, K.; Harris, K. D. M.; Johnston, R. L.; Lancaster, R. W.; Staniforth, S. E.; Cooper, S. M. Chem. Commun. 1999, 1677. (39) Tedesco, E.; Turner, G. W.; Harris, K. D. M.; Johnston, R. L.; Kariuki, B. M. Angew. Chem. Int. Ed. 2000, 39, 4488. (40) Tedesco, E.; Harris, K. D. M.; Johnston, R. L.; Turner, G. W.; Raja, K. M. P.; Balaram, P. Chem. Commun. 2001, 1460. (41) Albesa-Jove´, D.; Tedesco, E.; Harris, K. D. M.; Johnston, R. L.; Cheung, E. Y. Cryst. Growth Des. 2001, 1, 425. (42) Cheung, E. Y.; McCabe, E. E.; Harris, K. D. M.; Johnston, R. L.; Tedesco, E.; Raja, K. M. P.; Balaram, P. Angew. Chem. Int. Ed. 2002, 41, 494. (43) Shankland, K.; David, W. I. F.; Csoka, T. Z. Kristall. 1997, 212, 550. (44) Shankland, K.; David, W. I. F.; Csoka, T.; McBride, L. Int. J. Pharm. 1998, 165, 117. (45) Reck, G.; Kretschmer, R.-G.; Kutschabsky, L.; Pritzkow, W. Acta Crystallogr. 1988, A44, 417. (46) Masciocchi, N.; Bianchi, R.; Cairati, P.; Mezza, G.; Pilati, T.; Sironi, A. J. Appl. Crystallogr. 1994, 27, 426. (47) Dinnebier, R. E.; Stephens, P. W.; Carter, J. K.; Lommen, A. N.; Heiney, P. A.; McGhie, A. R.; Brard, L.; Smith III, A. B. J. Appl. Crystallogr. 1995, 28, 327. (48) Hammond, R. B.; Roberts, K. J.; Docherty, R.; Edmondson, M. J. Phys. Chem. 1997, B101, 6532. (49) Chernyshev, V. V.; Schenk, H. Z. Kristallogr. 1998, 213, 1. (50) Seaton, C. C.; Tremayne, M. Chem. Commun. 2002, 880. (51) Cernik, R. J.; Cheetham, A. K.; Prout, C. K.; Watkin, D. J.; Wilkinson, A. P.; Willis, B. T. M. J. Appl. Crystallogr. 1991, 24, 222. (52) Pawley, G. S. J. Appl. Crystallogr. 1981, 14, 357. (53) Metropolis, N.; Rosenbluth, A. W.; Rosenbluth, M. N.; Teller, A. H.; Teller, E. J. Chem. Phys. 1953, 21, 1087. (54) van Laarhoven, P. J. M.; Aarts, E. H. L. Simulated Annealing: Theory and Applications; Riedel Publishing, D.: Holland, 1987. (55) Harris, K. D. M.; Johnston, R. L.; Kariuki, B. M. In Evolutionary Algorithms in Computer-Aided Molecular Design; Clark, D. E., Ed., Wiley-VCH: New York, 2000; pp 159-194, Chapter 9. (56) Harris, K. D. M.; Johnston, R. L.; Albesa Jove´, D.; Chao, M. H.; Cheung, E. Y.; Habershon, S.; Kariuki, B. M.; Lanning, O. J.; Tedesco, E.; Turner, G. W.; EAGER, University of Birmingham, 2001 (an extended version of the program GAPSS, Harris, K. D. M.; Johnston, R. L.; Kariuki, B. M.; University of Birmingham, 1997). (57) Turner, G. W.; Tedesco, E.; Harris, K. D. M.; Johnston, R. L.; Kariuki, B. M. Z. Kristallogr. 2001, 216, 187. (58) Lanning, O. J.; Habershon, S.; Harris, K. D. M.; Johnston, R. L.; Kariuki, B. M.; Tedesco, E.; Turner, G. W. Chem. Phys. Lett. 2000, 317, 297. (59) Putz, H.; Scho¨n, J. C.; Jansen, M. J. Appl. Crystallogr. 1999, 32, 864. (60) Brodski, V.; Peschar, R.; Schenk, H. J. Appl. Crystallogr. 2003, 36, 239. (61) Lehman, M. S.; Koetzle, T. F.; Hamilton, W. C. J. Cryst. Mol. Struct. 1972, 2, 225. (62) Lehman, M. S.; Nunes, A. C. Acta Crystallogr. 1980, B36, 1621. (63) Kariuki, B. M.; Johnston, R. L.; Harris, K. D. M.; Psallidas, K.; Ahn, S.; Serrano-Gonza´lez, H. Comm. Math. Comput. Chem. 1998, 38, 123. (64) Larson, A. C.; Von Dreele, R. B. Los Alamos Lab. Report No. LA-UR-86-748, 1987. (65) Catlow, C. R. A., Greaves, G. N., Eds. Applications of Synchrotron Radiation; Blackie: Glasgow, 1990. (66) Maginn, S. J. Analyst 1998, 123, 19R. (67) Margaritondo, G. Elements of Synchrotron Light; Oxford University Press: Oxford, 2002. (68) Robertson, J. M. Organic Crystals and Molecules; Cornell University Press: Ithaca, NY, 1953.

CG030018W