Effect of Intermolecular Hydrogen-Bonded Motifs on Packing Pattern Populations
Elna Pidcock* and W. D. Sam Motherwell
Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, England CB6 1AQ
CRYSTAL GROWTH & DESIGN 2005 VOL. 5, NO. 6 2322-2330
Received March 18, 2005
ABSTRACT: The Box Model of crystal packing has shown that unit cells with low “external” surface area for a given volume are more prevalent in the Cambridge Structural Database than structures with unit cells of a large surface area. However, high-surface-area cells exist, and a possible explanation is the presence of strong, structure-directing intermolecular interactions. A study of structures belonging to P21/c has been undertaken in an attempt to understand the role of intermolecular interactions within the context of the Box Model. From an examination of the distribution of randomly selected structures over the packing patterns, it is shown that the energetic interaction which relates the largest faces of the molecules is important. When a strong, symmetry-demanding intermolecular motif is present in the structure, for example a carboxylic acid dimer motif or a CONH chain motif, the distribution of structures over the packing patterns changes. These changes can be rationalized in terms of the symmetry of interactions between the groups required to build the motif and the orientation of the group with respect to the large face of the molecule. However, the presence of hydrogen-bonded motifs does not appear to greatly perturb the principal tenet of the Box Model: minimum surface area is preferred.
Introduction
Kitaigorodskii1
showed, in his seminal work of the 1950s, that a simple model of molecular crystal packing, where the distribution of structures over the space groups could be rationalized in terms of how well the symmetry operators of the space groups fit the molecules together, is a powerful tool for the understanding of crystal structures. To predict molecular crystal structures a priori and to gain control over the solid-state structure by manipulation at the molecular level are two challenges that face the scientific community today. Our contribution is to utilize the hundreds of thousands of experimental structures that exist today to look once again for simple rules that may be used to further our understanding of crystal structures (the Cambridge Structural Database2 (CSD) currently contains >330 000 structures).
Unlike the work of Kitaigorodskii, which described crystal structures as layers of close-packed sheets, our work has concentrated on the unit cell as the building block of the crystal structure. Examination of the possible ways of close-packing a limited number of three-dimensional objects has led to the proposal of a new model of crystal packing, the Box Model.3 In summary, a box has three unequal dimensions, L, M, and S, where L > M > S. For a given number of boxes there is a limited number of ways of stacking the boxes with faces touching and edges aligned. For example, for two boxes there are three possible arrangements: the large, medium, or small faces of the boxes may be placed in contact with each other. The overall size of the arrays of boxes can be described in terms of the dimensions of the boxes. Thus, an array of two boxes is two boxes high, one box wide, and one box deep. If the large faces of the two boxes are in contact with each other, then the dimensions of the array are described by 1 × L, 1 × M, and 2 × S (see Figure 1, left). There are six possible arrangements, or packing patterns, for four boxes, and these are shown in Figure 2.
* To whom correspondence should be addressed. E-mail: pidcock@ccdc.cam.ac.uk.
When molecules were described in terms of three dimensions, L, M, and S, it was shown that the packing patterns of stacked boxes were a good approximation to the spatial arrangement of molecules in a unit cell.3 Analogous to the Box Model, unit cell dimensions can be described in terms of multiples of molecular dimensions (Figure 1, middle). A detailed analysis of the distributions of the ratios of cell dimension to molecular dimension for thousands of structures has led to the parametrization of close packing (Figure 1, bottom).4 The parameters that describe the spacing of the molecules (in terms of molecular dimensions) are remarkably similar to one another, even for structures with unit cells of different Z and which belong to different space groups.
A property of the packing patterns is that, for a given number of boxes, the arrangements of boxes have the same volume but different external surface areas. It was observed that crystal structures described by the low-surface-area packing patterns were present to a greater extent than structures characterized by a high “exterior” surface area.3 These findings have led us to reiterate one of Kitaigorodskii’s assertions, namely that molecular shape is of primary importance in crystal packing. However, all packing patterns of the Box Model are populated by structures found in the CSD, even those cells which are very extreme in shape and which have a very high surface area for a given volume. What is the explanation for the occurrence of structures described by high-surface-area packing patterns, which go against the general trend? Recognizable and energetically significant intermolecular motifs, notably hydrogen bonding, exist in crystal structures and have been exploited in rational design, known as crystal engineering. Molecular shape, therefore, is clearly not the only “structure-directing” factor.
In this paper we examine the effect of strong, symmetry-demanding intermolecular motifs on the distribution of structures over the packing patterns of the Box Model.
10.1021/cg050099r CCC: $30.25 © 2005 American Chemical Society Published on Web 07/28/2005
Figure 1. Introduction to the Box Model of crystal packing. An object, molecule, or box is described by three dimensions, L, M, and S, where L > M > S. For a given number of objects there is a limited number of ways of packing the objects with faces touching and edges aligned. Illustrated is an example where two boxes are stacked with the largest faces in contact. Thus, the overall dimensions of the arrangement are given by 1L × 1M × 2S (112S). In analogy to the Box Model, unit cell dimensions can be described in terms of molecular dimensions. Systematic relationships, explained by the Box Model, have been found between cell lengths and molecular dimensions for thousands of crystal structures.3 A histogram of ratios of cell length to molecular dimension calculated for Z = 2 structures belonging to P21 is shown.4
Figure 2. Illustration of the six possible packing patterns for four boxes together with the packing pattern names. These packing patterns serve as models for the contents of unit cells that contain four molecules. The II packing patterns are those where the large faces of the boxes are in contact with each other within the arrangement, and the TT packing patterns are those where the largest faces of the boxes are not in contact.
Methodology
A data set of 12 426 structures with Z = 4 and space group setting P21/c was extracted from the Cambridge Structural Database using ConQuest.5 Structures containing molecules of more than one chemical type, or with more than one molecule in the asymmetric unit, were excluded. This data set was used previously as the basis of the investigation of the Box Model of crystal packing, and hence, all structures were assigned to a packing pattern.3 To reduce the time required for calculations, a data set of 2376 structures, hereafter the Control data set, was generated by randomly selecting entries from the original data set. The structures of the Control data set were processed using the RPluto program,6 as follows. The positions of any missing hydrogen atoms were calculated using the ideal geometry about carbon and nitrogen atoms. The crystal-packing potential energy was calculated by summing molecular interactions about a central reference molecule, using the atom-pair empirical potentials of Gavezzotti.7 The elements included in the calculation of intermolecular potentials were C, H, N, O, S, Cl, and F; Br was approximated by S, and all other elements made no contribution. The Gavezzotti potentials that describe common hydrogen-bonding interactions were used, and approximations were made for potentials of hydrogen bonds not included by Gavezzotti. Structures containing molecules of more than 200 atoms were omitted. For each structure, the calculation gives a list of intermolecular interactions sorted by energy, strongest first, together with the symmetry operators which mediate the interactions. Thus, the symmetries of the strongest energetic interactions are easily obtainable from the output of these calculations. It is appreciated that these calculations are crude. However, the calculations are used only to identify the strongest two interactions in the structure. It is unlikely that the use of more sophisticated calculations would change the identification of the strongest two interactions in many cases.8
Two further data sets were generated from the original P21/c data set using ConQuest. The COOH-dimer data set was composed of 241 structures containing a carboxylic acid dimer motif. Structures forming a carboxylic acid dimer were identified in ConQuest by defining two O-O distances of 2-3.5 Å, and the COOH groups were required to be planar (within 0-10°). Other hydrogen-bonding acceptors were allowed, and no restriction was placed on the number of COOH groups present in the molecule. The CONH-chain data set (92 structures) was composed of structures containing a trans C(=O)NH group with an intermolecular O···N distance defined as 2.0-3.5 Å. Both of these “motif” data sets were then processed by RPluto to calculate the interaction energies and the symmetry of the interactions, as before.
A brief summary of the Box Model nomenclature follows.
The object, box or molecule, is characterized by the three unequal dimensions L, M, and S, where L > M > S.3 For a given number of boxes there is a limited number of packing patterns, or ways that the boxes can be stacked: faces touching and edges aligned. These arrangements are described in the packing pattern name. Thus, packing pattern 221L, for example, indicates an array of four boxes (2 × 2 × 1 = 4), the dimensions of which correspond to 2M × 2S × 1L; the unique direction of the packing pattern, in this case 1L, is given last in the packing pattern name. Similarly, 114S describes a pattern of four boxes (1 × 1 × 4 = 4) of dimensions 1L × 1M × 4S. In this paper we only deal with unit cells containing four molecules and, hence, there are six possible packing patterns: 221L, 221M, 221S, 114L, 114M, and 114S. An illustration of these packing patterns is given in Figure 2. For a molecule with three dimensions L, M, and S, where L > M > S, the largest faces of the molecule are the faces described by L × M and are situated perpendicular to the S (shortest) dimension of the molecule.

Table 1. Structures Belonging to the Control Data Set in P21/c with Z = 4a

                  population of symmetry subset (%)
packing pattern   II          IG          IS          IT          GG          SS          TT          other       total (% of entire data set)
221L              25.8        14.8        11.2        3.4         23.1        17.3        3.7         0.7         980 (41.2)
221M              29.0        11.3        12.0        5.0         22.7        14.3        5.1         0.5         565 (23.8)
221S              18.2        9.2         6.0         6.6         13.8        14.4        31.7        0.2         501 (21.1)
114L              21.4        7.1         0.0         14.2        10.7        7.1         39.3        0.0         28 (1.2)
114M              17.3        6.2         8.6         12.3        11.1        12.3        32.1        0.0         81 (3.4)
114S              27.6        22.2        15.8        10.0        9.5         3.6         11.3        0.0         221 (9.3)
total (% of entire data set)
                  589 (24.8)  311 (13.1)  250 (10.5)  130 (5.5)   456 (19.2)  345 (14.5)  286 (12.0)  11 (0.4)    2376

a The population of each packing pattern has been divided on the basis of the symmetry operators which mediate the top two energetic interactions. The percentages of structures within each packing pattern belonging to the symmetry operator subsets are given. A more detailed table is given in the Supporting Information.
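The packing pattern nomenclature summarized in the Methodology can be illustrated with a short Python sketch. It is not part of the original work: the function names are hypothetical, and the rule that the first two digits are assigned to the two non-unique axes in L, M, S order is an assumption (for the six Z = 4 patterns the two digits are always equal, so the choice does not matter). The sketch expands a pattern name into array dimensions, computes the external surface area of the array, and tests whether the largest (L × M) faces are in contact, which is taken here to mean more than one box stacked along S.

```python
from typing import Dict, Tuple

AXES = ("L", "M", "S")  # molecular dimensions, with L > M > S


def parse_pattern(name: str) -> Dict[str, int]:
    """Map a packing pattern name to the number of boxes along each axis.

    The last character names the unique axis and takes the third digit.
    Assumption: the first two digits go to the remaining axes in L, M, S order.
    Example: '221L' -> {'M': 2, 'S': 2, 'L': 1}, i.e. 2M x 2S x 1L.
    """
    if len(name) != 4 or name[3] not in AXES or not name[:3].isdigit():
        raise ValueError(f"not a packing pattern name: {name!r}")
    first, second, third = (int(ch) for ch in name[:3])
    unique = name[3]
    others = [axis for axis in AXES if axis != unique]
    return {others[0]: first, others[1]: second, unique: third}


def array_dimensions(name: str, L: float, M: float, S: float) -> Tuple[float, float, float]:
    """Overall size of the array along the L, M, and S directions."""
    counts = parse_pattern(name)
    return (counts["L"] * L, counts["M"] * M, counts["S"] * S)


def surface_area(name: str, L: float, M: float, S: float) -> float:
    """External surface area of the rectangular array of boxes."""
    a, b, c = array_dimensions(name, L, M, S)
    return 2 * (a * b + b * c + c * a)


def large_faces_in_contact(name: str) -> bool:
    """True when boxes are stacked along S, so that L x M faces touch."""
    return parse_pattern(name)["S"] > 1


if __name__ == "__main__":
    # Hypothetical box of dimensions 3 x 2 x 1 (arbitrary units).
    for pattern in ("221L", "221M", "221S", "114L", "114M", "114S"):
        print(pattern,
              array_dimensions(pattern, 3, 2, 1),
              surface_area(pattern, 3, 2, 1),
              large_faces_in_contact(pattern))
```

For the hypothetical 3 × 2 × 1 box, every array has the same volume (24), while the surface area ranges from 52 (221L and 114S) to 76 (114L), which illustrates why the patterns differ in external surface area at constant volume.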
Results and Discussion
Intermolecular Interactions. We begin with the Control data set. The distribution of structures across the packing patterns in the Control data set is in good agreement with the distribution observed for the parent data set of 12 426 structures,3 and thus the Control data set is a good, representative sample. For each of the structures in the Control data set, the intermolecular interactions were evaluated using RPluto, as above. The energy calculations are crude but quick, and they provide a satisfactory description of the strongest interactions between molecules. The strongest two intermolecular interactions and the symmetry operator that mediated each of these interactions were recorded. The symmetries of the operators mediating the interactions are used to label the subsets: I represents the inversion operator, G the glide plane, S the screw axis, and T a translation of one unit cell. Thus, each structure is assigned to a packing pattern of the Box Model and to a symmetry subset describing the strongest interactions. The breakdown of structures over the packing patterns and the symmetry subsets is given in Table 1. The packing patterns encapsulate information regarding the spatial arrangement of molecules within the cell. It can be seen from the models illustrated in Figure 2 that, for a structure assigned to packing pattern 221L, the L × M and L × S faces of the molecule are in contact with each other within the unit cell (see Methodology for a description of the packing pattern nomenclature). For packing pattern 221S, however, it is the L × S and M × S faces which are in contact with each other (Figure 2). Thus, the different packing patterns express the different possible relationships between the faces of the molecules within the unit cell. By examining the energetically strong interactions of structures assigned to different packing patterns, it is possible to probe the role of molecular shape in crystal packing.
It is appreciated that a crystal structure cannot be described by the two strongest interactions alone, but it is shown below that there are rational relationships between these and the packing patterns.
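The labeling step described above, taking the two strongest interactions and recording the symmetry operator that mediates each, can be sketched as follows. This is an illustrative sketch only, not the RPluto output format or the authors' code: the function name, the input layout (a list of energy and operator-letter pairs), and the sign convention (more negative energy means a stronger interaction) are all assumptions. The sketch also keeps the two letters in energy order; how mixed pairs are grouped into the columns of Table 1 is not specified in the text and is not reproduced here.

```python
from typing import List, Tuple

# Operator letters used in the text: inversion, glide plane, screw axis, translation.
VALID_OPERATORS = {"I", "G", "S", "T"}


def symmetry_subset(interactions: List[Tuple[float, str]]) -> str:
    """Return a two-letter label for the two strongest interactions.

    Each entry is (energy, operator letter). Assumption: energies are
    stabilizing and negative, so the most negative value is the strongest.
    """
    if len(interactions) < 2:
        raise ValueError("at least two interactions are required")
    # Sort by energy, most negative (strongest) first.
    ranked = sorted(interactions, key=lambda pair: pair[0])
    top_two = [operator for _, operator in ranked[:2]]
    for operator in top_two:
        if operator not in VALID_OPERATORS:
            raise ValueError(f"unknown symmetry operator: {operator!r}")
    return "".join(top_two)


if __name__ == "__main__":
    # Hypothetical energies (arbitrary units) for one structure.
    example = [(-12.5, "G"), (-30.1, "I"), (-18.0, "I"), (-5.0, "T")]
    print(symmetry_subset(example))
```

In this hypothetical example the two most negative energies both belong to inversion-related pairs, so the structure would be labeled II.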
Examination of Table 1 shows the relative importance of the symmetry operators with regard to the strongest energetic interactions. Inversion, I, mediates the strongest interaction between molecules in 54% of the structures belonging to the Control data set. The glide plane, G, is the second most common mediator of the strongest energetic interaction (19%), followed by the screw axis, S (14%), and translation, T (12%). For each packing pattern, the largest proportion of structures is found in either the II symmetry subset or the TT symmetry subset. Packing patterns 221L, 221M, and 114S favor II, and packing patterns 221S, 114L, and 114M favor TT. The division of the packing patterns into two broad categories can be rationalized in terms of the interactions between the large faces (the L × M faces) of the molecules. For example, the packing patterns 221L, 221M, and 114S, all members of the II category, correspond to arrays of boxes where the largest faces of the boxes are placed in contact with each other (Figure 2, top). Therefore, in a unit cell described by one of these packing patterns, the largest faces of the molecules are related to each other by symmetry operators and not by translations of the unit cell (for an example, see Figure 3, top). In the case of structures belonging to packing pattern 221L, translations of the unit cell relate the smallest faces of the molecules. Only a small proportion of structures assigned to the 221L packing pattern (