A Million Crystal Structures: The Whole Is Greater than the Sum of Its

Jun 17, 2019 - The founding in 1965 of what is now called the Cambridge Structural Database (CSD) has reaped dividends in numerous and diverse areas o...
Review Cite This: Chem. Rev. XXXX, XXX, XXX−XXX

pubs.acs.org/CR

A Million Crystal Structures: The Whole Is Greater than the Sum of Its Parts
Robin Taylor and Peter A. Wood*
Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, United Kingdom


Supporting Information

ABSTRACT: The founding in 1965 of what is now called the Cambridge Structural Database (CSD) has reaped dividends in numerous and diverse areas of chemical research. Each of the million or so crystal structures in the database was solved for its own particular reason, but collected together, the structures can be reused to address a multitude of new problems. In this Review, which is focused mainly on the last 10 years, we chronicle the contribution of the CSD to research into molecular geometries, molecular interactions, and molecular assemblies and demonstrate its value in the design of biologically active molecules and the solid forms in which they are delivered. Its potential in other commercially relevant areas is described, including gas storage and delivery, thin films, and (opto)electronics. The CSD also aids the solution of new crystal structures. Because no scientific instrument is without shortcomings, the limitations of CSD research are assessed. We emphasize the importance of maintaining database quality: notwithstanding the arrival of big data and machine learning, it remains perilous to ignore the principle of garbage in, garbage out. Finally, we explain why the CSD must evolve with the world around it to ensure it remains fit for purpose in the years ahead.

CONTENTS 1. Introduction 2. Fundamental Science 2.1. Molecular Geometry and Structure 2.1.1. Atomic Radii 2.1.2. Conformational Analysis 2.1.3. Standard Geometries and Geometry Libraries 2.1.4. Crystal Packing Effects on Molecular Conformations 2.1.5. Metal Coordination 2.1.6. Crystallization Propensity 2.1.7. Tautomerism and Proton Transfer 2.2. Intermolecular Interactions 2.2.1. Hydrogen Bonds 2.2.2. σ-Hole Interactions 2.2.3. Dipole−Dipole and Orthogonal Multipolar (π-Hole) Interactions 2.2.4. Aromatic Interactions 2.2.5. All That Glisters Is Not Gold 2.3. The Systematics of Crystalline Assemblies 2.3.1. Motifs and Synthons 2.3.2. Crystal-Structure Architectures 2.3.3. Symmetry and Chirality 2.3.4. Z′ 2.3.5. Polymorphism 2.3.6. Cocrystals 2.3.7. Hydrates and Solvates 3. Design of Biologically-Active Molecules 3.1. Molecular Shapes 3.2. Molecular Recognition

3.3. The CSD as a Diverse Chemical Database 4. Emerging Applications 4.1. Challenges in Drug Development 4.1.1. Interactions and Packing 4.1.2. Risk of Polymorphism 4.1.3. Cocrystal Design 4.1.4. Morphology and Other Physical Properties 4.2. Other Industrially Relevant Applications 4.2.1. Energetic Materials 4.2.2. Paints, Pigments, and Dyes 4.2.3. Organic Semiconductors 4.2.4. Nonlinear Optical Materials 4.2.5. Ferroelectricity 4.2.6. Magnetic Anisotropy and Single-Molecule Magnets 4.2.7. Catalysts 4.2.8. Gas Storage and Separation 4.2.9. Thin Films and Coatings 4.2.10. Solar Thermal Fuels 4.3. Structure Solution 4.3.1. Macromolecular Crystal Structure Determination 4.3.2. Structure Determination from Powder Diffraction Data 5. Lessons from the Past and Prospects for the Future 5.1. Limitations of CSD-Based Research


Received: March 7, 2019


DOI: 10.1021/acs.chemrev.9b00155 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

5.1.1. Chemical Content of the CSD 5.1.2. Problems with Crystal Structures 5.1.3. Accuracy, Presentation, and Extent of CSD Data 5.1.4. The Relevance of CSD Data 5.1.5. Problems Endemic to Data Analysis 5.2. The Keys to Success 5.3. Future Challenges 5.3.1. The Advent of Big Data 5.3.2. The Continued Evolution of Crystallography 5.3.3. The Continued Evolution of Chemistry 6. Concluding Remarks Associated Content Supporting Information Author Information Corresponding Author ORCID Notes Biographies Acknowledgments References

1. INTRODUCTION

Long before there were people on the earth, crystals were already growing in the earth’s crust. On one day or another, a human being first came across such a sparkling morsel of regularity lying on the ground or hit one with his stone tool and it broke off and fell at his feet, and he picked it up and regarded it in his open hand, and he was amazed.
M. C. Escher

This Review will be published at about the time that the millionth structure is added to the Cambridge Structural Database (CSD).1 The CSD is the definitive collection of published small-molecule organic and metal−organic crystal structures and was founded in 1965 at the instigation of the famous physicist J. D. Bernal and his collaborator Olga Kennard. In the first decade or so of its existence, Kennard’s small group (the embryonic Cambridge Crystallographic Data Centre, CCDC) developed basic infrastructure for maintaining the CSD, including protocols for acquiring, checking, and storing crystal structure data and detecting duplicates. A lot of keyboarding of scarcely legible deposited material was involved. There was a backlog of structures to be processed, going back to the earliest X-ray determinations of carbon-containing compounds. In addition, the appearance of new structures had to be monitored so they could be added too. Each year’s input to the CSD was summarized in book form. These first few years of the enterprise were therefore busy and filled with essential work, but they had little impact on the outside world. (The books were appreciated. A leading crystallographer reviewed one of them and said it was good for propping doors open and pressing wild flowers.2 But his tongue was firmly in his cheek. He happened at the time to be the boss of one of the authors of this Review, so we can say from personal observation that he went through each new book religiously, looking for interesting new structures.)

Bernal and Kennard had shared a vision, as Kennard explained many years later: “We had a passionate belief that the collective use of data would lead to the discovery of new knowledge which transcends the results of individual experiments”.3 This seems obvious now, as things often do in retrospect. In fact, there were four very good reasons why, at the time, it was not obvious at all. First, there were few if any precedents. Older scientific databases may exist, but we do not know of any that were intended from the start to become research tools in their own right. Second, collecting all published structures would require the cooperation of many people across the globe. Such a thing is never to be taken for granted and must have looked particularly daunting at the height of the Cold War. Third, computers were in short supply in 1965 and by today’s standards ludicrously slow. Moore’s Law4 was only published that year and had yet to gain credibility. The Internet and World Wide Web must have been almost beyond imagination (we say “almost”; some claim that Mark Twain predicted it in 1898, ref 5). So it was by no means evident that computerized databases could become hugely important in science. The final reason was that solving crystal structures was not easy in 1965, so it was uncertain how large and therefore how useful the CSD could become. The intensities of X-ray reflections were often estimated by eye from photographs taken on Weissenberg cameras. Direct methods for solving the phase problem were still in their infancy. Structure solution involved models with plastic balls and metal sticks and usually took months, with complete failures not uncommon. Bernal and Kennard would have expected the methodology to improve, but the extent to which it has done so is breathtaking. With the current generation of equipment, a structural chemistry research group can now have their own benchtop X-ray diffractometer capable of determining the crystal structure of a small molecule in just a few hours.

The first signs that the founders’ vision would be fulfilled came in the late 1970s and early 1980s, when a few workers began to demonstrate that interesting results could be obtained by using the CSD as a research tool; some of these pioneering papers are mentioned later. Since then, there has been an ever-increasing number of scientists using the CSD for an ever-expanding variety of applications. It is this research that we review here. It has not been comprehensively surveyed for many years,6−8 and a great number of relevant papers have been published since then. To reduce our task somewhat, we focus on the last 10 years, though earlier papers are frequently mentioned to provide context.

We begin with studies aimed at clarifying fundamental issues: molecular structure and geometry, intermolecular interactions, and molecular assemblies (the systematics, symmetries, and topologies of crystal structures). While much of this is relatively basic research, it is still being actively pursued by numerous research groups and generates invaluable foundations for others to build on. Many of the older CSD-based papers on these topics have become citation classics, and we have no doubt that the same will prove true of numerous recent publications.

We then move on to the leading industrial application of the CSD, its use to aid the discovery of biologically active molecules and, in particular, pharmaceuticals. In this context, it primarily serves as a guide to conformational preferences and intermolecular interactions. Increasingly, however, data derived from the CSD are being used to drive other software applications, e.g., for scaffold-hopping and conformer generation.

We then cover emerging applications. The most mature of these is the use of the CSD in drug development, particularly formulation. Ever since the disastrous “disappearing polymorph” of ritonavir,9 pharmaceutical companies have been greatly interested in the crystal forms their drugs adopt, or might adopt. Is the known crystal structure of a development candidate its most stable polymorph? Could cocrystallization with a pharmaceutically acceptable additive address formulation problems? The use of the CSD to help investigate these and related issues has grown significantly over the past decade or so. In addition, recent papers report the use of the CSD for research aimed at designing commercially important materials such as dyes, energetic materials, crystalline porous materials, and organic semiconductors. Also, and in a nice quid pro quo, the CSD, built from solved crystal structures, is becoming increasingly important in aiding the solution of new structures. On the one hand, there is growing use of CSD data to restrain or validate ligand geometries in protein−ligand crystal structure analysis. On the other, crystal structure solution from powder diffraction data is becoming increasingly viable, is potentially of enormous value as an analytical technique, and is greatly assisted by using CSD information.

There are always downsides. Quite recently, one author summarized CSD research thus: “That purely statistical evaluations with data bases such as the CSD can be misleading is obvious”.10 Of course, the same could be said about a great many other research techniques that are nevertheless invaluable. It is important, however, to be open about the deficiencies of the technique we are espousing. Therefore, the latter part of our Review contains a discussion of the limitations of CSD-based research in particular and of database analysis in general. Conversely, we highlight key reasons why the CSD has been a success. These considerations are not merely of parochial interest.
At a time when scientific “big data” and machine learning are increasingly advocated, the quality of scientific data is more important than ever, and lessons can be learned from one of the oldest scientific databases around. We also consider the challenges that must be surmounted to keep the CSD fit for purpose in the ever-evolving world of chemical research and with the looming onset of big data. Perhaps to some, scientific databases appear boring and mundane. We aim to show that they are, in fact, a new generation of scientific instruments. The collective use of data does indeed lead to the discovery of new knowledge which transcends the results of individual experiments. Or as Aristotle more or less said, the whole is greater than the sum of its parts.

2. FUNDAMENTAL SCIENCE

2.1. Molecular Geometry and Structure

Because crystallography is the definitive method for determining molecular geometry and structure, it is no surprise that many CSD-based research studies have been focused on these topics. They include investigations into atomic radii, conformational preferences, metal coordination, crystallization propensity, and tautomeric preferences.

2.1.1. Atomic Radii. It may be simplistic to regard atoms as having radii, but Bondi’s 1964 publication on van der Waals (vdw) radii11 has been cited over 15 000 times. His radii were primarily based on intermolecular contact distances in a handful of crystal structures and were intended to enable calculation of molecular volumes. Now they are used for a multitude of purposes, including analysis of crystal packing and protein−ligand binding. Almost 50 years later, Alvarez derived a new set of vdw radii using a hugely greater number of structures taken from the CSD.12 A radius was assigned to each of the naturally occurring elements. For a given element, X, the value was determined from the crystallographic distribution of X···Y interatomic distances, where Y was a probe atom, usually oxygen. The distribution included vdw contacts and random, noninteraction X,Y pairs, often separated by long distances. Alvarez isolated the region of the distribution that corresponded to vdw contacts (illustrated in Figure 1 for the example pair Os,O) and determined the distance at which this subdistribution reached half its maximum height. The half-height distance was deemed to be where X and Y were in vdw contact, i.e. equal to the sum of their vdw radii. This definition was taken from an earlier study by Rowland and Taylor (R&T), who chose half-height distance for the pragmatic reason that it was the point on a vdw distribution that can most precisely be determined.13 It was not, of course, the definition used by Bondi, but the agreement between the Bondi, Alvarez, and R&T radii is surprisingly good.

Figure 1. Distribution of Os···O distances. The bonded pairs are in black, and the intermolecular contacts are in light blue (fitted by the blue line). The latter are deconvoluted into random pairs, increasing with distance cubed (dashed line), and vdw contacts (red line). Figure prepared for us by Professor Santiago Alvarez, author of ref 12, to whom we are very grateful.

While undoubtedly useful, the vdw radius not only has no universally accepted definition but also is based on assumptions that do not stand up to close inspection. One example is the assumption of perfect sphericity. Analysis of CSD-derived contact distances showed long ago that this is untrue for many terminal atoms (e.g., Cl, Br, I, S, and Se), which tend to be smaller along the extension of the covalent bond.14 The extent of this flattening was redetermined recently for several elements.15 It is a hot topic because the effect of any anisotropy of vdw shapes is convoluted with close atomic approaches due to “σ-hole” interactions (section 2.2.2). Another invalid assumption is that the radius of element X is the same in X···Y and X···Z contacts, where Y and Z are different elements. That this is only an approximation is shown by the vdw radius of hydrogen, which was determined as 1.20 Å by Bondi and Alvarez but only 1.10 Å by R&T. The reason is simple: the R&T value was determined from several different types of H···Y distributions (Y = H, C, F, etc.), whereas the Bondi and Alvarez values were determined exclusively from


H···H (or D···D) contact distances. Covalently bonded hydrogen atoms usually carry a small net positive charge, so H···H contacts are likely to be slightly lengthened by electrostatic repulsion. The opposite will occur in, for example, H···C contacts. Hence, R&T got a smaller value.

A number of other relevant publications have appeared. Hu et al. wrote a very helpful comparison of the different sets of vdw radii that have been published, some determined from crystallographic data, some from other sources.16 A study of the distributions of intramolecular nonbonded contact distances showed that, for most element pairs, the first percentile is well estimated by the sum of Bondi vdw radii minus 0.5 Å.17 Hirshfeld analysis of crystal structures determined at high pressure showed that H···H contacts do not appear to compress below 1.7 Å.18 At ambient pressure, about 1.8% of H···H contacts are shorter than 2.0 Å. Cordero et al. determined a new set of covalent atomic radii by analysis of bond lengths in the CSD.19 Their results showed clear and smooth periodicity, with the largest element in each period being the alkali metal and the smallest the halogen and noble gas. Most of the shrinkage occurs from group 1 to group 13.

2.1.2. Conformational Analysis. Using the CSD for conformational analysis is an attractive alternative or adjunct to popular theoretical methods such as density functional theory (DFT). The CSD has the advantage that it provides unequivocal evidence of observed conformations in a condensed phase. It can also confirm theoretically predicted relationships between conformations and bond lengths and angles. Here are a few illustrative examples:

(a) Ring Geometries. Khorasani et al. showed that singly substituted 12-membered cycloalkanes are surprisingly inflexible.20 With only one possible outlier, all of the rings examined adopted a square-like conformation with D₂ symmetry. Pérez et al. investigated the conformations of the 8-membered ring in the [M(μ-OPO)]₂ core of complexes in which transition metals (M) are double bridged by phosphate and related groups.21 The large number of CSD structures containing this fragment made it possible to reach detailed conclusions that we suspect would have been difficult to obtain in any other way. Claramunt et al. determined the degree of nonplanarity of the 7-membered ring of 1,5-benzodiazepine derivatives as a function of the substitution and protonation pattern.22 Even the mundane benzene ring has attracted attention recently. The influence of substituents on the ring’s degree of aromaticity was estimated by the extent to which the CC bond lengths differ from the value expected for perfect aromaticity. The highest reduction in aromaticity was found in meta-diamino and -dinitro benzene derivatives.23

(b) Conformation-Directing Interactions. The role of intramolecular C−H···π interactions in stabilizing gauche alkylaromatic bonds and axial alkylcyclohexanone conformations was inferred from theoretical and CSD studies.24,25 So too was the strong influence of intramolecular S···O interactions on the conformations of the carboxamides of sulfur-containing heterocycles.26 Of course, the most important conformation-directing interaction is the hydrogen bond (henceforth “H-bond”). Galek et al. found that over 95% of intramolecular H-bonded rings contain 5, 6, or 7 atoms.27 An exception is the preponderance of 8-membered rings when the intramolecular H-bond is of the type N−H···OS.28

(c) Relationships between Conformations and Bond Lengths. DFT and CSD analyses were performed on cyclopropane rings substituted by π-acceptors.29 Conjugation can occur between the two moieties and, if steric factors permit, the substituent adopts a cis- or trans-bisected conformation; carbonyl acceptors prefer the former (i.e., with the oxygen sitting “over” the ring) and vinyl the latter. The conjugation causes the ring bond distal to the substituent to shorten and the vicinal ones to lengthen. In contrast, the reverse happens if the ring is substituted with a σ-acceptor (e.g., halogen).30

(d) Comparison of Conformations in Different Environments. Raghavender compared the backbone conformations of amino acid residues in (a) small peptides bound to proteins in Protein Data Bank (PDB)31 structures and (b) unbound peptides in CSD structures.32 Aliphatic residues (Ala, Ile, Leu, and Val) occurred often enough in both to allow meaningful comparison and were found to show broadly similar geometric trends, but with the CSD residues being somewhat more conformationally variable. One reason may be that many of the CSD peptides are cyclic, with the ring-closure constraints forcing rotatable bonds into particular geometries.

It was argued long ago that multivariate statistical and pattern recognition techniques (e.g., factor analysis, multidimensional scaling) are helpful for performing conformational analysis with the CSD.33,34 However, they were not widely adopted. More recently, Parkin et al. resurrected the idea, illustrating how these techniques can provide conformational insights in an objective manner.35,36 The approach may finally become established when the CSD is used in large big-data projects (section 5.3.1). Interestingly, Parkin et al. used the Boltzmann distribution to infer conformational energy differences from the relative frequencies of conformers in the CSD.36 This was shown to be theoretically invalid a long time ago37 but may sometimes be a practicable approximation.

2.1.3. Standard Geometries and Geometry Libraries. Tabulations of CSD-derived average bond-lengths and -angles were compiled over 25 years ago and are heavily used.38−40 This type of work is still performed. For example, mean distances of covalent bonds to hydrogen were evaluated in 2010 from CSD neutron-diffraction structures.41 Arnautova et al. derived standard geometries for hexapyranoses as part of a project aimed at the simulation of glycan systems.42 Unfortunately, printed tables and ad hoc residue geometries are often insufficient for modern research, which needs comprehensive, continually updated, and computer-searchable geometry libraries. A step in this direction was the development by CCDC of Mogul,43 which can be used for rapid retrieval of CSD-derived bond-length, bond-angle, and torsion-angle distributions. The ability to retrieve simple (unfused, unbridged) ring geometries was added later.44 Mogul works by describing the substructural environment of a molecular feature (e.g., a rotatable bond) by a set of keys and then searching a key-indexed library of preprepared distributions. If the distribution retrieved for a particular feature contains too few observations, related distributions are retrieved and pooled together. This latter step works well but can occasionally be slow. Distributions retrieved from Mogul are commonly used in discussions of new crystal structures. However, they have several other applications, including (a) setting up refinement restraints for protein−ligand crystal structures (section 4.3.1),45,46 (b) validation of ligand geometries (section 4.3.1),47,48 (c) aiding the solution of three-dimensional (3D) structures by powder diffraction (section 4.3.2),49,50 (d) drug discovery (section 3.1),51 (e) crystal structure prediction,52 (f)


assignment of tautomeric forms,53 and (g) optimization of molecular geometries. The latter is done by converting Mogul distributions into smooth, differentiable probability density functions using kernel density estimation (Figure 2).54,55
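
The conversion of a torsion-angle distribution into a smooth objective function can be sketched in a few lines. The example below is a minimal illustration, not the Mogul implementation: the Review states only that kernel density estimation is used to obtain smooth, differentiable probability density functions. The periodic (von Mises) kernel, the concentration parameter `kappa`, the negative-log form of the objective, and the sample angles are all assumptions made for illustration.

```python
import math


def bessel_i0(x, terms=40):
    """Modified Bessel function I0(x) from its power series (fine for moderate x)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total


def torsion_density(observed_deg, kappa=10.0):
    """Build a periodic kernel density estimate from torsion angles in degrees.

    Each observation contributes one von Mises kernel; kappa (an assumed,
    hypothetical smoothing setting) controls how narrow the kernels are.
    The returned function gives the probability density per radian.
    """
    centres = [math.radians(a) for a in observed_deg]
    norm = 2.0 * math.pi * bessel_i0(kappa) * len(centres)

    def density(theta_deg):
        t = math.radians(theta_deg)
        return sum(math.exp(kappa * math.cos(t - c)) for c in centres) / norm

    return density


def objective(density, theta_deg, floor=1e-12):
    """Assumed objective: negative log-density, low where observations are common."""
    return -math.log(max(density(theta_deg), floor))


# Hypothetical torsion angles (degrees), clustered near 60 and 180.
observed = [58, 62, 65, 61, 178, -179, 180, 175, 177, -60]
p = torsion_density(observed)

for angle in (60, 120, 180):
    print(angle, round(p(angle), 4), round(objective(p, angle), 3))
```

Because the kernel is periodic, the density at −180° equals that at +180°, which a plain Gaussian kernel on the raw angle values would not guarantee. A geometry optimizer could then minimize the objective alongside other terms; how the published method weights or shifts it is not specified here.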

Figure 2. Mogul distribution for labeled torsion angle, fitted probability density function (solid line), and derived objective function for use in geometry optimization (dashed line); y-axis dimensionless. Adapted from ref 55. Copyright 2016 American Chemical Society.

An improved version of Mogul was developed for use in conformer generation.56,57 It is faster and can produce templates of low-energy geometries of simple, fused, or bridged-ring systems (i.e., models of the ring systems in favorable geometries), and each of its torsion distributions respects any symmetry or chirality in the substructural environment of the rotatable bond. Other research groups have also created torsion libraries for conformer generation, based either on the CSD alone or both the CSD and PDB. Manually defined substructures were used to generate the torsion angle distributions in one of the libraries.58,59 In contrast, the libraries developed by Sadowski and Boström60 and Kothiwale et al.61 were, like Mogul, generated algorithmically. The manual approach makes use of chemical know-how but necessarily produces smaller and less comprehensive libraries than those produced automatically. The library of Kothiwale et al. places a strong emphasis on multidimensional torsion distributions that relate to fragments containing more than one rotatable bond. The idea is to take into account correlations between the torsion angles of adjacent bonds.

2.1.4. Crystal Packing Effects on Molecular Conformations. Some interesting papers were published in the last 10 years on the effects of crystal packing forces on molecular conformations. Weng et al. reviewed flexible molecules that occur in more than one crystal structure (polymorphs, solvates, or cocrystals).62 As expected, conformational diversity was found to increase with the number of rotatable bonds in the molecule. Surprisingly, when the molecules were subdivided by the number of crystal environments in which they were observed (Nenv), the percentage that adopted only one conformation was about 60%, irrespective of Nenv. Common conformational changes were trans↔gauche and 180° flips of planar groups such as −CO2H. Many of the changes were forced by different H-bonding schemes. Conformational variability across different polymorphs or cocrystals was less, on average, than across differently solvated structures.

Thompson and Day estimated the strain energies of molecules in crystal structures using dispersion-corrected DFT (DFT-D).63 For each of 36 molecular geometries from the CSD, they calculated the difference in energy between the observed geometry and the nearest local minimum, and between that local minimum and the global minimum. The results indicated that these differences could be remarkably high, the largest value of each being over 20 kJ mol⁻¹. However, when the ostensibly strained, observed geometry of DADNUR was compared with the global minimum, it was noted that the former was extended and the latter folded (Figure 3; here and elsewhere, structures are referred to by the CSD reference code; details in the Supporting Information). This is explicable. The DFT-D calculations pertained to the gas phase, where the isolated molecule is likely to fold up to optimize attractive electrostatic and dispersion interactions. Conversely, extended conformations in crystal structures allow attractive interactions with neighboring molecules. The authors concluded that the calculated gas-phase energies were inadequate on their own; exposed surface area matters too.

Figure 3. Geometries of DADNUR: (a) gas-phase lowest energy and (b) crystallographically observed. Reprinted from ref 63 under Creative Commons License (https://creativecommons.org/licenses/by/3.0/). Published by Royal Society of Chemistry.

In another study, searches of the CSD and ab initio calculations were performed to find highly strained molecules.64 The calculations used a polarizable continuum model, which takes some account of solvent effects. Two types of molecules were found with high strain energies. The first were molecules such as biphenyl and bispyridinium, which have long been known to have an undue tendency to be planar in crystal structures.65 The second were cyclobutane and its derivatives, which are puckered in the gas phase but sometimes flat in crystal structures. Strain energies were calculated to be up to about 8−10 kJ mol⁻¹. It was noted that the strained planar conformations almost exclusively occurred for molecules sited on crystallographic inversion centers. The authors’ conclusion was summarized in the title of their paper: “Systematic conformational bias in small-molecule crystal structures is rare and explicable”.

The final paper focused on CSD hydrocarbon molecules situated on crystallographic special positions (almost always inversion centers).66 As in the previous study, there was a noticeable preference for some of these molecules to adopt strained planar geometries in the crystalline state, their gas-phase optimum geometries being very different and typically twisted (e.g., Figure 4). It therefore appears that molecules on inversion centers can have abnormally high strain energies.
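
To make the strain-energy bookkeeping used by Thompson and Day concrete, the sketch below splits a total strain energy into the two differences they describe: observed geometry to nearest local minimum, and local minimum to global minimum. The function names and the input energies are hypothetical and not taken from any cited study. As a rough sense of scale only, it also evaluates a simple Boltzmann factor at 298.15 K. That factor is an added assumption for illustration, and section 2.1.2 notes that Boltzmann arguments are at best an approximation for crystal-structure data.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def strain_components(e_observed, e_local_min, e_global_min):
    """Return (observed -> local minimum, local -> global minimum, total) in kJ/mol."""
    to_local = e_observed - e_local_min
    local_to_global = e_local_min - e_global_min
    return to_local, local_to_global, to_local + local_to_global


def boltzmann_factor(delta_e_kj, temperature_k=298.15):
    """exp(-dE/RT): relative weight of a state lying delta_e_kj above a reference."""
    return math.exp(-delta_e_kj / (R_KJ * temperature_k))


# Hypothetical relative energies in kJ/mol (illustrative values only).
to_local, local_to_global, total = strain_components(27.0, 15.0, 0.0)
print(to_local, local_to_global, total)

# Scale check for strain energies of the size quoted in the text.
for delta in (10.0, 20.0):
    print(delta, f"{boltzmann_factor(delta):.2e}")
```

On this gas-phase yardstick, differences of 10 to 20 kJ/mol would be strongly disfavored, which is consistent with the authors' point that gas-phase energies alone are inadequate and that interactions with neighboring molecules must compensate.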
Convincing examples of high strain for molecules not on special positions are much harder to find; one interesting example was reported in 2012 by Back et al.67

2.1.5. Metal Coordination. 2.1.5.1. Ligand Coordination Modes. Over half of the CSD comprises crystal structures of metal−organic compounds, so it is the definitive source of


information about many aspects of metal coordination. Its most common use is probably to compare and classify ligand coordination modes, something often done in the course of discussing new structures. An illustrative example is an analysis of azide, thiocyanate, and cyanate binding to first-row transition metal ions.68 It showed that the ligands are usually terminal but can also be end-on (μ-1,1) or end-to-end (μ-1,3) bridging; (μ-1,1) is more common for azides and (μ-1,3) for thiocyanates. In the (μ-1,1) mode, azides usually bridge symmetrically while the other ligands can be symmetric or asymmetric, an observation that can be explained in terms of the ligand orbitals (σ or π) involved in the bonding.

Figure 4. Observed geometries of two molecules sited on inversion centers (top) and their calculated ideal geometries (below). Reprinted from ref 66. Copyright 2012 American Chemical Society.

2.1.5.2. Metal Coordination Numbers. A CSD investigation into metal coordination numbers looked at their dependence on the size, charge, and charge-accepting ability of the metal and the size, charge, charge-donating ability, and denticity of the ligand.69 For a given type of ligand donor atom, it was concluded that the size of the metal is more important than its charge in determining coordination number (but alkali metals may be an exception70). Conversely, for a given metal, the ligand’s charge and charge-donating ability is more important than its size. Almost a hundred types of metal ion were studied, all of which were found to adopt more than one coordination number. Unsurprisingly, odd coordination numbers were less common than even. In a separate study by Kuppuraj et al.,71 the preferred coordination geometries of 63 types of metal ions were determined.

2.1.5.3. Bond Lengths. The primary aim of the study just cited was to elucidate how metal−ligand (M−L) bond lengths depend on the properties of the metal cation and the ligand donor atom. A large sample of bond lengths from the CSD was subdivided by the coordination numbers of the metal ion (CN) and the ligand (LCN). For a given (CN, LCN) pair, M−L distances going down a group or across a row of the periodic table were found to be linearly correlated with the metal ionic radius. The metal ionic radius depends, in turn, on oxidation state, spin state, and CN.

A 2013 study used CSD data to quantify the trans effect using −Cl and −PPh3 as probe ligands (PL).72 The average M−PL bond lengths in d⁸ square-planar and low-spin d⁶ octahedral complexes were determined as a function of the ligand trans to PL. Some ligands had little or no effect on the metal−PL bond (e.g., pyridine, chloride); at the other extreme, ligands such as hydride, phenyl, and triphenylphosphine lengthened the bond significantly (>0.1 Å, implying an approximately 30% reduction in bond order). The results supported the established view that the trans effect is strongest for strong σ-binding ligands. However, the π-bonding ability is also a factor. For several ligands, the octahedral complexes showed slightly stronger trans effects than the square planar.

A project published in the same year examined bond length−bond strength correlations by comparing the distances of metal bonds to alkoxide, carboxylate, and azolate with those of the corresponding bonds to alcohol, carboxylic acid, and azole.73 As expected, the anionic ligands tend to form shorter bonds than their neutral analogues, typically by 0.02−0.05 Å. However, the differences are relatively small, indicating that neutral and anionic ligands do not form two distinct classes of metal−ligand bonds. In another study, Holland found that O−O and N−N distances of metal-coordinated O₂ and N₂ are unreliable guides to the oxidation state of the attached metal because they may be artificially shortened by libration.74

2.1.5.4. Symmetry and Shape. Alvarez et al. have published an extensive and elegant series of papers on metal coordination symmetry and shape.75−80 Given a metal complex, they determine which polyhedron best describes the coordination geometry and how big the distortions are from this ideal polyhedron. The latter is quantified by finding the best superposition between the actual geometry and the ideal polyhedron and measuring the deviations between actual and ideal vertices by a parameter termed the continuous shape measure (CShM).81 The method has been used to characterize the geometries of, e.g., 9- (Figure 5) and 10-

Figure 5. Two 9-coordinate complexes. [Pu(NCMe)9]3+ (left) is a capped square antiprism (CSAPR), while [Nd(H2O)9]3+ (right) is on the interconversion pathway between CSAPR and tricapped trigonal prism. Reprinted with permission from ref 76. Copyright 2008 Wiley-VCH Verlag GmbH & Co. KGaA.

coordinate compounds,76,78 Jahn−Teller distorted Cu(II) complexes,77 and complexes involving double or triple metal−ligand bonds.79 Davis et al. made a similar analysis of 3-coordinate metal complexes, showing that actual geometries are usually quite different from any of the textbook ideals (trigonal planar, T-shaped, and trigonal pyramidal).82 2.1.5.5. Spin States. When spin crossover occurs in the crystalline state, the resulting geometry changes can alter crystal symmetry. A recent review of this phenomenon, based heavily on examples taken from the CSD, focused on complexes of first-row transition metals.83 The characteristic crossover behavior is an abrupt transition to another space group as the temperature is increased. Also possible is a lowering of crystal symmetry to accommodate a mixed high-spin/low-spin state. The first type of behavior is relevant to the design of spin-crossover materials that can exhibit useful properties (e.g., ferroelectricity) in one of their phases. 2.1.5.6. Ligand Cone Angles. The steric requirements of a monodentate ligand are often measured by its cone angle. The concept has now been extended to bidentate ligands, the cone angles of which depend on the ligand bite angle as well as the

DOI: 10.1021/acs.chemrev.9b00155 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

steric bulk of the ligand.84 Values of this parameter were listed for over 280 different phosphanes. 2.1.5.7. Polynuclear Complexes. A CSD search established that polynuclear complexes containing an even number of metal atoms are significantly more common than those with an odd number.85 Suggested reasons included a possible preference for high-symmetry complexes. The observation is reminiscent, however, of an earlier observation that molecules with even numbers of carbon atoms are more common than those with odd numbers;86 at least one synthesis-related explanation of this has been posited.87 Novikov performed theoretical calculations on CSD entries with short Ni···Ni contacts and concluded that the interactions are attractive and have significant covalent contributions when ligand-supported but usually not when ligand-unsupported.88 Other investigations include a survey of Ni(II) and Co(II) cubanes89 and an extensive analysis of semibridging carbonyl ligands.90 2.1.5.8. Magnetism. The relationships between magnetic properties and molecular structure can sometimes be clarified with the help of the CSD. An example is provided by a study of dinuclear bis(phenoxo)-bridged Cu(II) complexes.91 Most exhibit antiferromagnetic coupling, but a small number are ferromagnetic. DFT calculations and a survey of relevant CSD structures provided some insight, though the picture was complex. The coupling had a large dependence on the Cu− O−Cu angle for planar complexes. However, when the phenoxo groups were tilted strongly out of plane, the dependence on Cu−O−Cu was small. This was explained in terms of energy crossing of the two magnetic orbitals. Four geometrical characteristics were listed: two associated with ferromagnetic and two with antiferromagnetic coupling in these complexes. 2.1.5.9. Software Parametrization. The CSD can provide information for the parametrization of (semi)empirical programs (and force fields; section 3.1). 
For example, bond-length distributions were used to extend the semiempirical PM3 method to lanthanides. The ultimate aim was to assist the design of luminescent and other commercially important materials.92 CSD data were used to derive metal−organic bond valence parameters for metals with different spin states93 and for alkali− and alkaline-earth−oxygen pairs.94 CSD structures for which experimental sublimation energies are available were used to extend and validate the parametrization of PIXEL so that it could handle molecules containing transition metals.95 (PIXEL is a popular semiempirical program for calculating lattice energies.96) Finally, a fragment library has been derived from the CSD, together with fragment connection rules. The ambitious aim is the automated design of realistic, synthetically accessible organometallic molecules.97 2.1.6. Crystallization Propensity. An interesting 2015 paper described an empirical model for predicting crystallization propensity.98 It was developed using two training sets of molecules. The first was taken from the CSD, so the molecules had a proven ability to form crystals suitable for single-crystal diffraction. The other comprised molecules absent from the CSD; it could confidently be assumed that some would be unable to form good crystals. A large variety of descriptors were calculated for each molecule from its two-dimensional structure (e.g., connectivity indices). Those most useful for discriminating between the two sets of molecules were determined by use of support vector machines. The final model had about 80% classification accuracy, determined by

crystallization experiments on a small sample of non-CSD molecules. Only two of the descriptors were used in the model: the rotatable bond count (obviously measuring molecular flexibility) and a connectivity index whose value correlated with molecular volume. Subsequently, the same authors invented an improved molecular-flexibility index, which was the most predictive descriptor of all.99 2.1.7. Tautomerism and Proton Transfer. One of the most amusing things about James Watson’s book The Double Helix is his admission that he and Crick wasted their time trying to build DNA models using the wrong tautomeric form of guanine. The crystallographer Jerry Donohue put them right, and the rest is history, a particularly dramatic example of the importance of understanding tautomeric preferences. The CSD is an obvious place to look for enlightenment, though it must be done with care as hydrogen misplacement is not uncommon. Henry used CSD structures of sulfonamides and sulfonimides to illustrate how the tautomeric form adopted in the crystalline state can have a huge effect on the H-bonding network.100 Nanubolu et al. noticed that conjugation has a pronounced influence on whether amino or imino forms occur in the CSD.101 Cruz-Cabeza et al. compared the lowest-energy tautomers of various heterocycles (calculated with MP2 and a polarizable continuum model) with those observed in the CSD, finding good but not perfect agreement.102 Reasonably enough, discrepancies occurred when the energy difference between alternative tautomeric forms was small. Another survey found 108 molecules that crystallize in two different tautomeric forms.103 This usually happens when tautomer pairs occur in the same crystal structure; it is very rare for different polymorphs of a compound to contain different forms. Milletti and Vulpetti chose 13 ring systems capable of tautomerism and deduced the forms each adopted in several protein−ligand complexes.104 This was done by examining H-bonding networks. 
They then compared their results with the tautomers observed in water, the gas phase, and the CSD. There was a good consensus, but with the occasional discrepancy, e.g. the adenine tautomer favored in water, the gas phase, and the CSD was less common in the PDB than the alternative form. On a related theme, Cruz-Cabeza studied over six thousand crystal structures containing ionized or un-ionized acid−base pairs. She was successful at correlating the occurrence of proton transfer with the difference in the calculated aqueous pKa of the two species (ΔpKa = pKa[protonated base] − pKa[acid]).105 Thus, ionized forms were found if ΔpKa > 4; un-ionized if ΔpKa < −1; and between these limits, increasing ΔpKa by 1 increased the probability of proton transfer by about 17%. The rule was supported by a subsequent test on a matrix of acid−pyridine cocrystals.106 2.2. Intermolecular Interactions
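The ΔpKa rule quoted at the end of section 2.1.7 can be written as a small piecewise function. The following is a minimal sketch, not the published model: the slope of 17% per pKa unit and the two limits (−1 and 4) come from the text, whereas the intercept of 28% is an assumed value introduced here only so that the line can be evaluated, and treating the outer regions as exactly 0% and 100% is a simplification.

```python
def proton_transfer_probability(delta_pka, slope=17.0, intercept=28.0):
    """Rough percentage chance that an acid-base pair crystallizes ionized.

    delta_pka : pKa[protonated base] - pKa[acid]
    slope     : % increase per unit of delta_pka (value quoted in the text)
    intercept : % at delta_pka = 0 (ASSUMED here for illustration)
    """
    if delta_pka > 4:
        return 100.0  # text: ionized forms were found above this limit
    if delta_pka < -1:
        return 0.0    # text: un-ionized forms were found below this limit
    return slope * delta_pka + intercept


if __name__ == "__main__":
    for value in (-2.0, 0.0, 1.0, 3.0, 5.0):
        print(value, proton_transfer_probability(value))
```

Keeping the slope and intercept as parameters makes it easy to substitute fitted values from the original study.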

The CSD has had a greater impact on the study of intermolecular interactions than on any other topic. Many seminal studies have been published,107 and interest in the area shows no sign of abating. As we will see, controversy surrounds some of the weak interactions that have been investigated recently. 2.2.1. Hydrogen Bonds. Because the H-bond is by far the most important intermolecular interaction, it is unsurprising that a great many CSD research studies have focused on it, including several in the last 10 years or so. One looked at


H-bond “coordination numbers”, i.e. the number of H-bonds that a donor or acceptor can simultaneously form.108 The distributions of this parameter were determined for over 70 different types of acceptors and donors. For example, amide carbonyl oxygen atoms were found to accept 0, 1, 2, and 3 H-bonds on 763, 1214, 189, and 21 occasions, respectively. Another paper pointed out that about 2.5% of organic structures in the CSD contain no H-bonds, despite the presence of strong donor and acceptor groups.109 In about two-thirds of these, steric factors were deemed to be responsible. In carbamazepine, for example, one of the two N−H hydrogen atoms has a low (13%) surface area accessibility, explaining why about a quarter of carbamazepine crystal structures have unsatisfied donor hydrogens. Steric inaccessibility affects the H-bonding capabilities of various donor and acceptor groups to different extents. Other reasons discerned for an absence of conventional H-bonds were the formation of H-bonds to π-systems instead and the sacrifice of good H-bonding to achieve close packing (Figure 6).
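The amide carbonyl counts quoted above can be turned into a distribution and a mean H-bond coordination number with a few lines of arithmetic. This is only an illustrative sketch; the function name is hypothetical and the only data used are the four counts given in the text.

```python
def summarize_hbond_counts(counts):
    """Return (total, fractions, mean) for a {n_hbonds: occurrences} mapping."""
    total = sum(counts.values())
    fractions = {n: c / total for n, c in counts.items()}
    mean = sum(n * c for n, c in counts.items()) / total
    return total, fractions, mean


# Occurrences of amide C=O oxygen accepting 0, 1, 2, and 3 H-bonds (from the text).
AMIDE_CARBONYL = {0: 763, 1: 1214, 2: 189, 3: 21}

if __name__ == "__main__":
    total, fractions, mean = summarize_hbond_counts(AMIDE_CARBONYL)
    print("observations:", total)
    for n in sorted(fractions):
        print(f"{n} H-bonds: {100 * fractions[n]:.1f}%")
    print(f"mean coordination number: {mean:.2f}")
```

On these numbers, a little over half of the oxygen atoms accept exactly one H-bond and the mean is about 0.76.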

1.8 Å if the donor water is metal bound.114 The calculated H-bond energies are stronger for the latter, even if the aqua complex is neutral. It was subsequently inferred that water molecules in tetrahedral complexes are stronger donors than those in octahedral species.115 It is well-known that O/N−H···F−C H-bonds (O/N = O or N) rarely occur in small-molecule crystal structures, fluorine being an unexpectedly weak acceptor.116 A recent CSD survey showed that when such interactions do occur, it is usually because structures contain insufficient good (O/N) acceptors or molecules have unusual packing difficulties (e.g., tertiary alcohols).117 It was noted, however, that the average (O/N−H donor)/(O/N acceptor) ratio is much higher in proteins than in CSD organic structures. This may make H-bonds to fluorine more likely in protein−ligand complexes than in small-molecule structures. Using 19F NMR isotropic chemical shifts, Dalvit and Vulpetti showed that O/N−H···F−C H-bonds in the CSD have a propensity to involve fluorine atoms that are highly shielded, i.e. particularly electron-rich.118 The latest IUPAC definition of the H-bond includes very weak interactions.119 Accordingly, some consider C−H···F−C to be a H-bond, partly because it has been claimed to be significant in controlling crystal packing arrangements.120−122 Combined CSD and theoretical studies of the interaction found, among other things, an energetic preference for C−H···F linearity.123,124 However, Gavezzotti and Lo Presti performed PIXEL calculations on CSD structures containing only C, H, and F and concluded that the relevance of C−H···F interactions to crystal packing was the exception, not the rule.125 A similar conclusion was drawn for C−H···Cl interactions. 
On the other hand, C−H···F and C−H···Cl interactions occur appreciably more often in crystal structures than would be expected at random from surface area considerations, suggesting that their role in determining packing arrangements is significant.126 In response to this finding, Lo Presti compared about 250 pairs of polymorphs with no strong H-bond donors. He agreed that interactions such as C−H···X (X = N, O, S, F, Cl) occur more often than would be expected at random but pointed out that specific contacts of this type are seldom conserved between polymorphs.127 Other interesting CSD studies were directed at the correlation between H-bonding propensities in crystal structures and solution-phase free energies;128,129 the remarkably strong tendency for ethynyl groups to donate and accept;130 the H-bonding abilities of N-oxides and nitroxide radicals;131,132 the distribution of donor-H···acceptor angles;133 and the tendency for ortho hydrogen atoms on aromatic rings to donate to the same oxygen acceptor, a partial explanation for the typically nonlinear geometry of C−H···O interactions.134,135 2.2.2. σ-Hole Interactions. Halogen bonds have been known for a long time and were investigated in early CSD studies.136,137 However, they were widely ignored until the evangelizing work of Metrangolo and Resnati at the start of this century.138,139 Since then, interest in halogen bonds and other “σ-hole” interactions has risen enormously140 and numerous research studies have been published, many using the CSD. The interaction is between a σ-hole donor and an electron-rich acceptor, ER (e.g., ER = N, O, halogen). The σ-hole is a patch of positive electrostatic potential along the extension of a bond such as C−X (X = halogen), C−Ch (Ch = chalcogen), and C− Pn (Pn = pnictogen, or “pnicogen” as some prefer). Its origin

Figure 6. Molecules that forsake H-bonding to achieve (top) stacking and (bottom) close packing of awkward shapes. Adapted with permission from ref 109. Copyright 2010 Royal Society of Chemistry.

Two studies compared the geometries of H-bonds involving C=S and C=O.110,111 The first showed that H-bonds to thiocarbonyl are typically longer than those to carbonyl by ∼0.25 Å after correcting for the different vdw radii. C=S···H angles tend to be smaller than C=O···H by a remarkable 25−30°. The other investigation found that H-bonds can be donated to the π-electron system of the thiocarbonyl group, with the donor approaching approximately orthogonally to the thiocarbonyl plane, this being particularly likely when structures contain insufficient conventional acceptors. The differences in oxygen and sulfur H-bonding were further demonstrated by Corpinot et al.112 They compared structures containing saccharin and thiosaccharin and concluded that replacing carbonyl by thiocarbonyl can give isostructural crystals, but only if the replaced carbonyl oxygen is not involved in H-bonding. Selenocarbonyl groups were found to accept H-bonds when in an electron-rich environment, e.g. in selenoureas.113 H-bonds involving water still attract attention. Andrić et al. found that H···O distances for water···water interactions usually lie in the range of 1.8−2.0 Å, but they drop to 1.6−


has been explained by others141 and will not be dwelt on here. We note, however, that heavier elements, being more polarizable, can form more electropositive σ-holes, especially when they are in electron-withdrawing environments. 2.2.2.1. Halogen Bonds. Historically, considerable CSD-based research was focused on interhalogen (C−X···X−C) interactions, particularly those in which both atoms were the same halogen.142−144 The interaction geometry is defined by the X···X separation and the two C−X···X angles, θ1 and θ2. Two geometries predominate, Type I with θ1 ≈ θ2 (i.e., Δθ ≈ 0°) and Type II with θ1 ≈ 180°, θ2 ≈ 90° (i.e., Δθ ≈ 90°).142 The latter are halogen bonds, with the σ-hole of one X pointing to an electronegative region of the other; the former are the products of close-packing. More recent papers discussed further aspects of X···X interactions. For example, Type I contacts occur most often in space groups with inversion centers or symmetry planes, whereas Type II are associated with glide planes and screw axes.145 For I···I contacts, Type I predominates at short I···I distances, Type II at intermediate distances, and Type I again at long distances.146 The distributions of θ1 and θ2 require geometric (“cone”) corrections, and when this is done, the propensity for Type II contacts becomes more apparent for Cl···Cl, Br···Br, and especially I···I, but not for F···F (Figure 7).147,148 In fact, fluorine is generally believed not to form σ-holes because it is difficult to polarize. Even on the very rare occasions when F···F contacts are in the Type II geometry, they are sometimes attributable to packing effects.150 Occasional claims that F···F contacts are significant in stabilizing crystal structures are controversial.151 Esterhuysen et al. suggested that CF3 groups can form stabilizing F···F contacts, pointing to their relatively high occurrence in the CSD and ascribing this to polarization and dispersion but not halogen bonding.152 However, Metrangolo et al. 
presented theoretical evidence that fluorine can form a σ-hole, and therefore donate halogen bonds, when in exceptionally strong electron-withdrawing environments. Some of the putative examples they found in the CSD are reasonably convincing (Figure 8).153,154 Other aspects of halogen-bond geometry were investigated in three independent and approximately contemporary studies.155−157 In the most extensive of these, Le Questel et al. compared the “normalized distances” of various types of halogen bonds, i.e., the X···ER distance divided by the sum of the atoms’ vdw radii. The mean normalized distances of halogen bonds donated by C(sp)−I were noticeably shorter than those donated by C(sp2)−I and C(sp3)−I, which were roughly equal. This suggests that iodine bonded to acetylenic carbon donates the strongest halogen bonds. Indeed, 96% of C(sp)−I groups formed halogen bonds, but only 58% and 31% of C(sp2)−I and C(sp3)−I groups, respectively. There was some evidence that shorter halogen bonds tend to be more linear. C(sp2)−I halogen bonds to nitrogen appeared to have smaller normalized distances than those to other elements. Finer subdivision of the acceptors revealed other trends, e.g. bonds to aliphatic amines were shorter than those to anilines. By assuming an inverse strength−length correlation, different types of atoms were ranked by their halogen-bond acceptor ability. There was also evidence that C−I and I···ER distances are inversely correlated. Conversely, Ji et al. investigated the possibility of “improper halogen bonds”, where halogen bonding causes the covalent bond formed by the donor halogen to shorten rather than lengthen.158
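The “normalized distance” used above is simply the X···ER separation divided by the sum of the two vdw radii, so values below 1 indicate contacts shorter than the vdw sum. A minimal sketch follows. The text does not say which radii were used, so the Bondi values in the table below are an assumption, and the function name is hypothetical.

```python
# Bondi van der Waals radii in angstroms (ASSUMED choice of radius set).
BONDI_VDW = {
    "N": 1.55, "O": 1.52, "F": 1.47, "S": 1.80,
    "Cl": 1.75, "Br": 1.85, "I": 1.98,
}


def normalized_distance(donor, acceptor, distance):
    """Contact distance divided by the sum of the two atoms' vdw radii."""
    return distance / (BONDI_VDW[donor] + BONDI_VDW[acceptor])


if __name__ == "__main__":
    # Hypothetical I...N contact of 2.80 angstroms, for illustration only.
    print(round(normalized_distance("I", "N", 2.80), 3))
```

Because the ratio is dimensionless, contacts involving different halogens and acceptors can be compared on one scale, which is what allows the ranking of acceptors described in the text.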

Figure 7. (a) Distributions of Δθ (see text) for F···F, Cl···Cl, Br···Br, and I···I contacts; (b) the same distributions after correction for geometric factors149 showing that the heavier halogens (I and Br) have a distinct preference for Type II geometry, but fluorine does not. Reprinted from ref 147. Copyright 2016 American Chemical Society.
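The quantity plotted in Figure 7, Δθ = |θ1 − θ2|, can be used to sort an individual X···X contact into Type I (Δθ near 0°) or Type II (Δθ near 90°). The sketch below does only that. The 15° tolerance is an arbitrary illustrative cutoff rather than a value from the text, and the geometric (“cone”) correction mentioned in the caption is not attempted.

```python
def classify_xx_contact(theta1, theta2, tolerance=15.0):
    """Label a C-X...X-C contact from its two C-X...X angles (degrees).

    tolerance is an ASSUMED cutoff (degrees) around the ideal values of
    delta-theta = 0 (Type I) and delta-theta = 90 (Type II).
    """
    delta = abs(theta1 - theta2)
    if delta <= tolerance:
        return "Type I"
    if abs(delta - 90.0) <= tolerance:
        return "Type II"
    return "intermediate"


if __name__ == "__main__":
    # Hypothetical angle pairs, for illustration only.
    for angles in ((150.0, 150.0), (175.0, 92.0), (160.0, 120.0)):
        print(angles, classify_xx_contact(*angles))
```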

An enjoyable paper by Troff et al. reviewed the halogen-bonding patterns in crystal structures.159 Many beautiful motifs were illustrated, involving a wide variety of acceptors (e.g., N, O, Se, halide, and triiodide); an example is reproduced in Figure 9. Three reports discussed halogen bonds that are bifurcated at the acceptor or the donor.151,160,161 That the former occurs is unsurprising, but it was not obvious that the latter would be possible given that halogen bonding is very directional at the donor. An important CSD survey established that halogen bonds are pervasive in the crystal structures of transition metal, main group metal, and metalloid complexes.162 Halogen atoms in metal complexes can be the donor or acceptor, and metal coordination can lead to much stronger interactions.163 Hypervalent halogen species can also participate in halogen bonding.164−166 The enthusiasm with which halogen bonds are regarded is not universal. After performing CSD analyses and calculations with the respected PIXEL program, Gavezzotti opined that “short halogen−oxygen and −nitrogen contacts are restricted to systems with peculiar electronic and steric properties”.167 He concluded that halogen bonds are hardly competitive with H-bonds. Subsequently, he and Carlucci emphasized the


Figure 8. Putative F···O halogen bonds in (A) GEDLEH, (B) VIHNEF, and (C) SIZPIB. Reprinted from ref 153. Copyright 2011 American Chemical Society.

analyses of crystal structures.170 This study got right to the heart of the matter: when electrophiles approach divalent sulfur (Y−S−Z), they do so approximately 20° from the perpendicular to the Y−S−Z plane; when an electron-rich atom approaches, it does so approximately along the extension of one of the sulfur covalent bonds. The latter is what is now called a chalcogen bond. Some 25 years later, an extensive CSD analysis produced the same conclusion and also showed that the sulfur atom in an S···O=C interaction can be positioned approximately in or perpendicular to the carbonyl plane.171 Both studies noted that the approach direction of electron-rich atoms is along either the S−Y or S−Z σ* orbital, suggesting that the authors suspected orbital mixing between this and the HOMO of the electron-rich atom. Nowadays, it is more usual to focus on the electrostatic attraction between the electron-rich atom and the sulfur σ-hole positioned on the extension of Y−S or Z−S.172 S···O and S···N interactions are frequently intramolecular (perhaps because the σ-holes are often sterically unable to form intermolecular contacts) and can exert a major influence on molecular conformations. This was noted, for example, in several crystal forms of the antibiotic sulfamethizole.173 The use of S···O and S···N “conformational locking” has been recommended as a useful tool in drug design.174 Nevertheless, it is easy to find intermolecular chalcogen bonds. The likelihood of their occurrence is enhanced if the sulfur atom is highly polarized by nearby electron-withdrawing moieties. 
Thus, a search of the CSD for intermolecular nitrogen or oxygen contacts to thiophene sulfur with C−S···O/N ≥ 160° found 242 examples in the 5132 structures in which the interaction was theoretically possible, a hit rate of 4.7%.175 Corresponding hit rates for contacts to thiazole, 1,3,4-thiadiazole, and thiazolium sulfur were 9.3, 13.5, and 31.4%, respectively, rising as the sulfur environment becomes increasingly electron-withdrawing. Similarly, Nayak et al. found that fluorination promotes chalcogen bonding.176 By analogy with the halogen bond, the heavier chalcogens might be thought more likely to form σ-hole interactions.

Figure 9. Packing motif with iodine···chloride halogen bonds in DAKVES. Reprinted with permission from ref 159. Copyright 2013 Wiley-VCH Verlag GmbH & Co. KGaA.

importance of electron-withdrawing substituents to polarize adjacent halogen atoms and thereby improve their halogen-bond donating ability.168 Another statistical analysis of the CSD showed that the number of Cl···O and Cl···N interactions in organic crystal structures is not significantly higher than what would be expected at random, unless the chlorine is in a polarizing environment.169 2.2.2.2. Chalcogen Bonds. The chalcogen bond has been less thoroughly investigated than the halogen bond but nevertheless was the subject of one of the first systematic


Figure 10. Observed intermolecular contacts in (left) TETXUL01, (center) ZEKPOW02, and (right) ZAQCOL01 that could be ascribed to H-bonding or pnictogen/tetrel bonding.

approximately along this axis. Of course, the nucleophile in an SN2 reaction approaches roughly the same way. Politzer et al. acknowledged that there is nothing new about these interactions; they were recognized and studied long ago.184 But what is new, they declared, is the explanation in terms of σ-holes. Thomas et al. searched the CSD for possible carbon bonds involving methyl carbon as the donor, i.e., −CH3···ER.185 They found many, but the stability of the contacts could be ascribed to carbon bonding or C−H···ER H-bonds. Echoing the study described in the preceding section, they collected high-resolution data sets, this time for two different structures, and subjected the resulting charge density models to AIM analysis. In one of the structures, ZEKPOW02 (Figure 10, center), a BCP was found only for one of the C−H···ER interactions. In the other, ZAQCOL01 (Figure 10, right), none of the C−H···ER interactions had BCPs but the C···ER interaction did. They concluded that the latter was a carbon bond. Once again, we consider this debatable. Other putative carbon bonds to sp3 carbon were picked out from the CSD by Bauzá et al.186,187 Quiñonero presented evidence that carbon bonds can be formed by methylene carbon, =CH2, the interaction forming approximately in the CH2 plane and on the extension of the =C bond.188 Possible examples were found in the CSD. Once again, interpretation of these interactions is complicated by the simultaneous presence of C···ER and C−H···ER. Investigations have also been performed into tetrel bonds involving the heavier group 14 elements.189−193 As with pnictogens, the σ-hole explanation is challenged by the possibility of hypervalency. 2.2.2.5. Aerogen Bonds. Bauzá and Frontera searched the CSD for short intermolecular contacts to noble gases.194 They found that the crystal structure of the pyramidal molecule XeO3 contains three Xe···O interactions longer than the sum of covalent radii but shorter than the sum of vdw radii. 
These, they concluded, were σ-hole interactions (“aerogen bonds”), a hypothesis they supported with calculations of electrostatic potentials. A second putative interaction was found in the structure of XeF2O cocrystallized with CH3CN. They noted that it had been described by the original authors as a coordination Xe(IV)−N bond. 2.2.3. Dipole−Dipole and Orthogonal Multipolar (π-Hole) Interactions. In 1998, Allen et al. surveyed the geometries of intermolecular ketonic carbonyl···carbonyl interactions in organic crystal structures.195 They found three common arrangements: slightly sheared antiparallel, with two short C···O contacts (i.e., antiparallel dipole−dipole stacking); highly sheared parallel, with one short C···O contact; and approximately orthogonal, with one short C···O. The first was the most common, and its energy was calculated to be comparable with that of medium-strength H-bonds. (A very

Certainly, many short contacts between selenium or tellurium and atoms such as N, O, P, and halogens have been found in the CSD.177,178 A difficulty with these heavier elements, however, is that it becomes increasingly difficult to distinguish between chalcogen bonds and interactions with a high degree of covalency. 2.2.2.3. Pnictogen Bonds. Politzer et al. searched the CSD for interactions between trivalent nitrogen, phosphorus, or arsenic and possible pnictogen-bond acceptors.179 They required the putative acceptor to be approximately opposite one of the N/P/As covalent bonds, which is where the σ-holes are positioned. Very few interactions were found for nitrogen (all to fluorine), but an appreciable number were found for the more polarizable phosphorus and arsenic. Because the N···F interactions were all rather long, most of them were dismissed as possible pnictogen bonds. Conversely, many of the contacts to phosphorus and arsenic were close to or smaller than the sum of vdw radii, and it was concluded that the majority of these were indeed σ-hole interactions. The authors noted that an earlier CSD survey of short contacts between trivalent antimony or bismuth and electronrich atoms had been performed.180 The large majority of contacts found were trans to an electron-withdrawing substituent, which Politzer et al. argued was the preferred direction for a pnictogen bond. However, the original authors of the study referred to them as intermolecular hypervalent interactions, so-called “secondary bonds”. Also of interest is a survey of As, Sb, and Bi interactions with arenes published in 2010.181 Sarkar et al. made a determined effort to find pnictogen bonds to nitrogen.182 They found several CSD structures containing possible interactions of this type and chose one of them for further study, a cocrystal between chloracetic acid and an aminopyridine derivative. 
The contact in this structure was C−Cl···NH2R (Figure 10, left), which might perhaps be stabilized more by the N−H···Cl contacts than the N···Cl “pnictogen bond”. They therefore redetermined the structure from a high-resolution X-ray data set, performed electrostatic potential calculations, and applied “atoms in molecules” (AIM) theory to the experimental charge density. This confirmed the presence of a σ-hole on the nitrogen. Further, they concluded that a (3, −1) bond critical point (BCP) existed between the N and Cl atoms but not between the H atoms and Cl. They deduced from this that the contact is a pnictogen bond and ruled out the relevance of the N−H···Cl hydrogen interactions. We consider this conclusion debatable (section 2.2.5). 2.2.2.4. Tetrel Bonds. σ-Holes in group 14 elements occur on the extension of the covalent bonds, and the most electropositive σ-hole is expected on the extension of the bond to the most electron-withdrawing substituent.183 Thus, the tetrel bond (also called “carbon bond” when the group 14 atom is C) might form when an electron-rich atom approaches


Figure 11. CSD distributions of θ (angle between interacting dipoles) for (a−c) 0° < θ < 180°; (d−f) 8° < θ < 172°; and (g−i) chemically inequivalent nitriles, ketones/aldehydes, and C−F units. Reprinted from ref 199. Copyright 2004 American Chemical Society.
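The angle θ plotted in Figure 11 is the angle between two bond vectors, and a short calculation shows why bonds related by a crystallographic inversion center always give θ = 180°. The sketch below uses made-up Cartesian coordinates. The helper names are hypothetical, and the inversion center is placed at the origin for convenience.

```python
import math


def bond_vector(start, end):
    """Vector pointing from atom `start` to atom `end`."""
    return tuple(e - s for s, e in zip(start, end))


def invert(point, center=(0.0, 0.0, 0.0)):
    """Image of `point` under inversion through `center`."""
    return tuple(2.0 * c - p for p, c in zip(point, center))


def angle_between(u, v):
    """Angle in degrees between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard rounding
    return math.degrees(math.acos(cosine))


if __name__ == "__main__":
    # Hypothetical C and O positions (angstroms) for one C=O bond.
    c_atom, o_atom = (1.0, 0.5, 0.2), (1.9, 1.2, 0.6)
    original = bond_vector(c_atom, o_atom)
    image = bond_vector(invert(c_atom), invert(o_atom))
    print(angle_between(original, image))  # antiparallel, up to rounding
```

Inversion reverses the sign of every bond vector in the molecule at once, so the result does not depend on which bond is chosen.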

recent update on CSD carbonyl···carbonyl interactions confirmed this as the most common geometry.196) Similar results were later obtained for nitrile···nitrile interactions.197 The interactions of carbonyl groups coordinated to transition metals were found to be slightly different, in that only two arrangements were observed: 45% had a sheared antiparallel motif, and 55% had the orthogonal geometry.198 The difference was partly ascribed to steric factors. So polar bonds such as C=O frequently form antiparallel dipole−dipole interactions. However, an admirable CSD investigation by Lee et al. provided strong evidence that this is largely due to crystal-packing effects.199 They investigated the pairwise intermolecular interactions between three types of bonds (C−Z), viz. C=O···C=O (ketones and aldehydes), C≡N···C≡N, and Csp2−F···Csp2−F. For all three, the distributions of the angles between the C−Z vectors showed a huge peak at exactly 180°. This was due to crystal structures in which the molecules containing the interacting bonds were related by crystallographic inversion symmetry. In this situation, all the bonds in one molecule will be at 180° to their symmetry equivalents in the other. Consequently, the packing arrangement is stabilized by an ensemble of antiparallel dipole−dipole interactions, not just one. When the authors constructed distributions based only on interacting bonds that were unrelated by crystallographic symmetry or pseudosymmetry, the peaks at or about 180° disappeared

(Figure 11). It therefore appears that antiparallel arrangements of polar bonds in small-molecule crystal structures are largely due to a synergistic effect produced by inversion symmetry. There is an important corollary: proteins are chiral, so the favorable disposition of bond dipoles around inversion centers is not something that can influence protein−ligand binding. In this respect, therefore, the intermolecular interactions in the CSD are a biased guide to what is likely to occur in vivo. It is therefore interesting that Paulini et al. highlighted the orthogonal arrangement of bond dipoles as an important stabilizing interaction in protein−ligand binding.200 They referred to orthogonal C−F···C=O and similar contacts (H2O···C=O, C−F···NO2, etc.) as “orthogonal multipolar interactions”. Extensive searches of both the CSD and PDB found a great many examples. A superposition of the shortest C−F···C=O contacts in the CSD (2.77 Å < F···C < 3.09 Å) suggested that the fluorine atom has a pronounced preference to approach along or close to the pseudotrigonal axis of the carbonyl group, i.e. directly above the carbonyl carbon. A later CSD survey looked at C−F and C−Cl interactions with either atom of the carbonyl group.201 While fluorine was not found to show any strong orientational preferences (longer contacts were included than in the superposition mentioned above), the chlorine distribution had two distinct clusters, one corresponding to Cl···O halogen bonds and one to Cl···C=O orthogonal multipolar interactions. In the latter contacts, the


Figure 12. “π-Hole” interactions in the CSD. Reprinted from ref 208. Copyright 2014 American Chemical Society.

due to electrostatic attractions between the electron-rich atom and the carbonyl-carbon π-hole. The significance of orbital involvement and electron delocalization is thereby downplayed. There have been several recent CSD analyses exploring this theme. Bauzá et al. found many contacts between nitro nitrogen and Lewis bases and anions (e.g., BF4−, Cl−, and PF6−; Figure 12).208 They established that π-hole interactions to C−NO2 and nitrate esters both show strong directional preferences.209,210 Two papers reported that even nitrate ions can accept π-hole interactions provided the negative charge is reduced by delocalization, e.g. by forming strong H-bonds or coordinating to a metal ion.211,212 π-Hole interactions between nitrate ions have been proposed, e.g. in CSD structure UHECAL. Other studies into π-hole interactions have been performed on nitro,213 trigonal boron (“triel bonds”),214 and xenon derivatives.215 2.2.4. Aromatic Interactions. 2.2.4.1. Stacking of Aromatic Rings. CSD investigations into ring stacking are numerous and often supplemented by theoretical calculations. Their focus is usually on the types of systems that stack and the degree of offset that occurs. The latter is measured by the lateral displacement between ring centroids, so rings exactly face-to-face have an offset of zero. Studies into the stacking of chemically identical aromatic ring systems were performed by Główka and later by Choudhury and Chitra (C&C).216,217 Główka observed that stacking is common, provided bulky substituents are absent, and stacked rings are usually related by crystallographic symmetry. On the other hand, C&C found that stacking of unsubstituted hydrocarbon aromatics is important only for molecules with more than three rings. They also found that substitution by electron-withdrawing substituents enhances the likelihood of stacking. Both studies found that exact face-to-face stacking is rare, from which it was deduced to be unfavorable. 
Nitrogen heterocycles are more likely to stack than arenes. Further, their propensity to stack increases with the number of nitrogen atoms and is also enhanced if any ring nitrogen accepts a H-bond. This last conclusion was supported by later studies, which found that H-bonded pyridine rings have a clear preference for offsets of 1.25−1.75 Å. When the rings are not H-bonded, the offset is very variable and the separation between the ring planes increases.218,219
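The offset used in these stacking studies, i.e. the lateral displacement between ring centroids, can be obtained directly from atomic coordinates. The following is a minimal sketch rather than the procedure of any cited study. It assumes that each ring is supplied as Cartesian coordinates in Å, listed in ring order, and that the rings are close to planar. The function names (`centroid`, `ring_normal`, `stacking_geometry`, `hexagon`) and the idealized test rings are illustrative assumptions.

```python
import math

def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def ring_normal(points):
    """Unit normal of a near-planar ring (Newell's method); atoms must be in ring order."""
    nx = ny = nz = 0.0
    for i, (x1, y1, z1) in enumerate(points):
        x2, y2, z2 = points[(i + 1) % len(points)]
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)

def stacking_geometry(ring_a, ring_b):
    """Return (interplanar separation, lateral offset) of ring_b relative to ring_a, in Å."""
    ca, cb = centroid(ring_a), centroid(ring_b)
    normal = ring_normal(ring_a)
    d = [cb[i] - ca[i] for i in range(3)]
    # Component of the centroid-centroid vector along the normal of ring_a.
    separation = abs(sum(d[i] * normal[i] for i in range(3)))
    # The remaining in-plane component is the offset (zero for exact face-to-face stacking).
    centroid_distance_sq = sum(x * x for x in d)
    offset = math.sqrt(max(centroid_distance_sq - separation ** 2, 0.0))
    return separation, offset

def hexagon(cx, cy, cz, radius=1.39):
    """Idealized planar six-membered ring parallel to the xy plane (illustrative only)."""
    return [
        (cx + radius * math.cos(math.radians(60 * k)),
         cy + radius * math.sin(math.radians(60 * k)),
         cz)
        for k in range(6)
    ]

# Hypothetical pair of parallel rings: 3.4 Å apart and displaced by 1.5 Å.
ring_a = hexagon(0.0, 0.0, 0.0)
ring_b = hexagon(1.5, 0.0, 3.4)
separation, offset = stacking_geometry(ring_a, ring_b)
print(f"separation = {separation:.2f} Å, offset = {offset:.2f} Å")
```

Measuring the offset relative to the plane of one ring is a design choice; for rings that are not parallel, the result depends on which ring is taken as the reference.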

angle between the C−Cl vector and the normal to the carbonyl plane had a marked tendency to be close to 90°. This was ascribed to the anisotropic distribution of electron density around chlorine; approach directions that would expose the chlorine σ-hole to the carbonyl carbon are disfavored. Kamer et al. argued that functional groups interacting favorably with a carbonyl group need not have a dipole.202 This they demonstrated by finding many examples in the CSD of X−···C=O contacts (X = halogen) with X−···C distances less than the sum of vdw radii. Halide, of course, is a monopole. The X−···C=O angles fell mainly on the Bürgi−Dunitz trajectory, the pathway that models the approach of a nucleophile undergoing addition to a carbonyl group,203 and Kamer et al. concluded that this type of contact is best viewed as an n → π* interaction, i.e. involving charge transfer. Others share this belief.204,205 It is supported by a repeatedly observed red shift in the stretching frequencies of carbonyl groups forming this type of contact. Carbonyl−carbonyl contacts in metal coordination complexes were revisited very recently.206 In CSD entries containing at least one terminal M−C≡O group (M = metal) and one nonmetal carbonyl, a close contact (less than the sum of vdw radii) between the oxygen atom of the former and the carbonyl carbon of the latter was found in only 3.5% of the structures. The contacts did not show marked directionality. In contrast, the reverse arrangement was found in 22.6% of structures and had clear geometric preferences, with O···C≡O angles of 90−100°. It was argued that angles of around 100° should maximize the overlap between donor lone pair and the acceptor π* antibonding orbital in transition metal complexes. It was also found that M−C≡O angles were less linear if the O···C distance was particularly short, which the author presented as evidence of orbital involvement in the interaction. 
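The two geometric criteria used in these surveys, a contact distance below the sum of vdw radii and an approach angle measured at the carbonyl carbon, are simple to evaluate. The sketch below is illustrative only. The coordinates are invented, the vdw sum of 3.22 Å assumes Bondi-type radii for C and O (the cited studies may have used other values), and the names `angle_deg` and `describe_contact` are hypothetical.

```python
import math

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))

def describe_contact(donor, carbonyl_c, carbonyl_o, vdw_sum, angle_window):
    """Summarize a donor···C=O contact: distance, approach angle, and two flags."""
    distance = math.dist(donor, carbonyl_c)
    approach = angle_deg(donor, carbonyl_c, carbonyl_o)
    low, high = angle_window
    return {
        "distance": distance,
        "angle": approach,
        "short": distance < vdw_sum,            # closer than the sum of vdw radii
        "in_window": low <= approach <= high,   # within the chosen angular range
    }

# Invented geometry: carbonyl C at the origin, carbonyl O along x,
# donor oxygen 3.0 Å from the carbon at an O···C=O angle of 95°.
carbonyl_c = (0.0, 0.0, 0.0)
carbonyl_o = (1.21, 0.0, 0.0)
theta = math.radians(95.0)
donor_o = (3.0 * math.cos(theta), 0.0, 3.0 * math.sin(theta))

result = describe_contact(donor_o, carbonyl_c, carbonyl_o,
                          vdw_sum=3.22,              # assumed C + O radii, in Å
                          angle_window=(90.0, 100.0))
print(result)
```

The 90−100° window is the range quoted above for contacts to metal-bound carbonyl groups; other windows can be passed for other contact types.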
Another recent theoretical, CSD, and PDB study focused on the “reciprocal carbonyl−carbonyl interaction”, in which it is hypothesized that n → π* delocalization occurs in one direction, with π → π* back-donation in the other.207 The orbital-interaction model competes with the electrostatic interpretation. This school of thought says that an atom such as carbonyl carbon has “π-holes”, small patches of positive electrostatic potential located in the directions of the π* orbital. Orthogonal interactions such as C=O···C=O and C−F···C=O are thus viewed as “π-hole interactions” and primarily


around aromatic rings.240 O−H···π H-bonds were found to be rare, and parallel alignments were quite common (about 16.0% of the data set); unsurprisingly, C−H···O H-bonds were the most frequent interaction. 2.2.4.4. Interactions between Aromatic Rings and Halogens. Three CSD surveys all showed that intermolecular interactions between aromatic rings and covalently bonded halogen atoms most commonly form near to the ring edge.201,241,242 They are presumably attractive C−X···H−C contacts. However, the studies also found a significant number of halogen atoms positioned above the ring plane. The tendency for this position appeared stronger when the distribution was corrected for geometrical factors and was particularly pronounced for the heavier halogens.242 The authors of another study noted that X···π interactions can significantly enhance protein−ligand binding.243 Some of the X···π contacts that they found in the CSD were part of larger stacking interactions between offset aromatic moieties, so the C−X bond was roughly parallel with the aromatic ring plane, but many more had the C−X bond tilted by 60−90°. They concluded that the stability of these contacts can primarily be attributed to dispersion. However, a halogen atom (excluding F) in this orientation exposes its σ-hole to the π-cloud, so the interaction might also be interpreted as a halogen bond.242 2.2.4.5. Anion−π and Lone Pair−π Interactions. The idea of an attractive interaction between an anion or lone pair and a π-system seems at first sight nonsensical; they are both electronegative. The trick is that it occurs only when the aromatic ring is substituted by strongly electron-withdrawing substituents. Interest in the interaction was sparked by Alkorta et al., who remarked on the tendency of C−F groups to point toward the center of perfluoroaromatic rings.244 Quiñonero et al. 
followed this up with CSD surveys that found many apparent interactions between electronegative atoms, many of them anionic (e.g., F in BF4−), and pentafluorophenyl and 1,3,5-trinitrobenzene.245,246 However, a series of arguments then ensued. Hay and Bryantsev identified four possible interactions between anions and arenes: C−H···anion H-bonding; anion−π; strongly covalent σ interaction (Meisenheimer complex); and weakly covalent σ interaction (incipient Meisenheimer). Of these, they found H-bonds to be by far the most common in the CSD. Strong σ complexes were held to be most likely for nucleophilic anions such as F− and RO−, with anion−π interactions favoring large, charge-diffuse anions, e.g., PF6− and ClO4−. However, anion−π contacts to charge-neutral arenes were rare.247 Ironically, Mooibroek et al. reached the opposite conclusion in exactly the same year, writing “a careful look in the Cambridge Structure Database reveals that many of these [anion−π] interactions are not uncommon but in fact have been overlooked in the past”.248 Hay and Custelcean (H&C) responded, arguing again that the interaction is uncommon for charge-neutral arenes and suggesting that Mooibroek et al. had used search criteria that were too loose.249 Au contraire, replied Estarellas et al.; the vdw radius of an anion is larger than that of the neutral atom and the appropriate distance criterion is not the sum of standard vdw radii but 0.8 Å greater.250 The theoreticians Wheeler and Houk had also intervened, arguing that it was incorrect to regard the interaction as an electrostatic attraction between the anion and the electrostatic potential above the arene ring, the latter rendered electropositive by electron-withdrawing substituents. High-level calculations instead ascribed stabilization to direct interactions

Geronimo et al. found that positively charged nitrogen heterocycles (pyridinium and imidazolium) form offset, antiparallel stacking interactions much more often than edge-to-face contacts. The close proximity of the anionic counterions is very important in stabilizing the stacking arrangement.220 Aromatic rings in sandwich and half-sandwich compounds can also form offset stacking interactions with the rings of neighboring molecules.221 Janiak showed that stacking between metal-coordinated nitrogen heterocycles is common but invariably offset, resulting in less than 30% overlap of the stacked ring systems.222 The propensity of quinoline moieties to stack was confirmed by a later CSD survey; about 70% were stacked, almost always in offset fashion.223 Stacking is not limited to aromatic rings. There have been several papers on the stacking (with each other or with aromatic rings) of delocalized chelate rings, rings formed by intramolecular H-bonds, or other nonaromatic systems such as neutral tetrathiafulvalene.224−231 Some of these interactions can be face-to-face, or nearly so. While the CSD provides useful information on ring stacking, it must be acknowledged that the most profound insights have come from ab initio calculations. Of particular importance is the work of Wheeler, Houk, and their collaborators.232−234 Their calculations indicate that the effect of substituents or heteroatoms on the interaction energy of stacked aromatic rings is primarily determined not by the π−π interaction but by the direct, local interactions between the substituents/heteroatoms and adjacent atoms in the other ring. 2.2.4.2. Edge-to-Face and C−H···π Interactions. It is well-known that edge-to-face dispositions of aromatic rings are very common and usually stabilized by C−H···π interactions. Such interactions also occur frequently between aliphatic chains or rings and aromatic rings. 
They have been studied in great depth by Nishio and his co-workers, making extensive use of the CSD.235,236 Most CSD work in this area is quite old, but a 2012 study was performed into the distribution of D−H groups (D = C, N, O) around phenyl rings. It was concluded that OH groups are preferentially located around the ring edge, forming C−H···O H-bonds, but N−H and C−H tend to lie above the ring, i.e., N−H···π and C−H···π.237 Escudero et al. performed a combined theoretical and CSD investigation into T-shaped pyridine···phenyl interactions, where the para C−H of the pyridine moiety points toward the phenyl π-cloud.238 They concluded that the environment of the nitrogen affects the propensity for a T-shaped geometry. Thus, in all examples of the interaction in the CSD, the pyridine nitrogen was either protonated, alkylated, coordinated to a metal, or H-bonded; it was never “naked”. 2.2.4.3. Interactions between Aromatic Rings and Water. Two interesting papers focused on interactions between aromatic rings and water. In the first, CSD evidence showed that water molecules are frequently oriented with one or both O−H bonds parallel to the aromatic ring plane.239 The water is commonly offset so that it lies above a C−H bond rather than the ring itself, perhaps enabling an attractive bond−dipole interaction. Of course, the placement of water hydrogen atoms is sometimes unreliable in X-ray structures, and in any case, the positions of the protons will be determined primarily by the H-bonds they form. Nevertheless, the stability of the parallel alignment geometry was supported by ab initio calculations. The second paper looked more broadly at water molecules


molecules. However, although strongly advocated by some, e.g., Lecomte et al.,262 the soundness of this approach has been called into question by several authors.263−267 Alhameedi et al. recently advocated the use of Roby−Gould bond indices as an alternative to AIM, demonstrating it on several CSD structures.268 If an AIM analysis of intermolecular contacts is at variance with chemical intuition, we are inclined to view the result with circumspection. In any case, analysis of molecular packing in terms of atom−atom interactions is often helpful, but we should never forget that it is molecules and ions as a whole that interact. A 2017 article by Mackenzie et al. offers a refreshing and iconoclastic viewpoint.269 They question the extent to which “noncanonical” interactions (e.g., σ-hole interactions) enhance our understanding of the relationships between molecular structure, crystal structure, and physical properties. Then they ask “whether we are converging on the requisite intimate, and ultimately useful, understanding of why molecules and ions are arranged in crystals as observed, or merely cataloguing an increasing number of examples of relatively weak intermolecular “interactions” while ignoring their common origins?” (our italics). It should make us all think. We are not interaction naysayers: by all means, let new interactions be found if they should, but let caution be the watchword.
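The enrichment factors mentioned in this section compare how often a contact type occurs with how often it would be expected at random. The sketch below shows one simple way to form such a ratio. It rests on an assumption that is ours, not necessarily that of the cited papers: the random expectation for a contact between atom types X and Y is taken to be the product of their overall shares of all contacts (doubled when X and Y differ). The function name `enrichment_ratios` and the input fractions are hypothetical.

```python
def enrichment_ratios(contacts):
    """Observed/expected ratio for each contact type.

    contacts maps a pair of atom types, e.g. ("H", "O"), to the observed
    fraction (or count) of contacts of that type.
    """
    total = sum(contacts.values())

    # Normalize to fractions and merge (X, Y) with (Y, X).
    observed = {}
    for (x, y), value in contacts.items():
        key = tuple(sorted((x, y)))
        observed[key] = observed.get(key, 0.0) + value / total

    # Share of each atom type: like contacts count fully, unlike contacts by half.
    share = {}
    for (x, y), fraction in observed.items():
        if x == y:
            share[x] = share.get(x, 0.0) + fraction
        else:
            share[x] = share.get(x, 0.0) + fraction / 2
            share[y] = share.get(y, 0.0) + fraction / 2

    # Expected fraction under random pairing of the shares.
    ratios = {}
    for (x, y), fraction in observed.items():
        expected = share[x] * share[y] * (1 if x == y else 2)
        if expected > 0:
            ratios[(x, y)] = fraction / expected
    return ratios

# Hypothetical contact fractions for a structure containing only H and O contacts.
example = {("H", "H"): 0.5, ("H", "O"): 0.4, ("O", "O"): 0.1}
ratios = enrichment_ratios(example)
for pair, value in sorted(ratios.items()):
    print(pair, round(value, 3))
```

A ratio above 1 indicates a contact that is over-represented relative to this random model, and a ratio below 1 one that is avoided. The conclusion depends on the random model chosen, which is one reason the text counsels caution.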

between the anion and dipoles introduced by the ring substituents.251 Frontera et al. acknowledged that this might be right but dismissed it as a semantic issue, the important practical point being that the interaction was cohesive.252 They, too, criticized the search criteria of H&C and further argued that an observed tendency for anions to lie away from the ring center was explicable on electrostatic grounds. They presented several 3D plots of the CSD distributions of anions around aromatic rings, some more convincing than others. Another paper argued that previous publications questioning the anion···π interaction were misleading because they included intramolecular contacts in their CSD distributions.253 The credentials of the anion−π interaction seem, on balance, to be accepted, at least in some circumstances. Other relevant CSD studies include lone-pair interactions of group 13 and 15 elements with aromatic rings254,255 and a different slant on lone pair interactions with aromatic rings, ascribing them to n → π* orbital mixing.204 2.2.4.6. Cation−π Interactions. Historically, our understanding of the cation−π interaction owes more to theoretical calculations and the PDB than to the CSD. The dearth of CSD studies continued into recent years, but a couple of papers can be mentioned. Abraham et al. investigated whether an aromatic ring can form cation−π interactions to alkali metal cations on both sides of the ring. They searched the CSD but found very few examples, only two of which they interpreted as genuine 2:1 cation···π complexes.256 Estarellas et al. investigated whether an aromatic ring can form a cation−π interaction on one side and a lone pair-π on the other.257 Theoretical calculations encouraged the idea and a search of the CSD found “several” possible examples. Two were illustrated, neither very convincing because, in each, the lone pair−π contact was intramolecular and therefore possibly forced. 2.2.5. All That Glisters Is Not Gold. 
We end this section with some words of caution. Just because a particular type of intermolecular contact is seen in a few crystal structures, it does not mean that it is significant. Every atom must be somewhere, and it is unwise to assume that all contacts in a crystal structure are stabilizing.258 Hence, there is a danger in cherry-picking structures from the CSD that contain contacts of a type whose importance is being proselytized. Just as important are all the structures in which the contact could occur but does not. One of the present authors and, independently, Jelsch et al. have attempted to address this problem by determining the enrichment factors of various interactions, i.e. the extent to which they occur relative to what would be expected at random.126,169,259−261 However, much remains to be done. It may sometimes be reasonable to suggest that a particular type of interaction is weak but nevertheless useful as a synthon in crystal engineering, i.e. robustly influences how molecules pack. However, the only convincing proof comes when it is successfully used for this purpose and on several occasions (and even then we must remember that researchers are more likely to publish their successes than their failures). The CSD can tell us much about the geometries of interactions, the frequencies with which they occur, and factors that influence their likelihood of occurrence. However, it cannot give direct information about interaction energies and electronic features such as σ-holes. Therefore, surveys of the CSD are often coupled with theoretical calculations. An extremely popular option is to use the AIM method to determine whether BCPs occur between atoms in different

2.3. The Systematics of Crystalline Assemblies

The CSD has been heavily used to investigate the systematics of molecular packing in crystal structures. This is not merely of academic interest: a deep understanding of this subject will facilitate the design of useful crystalline materials. Topics of interest include structure-determining interaction motifs; the architectures of crystal structures; their symmetries; and the complicating factors of polymorphism, cocrystals, and solvates. 2.3.1. Motifs and Synthons. The concept of crystal-structure interaction motifs has been around for a long time. Perhaps the best known (though by no means the most reliable) is the carboxylic acid dimer motif, wherein two −CO2H groups approach each other to form a pair of O−H···O=C H-bonds, thereby creating a ring. This was one of several carboxylic acid motifs mentioned in the pioneering 1976 paper of Leiserowitz (which, however, did not use the CSD).270 Another famous motif, this time involving hydrophobic groups, is the “phenyl embrace” between −XPh3 groups.271 If a motif is sufficiently predictable, i.e. its presence in the crystal structures of certain molecules (or sets of molecules or ions) can be anticipated with reasonable confidence, it becomes a synthon, something that can be used in crystal engineering. Desiraju expressed it thus: “Supramolecular synthons are structural units within supermolecules which can be formed and/or assembled by known or conceivable synthetic operations involving intermolecular interactions”.272 Synthons tend to be kinetically favored, so they occur easily during crystal nucleation and typically involve strong, directional intermolecular interactions.273,274 The role of the CSD in identifying synthons and exploring their robustness and hierarchies goes back about three decades. It is discussed in an excellent and very recent review on the design of molecular crystals,275 so we will confine our discussion to a few illustrative examples. 
Synthons involving ionized or un-ionized carboxylic acid groups have been particularly well studied. In addition to the well-known cyclic dimer, over a dozen other carboxyl−carboxyl(ate) motifs have been observed, including eight


discerned and labeled (e.g., “Borromean”, “Hopf”).291 Apart from their usefulness, the networks found by this method can be beautiful and often of mind-boggling intricacy. An example is shown in Figure 13. Other workers have used graph set

different types of catemers.276 The number of motifs is increased by the ability of the acid proton to adopt two different positions, syn or anti to the carbonyl group. The situation is further complicated when competing H-bonding groups are present. Shattock et al. found that only 34% of carboxylic-acid-containing CSD entries had the acid group interacting with itself (i.e., forming a homosynthon).277 The corresponding figure for alcohols was even lower, at 26%. In the remaining entries, the acid or alcohol groups interacted with other H-bonding species, such as aromatic nitrogen, chloride, or −CONH2. When −CO2H and pyridine nitrogen were present in the same structure, but no other H-bonding groups, the CO2H···N(pyridine) heterosynthon occurred 98% of the time, suggesting that this is an excellent synthon. The corresponding figure for OH···N(pyridine) was also high, at 78%. Other studies looked at the synthons that form in structures containing both carboxyl(ate) and alcohol or phenol groups. Heterosynthons between carboxylate and phenolic hydroxyl occur very frequently, even in structures with competing H-bonding groups.278 On the other hand, when carboxylic acid, phenol, and chloride ions are all present, there is a strong tendency for phenol···chloride H-bonds, despite the fact that phenol is less acidic than carboxylic acid.279 Similarly, Aakeröy et al. 
found −OH to be a more effective donor than −CO2H in hydroxybenzoic acid structures.280 Another interesting study looked at the synthons between carboxylic acids and carboxamides.281 The CSD has been used to investigate potential synthons that do not involve strong H-bonds but are instead based on halogen bonds, weak H-bonds (e.g., N−H···S and H-bonds involving C−H donors), anion···π interactions, or other weak interactions.151,282−285 Siddiqui and Tiekink searched the CSD for synthons based on interactions between C−H groups and metal atoms.286 As we noted in section 2.2.1, however, concerns have been expressed about the reliability of some synthons involving C−H groups.127 Sander et al. introduced the concept of “masked synthons”, where solvent molecules intercede between the interacting groups.287 How reliable these more esoteric types of synthons are is open to question. Of course, even if a synthon involves strong H-bonds and is known to be robust, it is only a small part of the totality of intermolecular interactions in a crystal structure; “secondary synthons” will also play a part in determining the packing arrangement.106 2.3.2. Crystal-Structure Architectures. We just mentioned the totality of intermolecular interactions in a crystal structure. This can be enormously complex and difficult to comprehend. So too can the intramolecular architectures of some of the huge, polymeric metal−organic compounds in the CSD. Insight can be obtained, however, by extracting and classifying the topologies of the underlying networks in these structures. The best known programs for performing this task are probably TOPOS and its successor, ToposPro, both developed by Blatov, Proserpio, and their collaborators.288−291 Underlying nets are extracted by simplifying structures. 
For example, a functional group or metal cluster is represented as a single node; terminal or isolated nodes are deleted; μ2-ligands are represented as simple edges; and so on.289 The resulting nets may be 1D, 2D, or 3D. Sets of crystal structures can be found that contain nets of the same topological type, hence revealing architectural similarities that would otherwise be extremely difficult to detect. Structures may contain two or more entangled nets, and the mode of entanglement can be

Figure 13. TOPOS representation of the entangled H-bonded layers in CSD entry FUYBUX. Reprinted from ref 291. Copyright 2014 American Chemical Society.

analysis to explore chiral, helical H-bonded supermolecules.292 Network analysis has considerable potential for describing metal−organic frameworks (section 4.2.8), and IUPAC set up a task force with a CCDC representative to produce terminology guidelines for node assignment methods and the networks themselves.293 A radically different classification approach was tried recently by Motherwell.294 It involved calculating the interaction energy between a reference molecule in a structure and each of the 16 neighboring molecules with which it interacted most strongly (a “molecular coordination sphere”). The packing arrangement was then classified by (a) the profile of the 16 interaction energies, (b) whether the most tightly bound neighbors fell into one or more planes, and (c) whether the 2D projection of the neighbors down a unit cell axis showed symmetry. When tested on a sample of 1000 CSD structures, a surprising number were found that had identical packing architectures when judged on these criteria. Other methods for characterizing and comparing crystal-packing arrangements include those of Galek, who used moments-of-inertia tensors;295 Spackman et al., who used Hirshfeld surfaces onto which curvedness or electrostatic potential was mapped to investigate packing in P4₁2₁2 and P4₃2₁2 crystal structures from the CSD;296 and Collins et al., who based their comparison on the correlation of fingerprint plots derived from Hirshfeld surfaces coupled with cluster analysis.297 The latter paper also lists other crystal-structure classification techniques. Methods developed in an industrial context are discussed in section 4.1.1. While network analysis of H-bonds can give insights into the causes of crystal packing arrangements, it is recognized that the most significant factor is the need for close packing of molecules, the shapes of which are therefore important. 
In a 2010 analysis, molecular shape was characterized by (a) the rectangular box whose edges were parallel to the molecule’s moments of inertia and which just enclosed the vdw envelope and (b) parameters describing unoccupied regions within the box.298 This was then used to find sets of CSD structures


graphic inversion centers. Consequently, the probability of finding these motifs in the necessarily noncentrosymmetric structures of chiral substances is much lower than in those of achiral molecules or racemic crystals.306 In the same year, Dey and Pidcock similarly found that carboxylic acid and cis-amide dimers prefer centrosymmetric groups, irrespective of the chirality of the molecule (i.e., by the formation of racemic crystals if necessary).307 In addition, structures containing an amide H-bonded chain motif or an O−H···O−H interaction were discovered to crystallize in Sohncke space groups more often than expected, even when the molecule was achiral. (Sohncke space groups contain no inversion centers, mirror or glide planes, or rotoinversion axes.) Some years later, the symmetry preferences of a broader range of intermolecular interactions were determined and tabulated.308 This work showed that the shortest interactions in a structure (relative to vdw radii) most often involve (a) molecules related by inversion in centrosymmetric space groups; (b) glide-planes in noncentrosymmetric, non-Sohncke groups; and (c) 2₁ screw axes in Sohncke groups. Chiral substances can crystallize only in Sohncke groups. Nevertheless, a substantial minority (about 20%) of organic structures in these groups were found by Fábián and Brock to be meso or otherwise achiral.309 Pidcock found that achiral molecules are more likely to crystallize in Sohncke groups if they are rigid.310 The main focus of the Fábián and Brock study was on kryptoracemates, which are racemic crystal structures in Sohncke groups (i.e., the enantiomers in such structures are not related by crystallographic symmetry). Kryptoracemates are rare; only 181 were found in the organic part of the CSD. In most cases, the conformations of the enantiomers were very similar, and there was often an approximate symmetry element (inversion center or glide plane) between them. 
The CSD was used in recent investigations into Wallach’s rule, which states that racemic crystals are denser than their homochiral counterparts, with the added implication that they are also more stable. It was concluded in 2012 that the rule does not apply to amino acid structures; indeed, the largest density difference found in this study, for glutamine, was in the opposite direction.311 Conversely, subsequent analysis of 279 racemic/homochiral pairs found the racemic crystals to be, on the whole, more stable and more dense. However, energetic factors were concluded to be insufficient to account for the predominant formation of racemic crystals from racemic solutions.312 This was ascribed instead to kinetic factors and the statistical predominance of racemic over enantiopure aggregates during crystal nucleation. A more recent study used DFT-D to compute the thermodynamic stability of racemic/homochiral pairs.313 The homochiral phase was calculated to be more stable in 19% of cases, a surprisingly high value. Furthermore, this was held to be a lower bound, because the data set was limited to cases where a stable racemic phase existed. It was concluded that spontaneous resolution by preferential crystallization of enantiomers may be more applicable than previously assumed, provided, however, that this is driven by thermodynamic factors. 2.3.4. Z′. The symbol Z′ denotes the number of formula units in the asymmetric unit of a crystal structure. For example, if the crystal structure of a salt of formula A2+·2(B−) has Z′ = 2, it will contain two symmetry-independent A2+ ions and four symmetry-independent B− ions. Structures with Z′ > 1 are in a minority and have attracted considerable research in the past

containing similarly shaped molecules. For each set, the packing arrangements were characterized and clustered. A broad correlation was found between the packing arrangements and whether a molecule was best described as a rod, disk, or sphere. It is precisely because close packing is so important that a study by Kaźmierczak and Katrusiak is so intriguing.299 They found the ∼450 organic structures in the CSD that contain no intermolecular contact shorter than the sum of vdw radii, so-called “loose” crystals. The loosest of all was bis(trichlorosilyl)acetylene, in which the shortest contact was longer than the sum of vdw radii by over 0.25 Å (Figure 14). Suggested reasons included low electrostatic forces between molecules and mismatches between the requirements of close packing and directional interactions.

Figure 14. Unit cell of bis(trichlorosilyl)acetylene (CSD entry WILWUJ), a structure in which the shortest intermolecular contact exceeds the sum of vdw radii by over 0.25 Å, making the packing exceptionally “loose”.

It has long been known that two molecules differing by only one terminal group sometimes form very similar crystal structures.275 The extent to which this is generally true was assessed by investigating some 125 000 of these “matched molecular pairs” in the CSD.300 Only about 4% of the pairs formed crystal structures similar enough to be deemed isostructural when their molecular coordination spheres were compared. The transformations with the highest degree of isostructurality were Cl → Br, Br → I, Br → CF3, and I → CF3. Some less obvious replacements, e.g. I → C≡CH, were also associated with a significant degree of structural similarity. That the rate was a mere 4% will come as no surprise to practicing crystallographers, who commonly learn that even small changes in molecular structure can result in greatly differing crystal structures. A fascinating example of this was published recently by Görbitz, who reviewed the structures of the 25 nonpolar side-chain (Ala, Val, Leu, Ile, Phe) dipeptides.301 While the change Val → Ile produces no major change, Leu → Ile leads to completely different structures. Unexpectedly, Leu, especially at the N-terminus, leads to incorporation of solvent, whereas Ile does not.

2.3.3. Symmetry and Chirality. The space group of a crystal structure, in other words, its symmetry, is of fundamental importance and has been a central focus of some notable CSD analyses.302−304 Of particular interest is the correlation between space group and intermolecular interactions. A classic early example was the recognition that monoalcohols often pack in high-symmetry space groups (or with Z′ > 1, section 2.3.4) in order to facilitate formation of O−H···O−H H-bonds, which can otherwise be difficult to achieve for steric reasons.305 More recently, Eppel and Bernstein looked at 44 different H-bond ring motifs and found they all had a strong tendency to form on crystallo-

DOI: 10.1021/acs.chemrev.9b00155 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

centrosymmetric molecule is positioned on a crystallographic inversion center. In either case, only half of the molecule is deemed to be symmetry-independent. A paper was published on the reasons why molecular inversion symmetry is occasionally wasted, i.e. centrosymmetric molecules are not positioned on crystallographic inversion centers. In essence, this is usually because of other packing features. For example, when centrosymmetric molecules are stacked, it is sometimes the midpoint of the stacked pair that lies on the inversion center, preventing each individual molecule from doing so (Figure 16).321
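The bookkeeping behind Z′ in section 2.3.4 can be illustrated with a short sketch. The function name and species labels below are hypothetical; the arithmetic simply multiplies the per-formula-unit count of each species by Z′, which also covers fractional values such as Z′ = 1/2 for a centrosymmetric molecule sitting on an inversion center.

```python
from fractions import Fraction


def independent_species(formula_unit, z_prime):
    """Number of symmetry-independent copies of each species for a given Z'.

    formula_unit maps a species label to its count per formula unit.
    z_prime may be fractional, e.g. Fraction(1, 2) for a molecule on an
    inversion centre, where only half the molecule is symmetry-independent.
    """
    return {species: count * Fraction(z_prime)
            for species, count in formula_unit.items()}


# The salt example from the text: one dication and two anions per formula unit.
salt = {"A(2+)": 1, "B(-)": 2}
salt_counts = independent_species(salt, 2)

# A centrosymmetric molecule located on a crystallographic inversion centre.
half_molecule = independent_species({"molecule": 1}, Fraction(1, 2))
```

Using `Fraction` rather than floating-point numbers keeps values such as 1/2 or 1/4 exact.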

decade or so, much of it using the CSD. Little needs to be said here because a major review of the subject was published in 2015.314 However, some key findings are worth emphasizing. First, molecules crystallizing with Z′ > 1 tend, on average, to be smaller, less flexible, and perhaps more awkwardly shaped than those with Z′ = 1.315,316 Second, molecules that crystallize with Z′ > 1 have an unusually high tendency to form cocrystals with other molecules.317 Finally, “supramolecular synthon frustration” is an important causative factor of Z′ > 1.318 This is when symmetry-independent molecules form a motif that would normally occur across a symmetry element that, for some reason, is unavailable. For example, an enantiomerically pure sample of a carboxylic acid must crystallize in a Sohncke group, so crystallizing with Z′ = 2 allows the acid groups of the two independent molecules to form a cyclic H-bond dimer around a local pseudoinversion center. A later survey of interaction motifs in structures with Z′ > 1 provided further evidence that symmetry-independent pairs have a tendency to form interactions that are complementary to those that occur between the symmetry-dependent pairs.319 It was also found that the symmetry-independent pairs tend to be the more tightly packed. In 2016, some 284 organic CSD structures with unusually high Z′ values (>4) were extensively scrutinized by Brock (22 others were eliminated as being due to errors in space-group or unit-cell assignment).320 In about half of these structures, the symmetry-independent molecules were modulated (i.e., related by pseudotranslation, e.g. Figure 15), and in 70% they formed

Figure 16. PORPIN02, a structure containing stacked molecule pairs related by inversion symmetry, thereby preventing the porphyrin molecules from sitting on inversion centers. Reprinted with permission from ref 321. Copyright 2010 Royal Society of Chemistry.

2.3.5. Polymorphism. The CSD played a key role in two relatively recent discourses on polymorphism cowritten by Joel Bernstein, the great authority on the subject who sadly died while we were writing this Review.322,323 One was the particularly illuminating paper “Facts and Fictions about Polymorphism”. Among their many results, the authors found that neither molecular size nor flexibility correlates with a compound’s ability to adopt more than one crystal form (i.e., be polymorphic). On the other hand, chiral molecules are a little less likely to be polymorphic than achiral, and structures with H-bonding groups or Z′ > 1 show somewhat elevated levels of polymorphism. Polymorphism is commonly divided into four main types, viz. packing, synthon, conformational, and tautomeric, though they are not mutually exclusive and the concept of tautomeric polymorphism is somewhat controversial.103,324,325 Cruz-Cabeza and Bernstein opined that polymorphs should be termed conformational only if the conformations in the different forms belong to different gas-phase potential-energy minima.323 Under this definition, they estimated that 36% of CSD polymorphs were conformational. They also identified the types of rotatable bonds whose torsion angles were most likely to vary between polymorphs. The energy differences between conformational polymorphs can be as much as 10 kJ mol−1, compared with about 4−6 kJ mol−1 for other types of polymorphs.322 Galek et al. studied the H-bond variability between different polymorphic forms.326 They found that all of the H-bond types persist in 66% of polymorphic pairs and at least half in 83%. Persistent H-bonding occurs with roughly equal frequency in packing and conformational polymorphs. Among many other interesting results in this paper, the types of H-bonds most likely to persist were identified.

Figure 15. Example of a 6-fold pseudotranslation in NAHCOQ (P21/n, Z′ = 6). Symmetry-independent molecules are colored differently. Reproduced with permission of the International Union of Crystallography from ref 320. Copyright 2016 International Union of Crystallography.

aggregates (e.g., columns or layers) held together by strong, directional interactions such as H-bonds, often showing local pseudosymmetry. The conformations of the symmetry-independent molecules were usually similar to one another. A relatively large number of the high-Z′ structures were found to crystallize in Sohncke groups. Structures with Z′ < 1 commonly occur when a molecule sits on a crystallographic mirror plane or the midpoint of a


Figure 17. An example of the versatility of water in packing: motifs observed in hydrates of carboxylic acids with no additional H-bonding groups. Reprinted from ref 332. Copyright 2010 American Chemical Society.

included quasiracemates (e.g., R-2-bromo- and S-2-chloro derivatives) and certain diastereomer pairs (e.g., when a 1,1 switch of Me and H in one diastereomer would make it the enantiomer of the other). These represent systems for which fractional crystallization, that ubiquitous method of purification, failed. A large number of them were quasiracemates, suggesting that inversion symmetry is very favorable in crystal packing.

2.3.7. Hydrates and Solvates. Hydrated crystal structures are not uncommon. The proportion of organic structures in the CSD that contain water was found to be about 8% by Clarke et al. in 2010.332 However, a later study found that the percentages inferred from the CSD can be gross underestimates when compared with the results of experimental screening that includes deliberate exposure to humidity and slurrying in water.333 This is hardly surprising and similar to the position with polymorphs: the more effort that is expended, the more hydrates (or polymorphs) are likely to be found. What is very clear from the CSD is the great diversity of H-bonding roles that water molecules can play (e.g., Figure 17).332−337 Indeed, Clarke et al. wondered whether the promiscuity of water might make crystalline hydrates the nemesis of crystal engineering. Considerable efforts have been made to understand what causes incorporation of water when some molecules crystallize, but the matter is still unclear. One possible factor is the balance between H-bond donors and acceptors.
This may be relevant for particular types of molecules, but whether it is always important is uncertain.106,333,338 In a series of bile acids, incorporation of solvent molecules (including water) was ascribed to the packing problems of the unsolvated forms (low packing index, voids, and unsatisfied H-bonds), all of which were due to one particular −OH group.339 The relevance of awkwardly shaped molecules has been suggested.315 Structures containing solvent molecules other than water (e.g., dimethyl sulfoxide, chloroform, dichloromethane, and methanol) have been studied in recent years.340−343 Takieddin

There has been interest in the relative likelihood of different types of structures (cocrystal, salt, hydrates, etc.) to be polymorphic,327 but the reliability with which this can be inferred by analysis of the CSD is questionable. Crystal structure determinations are generally performed for reasons other than an interest in polymorphism, so the fact that the CSD may contain only one form for a substance means little. Research into polymorphism in an industrial context is covered in section 4.1.2.

2.3.6. Cocrystals. The intense industrial interest in designing cocrystals to solve formulation problems (section 4.1.3) has been accompanied by some interesting studies of their fundamental nature, several of which made use of the CSD. Taylor and Day applied periodic DFT to calculate the stabilities of 350 organic cocrystals from the CSD, which were compared to those of the corresponding single-compound (“coformer”) structures.328 This showed that neither density nor the presence of H-bonds or halogen bonds is necessarily a good guide to relative stability. The cocrystal structures were found to be more stable, on average by about 8 kJ mol−1, than those of their coformers, being less stable in under 5% of cases. Gavezzotti et al. similarly concluded that the lattice energy of a binary cocrystal structure is almost always more stabilizing than the sum of the lattice energies of the corresponding coformer crystal structures.329 In addition, the cocrystals are more likely to be in centrosymmetric space groups. The large majority of cocrystals contain H-bonds between the coformers, though we note that this may sometimes be a consequence of deliberate crystal engineering. Coformer sizes can vary greatly. Cocrystals that lack H-bonds usually contain stacked heteropairs, e.g. aromatic molecules with reverse polarizations such as CnHn and CnFn.329,330 There was a suggestion that kinetic effects may be important in the formation of this type of cocrystal.
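The comparison between a cocrystal and its separate coformer crystals amounts to a simple energy difference, sketched below. The function name, the sign convention (negative means the cocrystal is more stable), and the numerical values are illustrative assumptions rather than data or methods from the cited studies.

```python
def cocrystal_stabilization(e_cocrystal, coformer_energies, stoichiometry):
    """Energy of a cocrystal relative to its separate coformer crystals (kJ/mol).

    e_cocrystal: lattice energy per cocrystal formula unit.
    coformer_energies: lattice energy per molecule of each pure coformer crystal.
    stoichiometry: molecules of each coformer per cocrystal formula unit.
    A negative result means the cocrystal is the more stable arrangement.
    """
    reference = sum(n * e for n, e in zip(stoichiometry, coformer_energies))
    return e_cocrystal - reference


# Hypothetical lattice energies, chosen only to illustrate the sign convention.
delta_1to1 = cocrystal_stabilization(-208.0, [-95.0, -105.0], [1, 1])
delta_2to1 = cocrystal_stabilization(-290.0, [-95.0, -105.0], [2, 1])
```

With these made-up numbers the 1:1 cocrystal is favored over its coformers and the 2:1 cocrystal is not; real comparisons also require the energies to be computed consistently at the same level of theory.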
A 2011 paper focused on 270 cocrystals of isomers or near isomers, excluding structures in which heteropairs were self-evidently favored (e.g., acid−base complexes).331 The data set


that “proteins don’t strain ligands, protein crystallographers do”353). In the first of an outstanding series of perspective articles by Roche scientists, Brameld et al. discussed the value of the CSD for investigating conformational preferences.350 As part of their analysis, they examined the coverage of drug-relevant chemical space that the CSD and PDB offer. Having identified over 2000 molecular fragments occurring in compounds that have progressed to Phase I clinical trials, they determined that 37% occurred in an organic subset of the CSD. Not especially impressive, perhaps, but the coverage rose to 91% for the fragments that had occurred in at least five Phase I compounds. They concluded that, while not exhaustive, the coverage is sufficient to be useful, particularly in the chemical space most often used by medicinal chemists. The coverage provided by the PDB was far poorer (we note, however, that these figures are now 10 years out of date). They then gave a discourse on using CSD-derived torsion distributions to gain insights into the conformational properties of some of the fragments commonly seen in pharmaceuticals. Finally, they concluded that torsion information from the CSD can provide useful hints for designing molecules with particular shapes. A more recent review on conformational control in structure-based drug design concurred.354 Another article looked at intramolecular H-bonds, which can have a significant influence on conformations and physicochemical properties.28 The CSD was searched for pharmaceutically relevant H-bonded rings of size 5−8. The effects of some of these H-bonds on physical properties (e.g., log D355) were investigated by synthesis and testing of model compounds and found to be very variable. An important factor is whether the intramolecular H-bond is maintained in solution. 
The information acquired in the study should facilitate rational deployment of intramolecular H-bonding to influence conformations, solubilities, and membrane permeabilities.

Here are some illustrative applications of the CSD to drug and agrochemical discovery:

(a) Selection of Linking Groups. Safina et al. sought to modify a known benzoxazepin inhibitor of the kinase PI3Kδ.51 Their aim was to add a group such as t-butylpiperazine that would form a cation−π interaction with Trp760, which was expected to improve selectivity. It was unclear what linking group should be used to connect the piperazine moiety to the benzoxazepin scaffold. Searching the CSD determined that the torsional preferences of various possible linkers were very different (e.g., Figure 18). An assortment of molecules was synthesized, varying only the linking group. The resulting selectivity data were as expected from the CSD torsional preferences. Further optimization led to a potent and selective inhibitor. In another example, the CSD was used to identify tertiary amide as a conformational mimic of cyclohexyl-bound sulfonamide. Tellingly, the authors described molecular geometries in the CSD as “real” conformations.356 Wuitschik et al. discussed the use of oxetane to replace functionalities such as carbonyl and gem-dimethyl,357 thereby modifying physicochemical properties and conformations. When discussing the latter issue, they found by CSD analysis that the incorporation of an oxetane is very likely to cause the chain on which it is grafted to adopt a gauche conformation, whereas carbonyl favors the antiperiplanar geometry and gem-dimethyl favors all three staggered conformations about equally.

(b) Rationalizing Inactivity. A group of Roche scientists were working on tissue factor/factor VIIa (TF/F.VIIa) benzamide

et al. derived knowledge-based models to predict hydrate and solvate formation using 19 000 CSD structures.344 Molecules that formed solvates were compared with those that did not, using assorted descriptors, e.g., molecular size, branching, and H-bonding ability. The best models achieved about 80% prediction accuracy. Very recently, Xin et al. published a similar study using CSD-based machine learning and targeted particularly at drug-like molecules. They looked at nine different solvents and achieved up to 86% prediction accuracy of solvate formation with models based on 2D descriptors and random forest and support vector machine algorithms.345 These models have potential industrial relevance (section 4.1.3).

3. DESIGN OF BIOLOGICALLY-ACTIVE MOLECULES

Use of the CSD to aid drug and agrochemical discovery began in the 1980s. It initially attracted the interest of a new breed of drug designers who had state-of-the-art graphics terminals and wished to learn about molecular geometries and interactions. Later, it found a role as a provider of data needed for the development of novel drug-design software, such as LUDI346 (de novo inhibitor design) and DOCK347 (protein−ligand docking). Its use by both communities (drug designers and client applications) continues unabated. The database serves three basic purposes: it provides information about molecular shapes and about molecular recognition and it serves as a highly diverse 3D chemical database.

3.1. Molecular Shapes

The prediction of energetically accessible molecular geometries is of central importance in rational drug design. Modern force fields and quantum mechanical calculations are routinely used, but the CSD has the virtue of containing precise, experimental information, something that is always likely to appeal to hard-bitten medicinal chemists. Further, it provides information about molecular conformations in a condensed phase. It is generally believed that protein ligands tend to adopt rather extended conformations, allowing them to maximize interactions with the binding site.348,349 Similarly, favorable interactions such as H-bonds in CSD crystal structures tend to be intermolecular rather than intramolecular.323 In contrast, theoretical in vacuo energy calculations are biased toward folded conformations, which enable favorable intramolecular contacts to be made (section 2.1.4). Nevertheless, small-molecule crystals are not the same as hydrated protein−ligand complexes and there is long-standing uncertainty about the degree to which crystal-packing forces bias conformations. The possibility of bias was discussed in a previous CSD review and dismissed as a serious concern.8 It was concluded that crystal packing forces had been known to bias conformations, but only rarely. Since then, that view has been reinforced by further analyses.64,350 The only situation in which it is clear that CSD conformations can be unusually strained is when molecules are situated on inversion centers (section 2.1.4).
This is the case for the CSD entry DCTXAN, whose conformation could not be reproduced by the conformer-generator program OMEGA.351 Torsion distributions derived from protein ligands tend to have broader peaks than those based on CSD structures.58 This might indicate that strained conformations are sometimes tolerated when small molecules bind to proteins, but it could also be ascribed to the lower precision of protein structures and occasional gross errors in ligand geometries352 (it has been caustically suggested


(c) Ring-Substituent Geometry. The crystal structure of a fatty acid binding protein complexed with a quinoline inhibitor suggested that specificity could be improved by occupying a pocket near the quinoline 2-position. One possible substituent was N-linked piperidine, but the quinoline would need to be pseudoaxial on the piperidine ring. Searches of the CSD provided reassurance that this was feasible, encouraging synthesis of a potent and selective inhibitor.358

(d) Docking-Solution Validation. Crocacin A is a natural fungicide that binds to the same site (cytochrome bc1) as the commercially important strobilurins. It attracted interest because it was not cross-resistant to strobilurin-resistant strains.359 However, its poor photostability needed attention. Protein−ligand docking was used to fit crocacin A into the cytochrome bc1 binding site, but many possible solutions were produced because the molecule is highly flexible. CSD torsion-angle distributions, supplemented by force-field calculations, were used to identify the likely torsion angles of each of the 13 rotatable bonds, which enabled one of the docking solutions to be identified as the most likely to be correct. The chosen docking guided the design of active crocacin analogues, some having much improved photostability and one of which was shown crystallographically to bind in a very similar way to that of the chosen docking solution. In a similar vein, Tatum et al. used Mogul torsion distributions to filter out docking solutions with unusual torsion angles.360

(e) Intramolecular H-bonding. Furet et al. sought to find a new scaffold for tyrosine kinase inhibitors by replacing a heterocyclic ring in a known inhibitor by a pseudo ring formed

Figure 18. CSD-derived distributions showing the very different torsion-angle preferences of four different linking groups (torsion angles shown in red). Reprinted from ref 51. Copyright 2017 American Chemical Society.

inhibitors. They designed and synthesized two compounds that they thought should bind in the S1 and S3 pockets, and also to the S2 pocket specific to F.VIIa.350 Neither was active. The molecules contained the fragment C(sp2)−C(CH3)−CO−NH−, in which the alkyl group and amide nitrogen needed to be approximately coplanar for the anticipated binding to occur. CSD analyses showed that the geometry was very unlikely but would be favorable in the related fragment C(sp2)−C(O)−CO−NH−. This finding led to an 11 μM inhibitor, giving entry to a new class of inhibitors (Figure 19).

Figure 19. Compounds 1 and 2 were found to be inactive as inhibitors of TF/F.VIIa because their preferred conformations do not fit the active site. The mandelic acid analogues (4 and 5) have different conformational preferences and were found to bind. Reprinted from ref 350. Copyright 2008 American Chemical Society.


by an intramolecular H-bond.361 Before synthesizing their target compound, they searched the CSD to find out how likely it was that their planned H-bonded ring (in a pyrimidin-4-yl-urea grouping) would form. All seven of the CSD molecules containing this substructure formed the required H-bond. The target compound was synthesized and found to be active at the submicromolar level.

(f) Torsional Barriers. Loeffler et al. investigated the geometry of N,N′-disubstituted ureas, which are prominent in medicinal chemistry.362 CSD searches showed that they have a strong preference for the trans, trans conformation, along with a minor preference for cis, trans. The peaks in the torsion distributions are very tight, with no intermediate geometries ever observed. The authors therefore inferred that the barrier between the two conformers is likely to be high and quite possibly insurmountable in molecular dynamics simulations, an inference they subsequently proved. Accurate modeling therefore requires exploration of both conformers independently.

So far, we have focused exclusively on direct uses of the CSD. However, a great deal of work has been done in using the CSD to aid the development of novel drug-discovery software. At its simplest level, this merely involved using the CSD for parametrization or validation. For example, some parameters in the popular MMFF and OPLS force fields were chosen with reference to CSD data.363,364 More recently, the COMPASS II force field and a CHARMM-compatible lignin force field were validated against structures from the CSD,365,366 and CSD geometries were used to find limitations in OPLS3.367

Conformer generation is a common requirement in drug design, i.e. predicting as many as possible of the conformations that a molecule can feasibly adopt. Many programs for this purpose have been written in the past decade or so, and the CSD has played a very significant role.
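As a hedged illustration of the torsion-based reasoning in examples (d) and (f) above, the sketch below turns a set of observed torsion angles into a histogram and ranks candidate poses by how typical their torsions are. It is not the Mogul algorithm or any published scoring function: the bin width, the add-one smoothing, the function names, and the example angles are all assumptions made for illustration.

```python
import math
from collections import Counter


def torsion_histogram(angles, bin_width=10):
    """Bin torsion angles (degrees, -180..180) and return smoothed bin probabilities."""
    n_bins = 360 // bin_width
    counts = Counter(int((a + 180.0) % 360.0 // bin_width) for a in angles)
    total = len(angles) + n_bins  # add-one smoothing: empty bins stay possible
    return [(counts.get(i, 0) + 1) / total for i in range(n_bins)]


def torsion_probability(hist, angle, bin_width=10):
    """Probability of the histogram bin that contains the query angle."""
    return hist[int((angle + 180.0) % 360.0 // bin_width)]


def pose_log_score(pose_torsions, histograms):
    """Sum of log bin probabilities over all rotatable bonds (higher = more typical)."""
    return sum(math.log(torsion_probability(histograms[bond], angle))
               for bond, angle in pose_torsions.items())


# Hypothetical observed torsions for one bond type: tight peaks near 180 and 0 degrees.
observed = {"bond_1": [178, -179, 175, -176, 180, 2, -3, 177, -178, 179]}
hists = {bond: torsion_histogram(vals) for bond, vals in observed.items()}

# Two hypothetical docking poses that differ only in this torsion.
poses = {"pose_A": {"bond_1": 176.0}, "pose_B": {"bond_1": 90.0}}
ranked = sorted(poses, key=lambda name: pose_log_score(poses[name], hists), reverse=True)
```

A simple histogram like this splits a peak centered on ±180° across the first and last bins; a more careful treatment would use circular statistics or wrap the bins, and would weigh sparse distributions with caution.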
The simplest use of the CSD in conformer generation is for validation; when run on a molecule from the CSD, a conformer generator should produce at least one conformation close to the crystallographically observed geometry. While the PDB is the most common source of test molecules, the CSD is also used very frequently because its experimental precision is much better.57,368−371 More interesting, though, is when the CSD plays an intimate role in the conformer generation. For example, distance geometry is a fast methodology for generating conformers but can easily produce solutions with obvious errors, such as nonplanar aromatic rings. In the ETKDG program, Riniker and Landrum therefore coupled distance geometry with a minimization step based in part on torsion potentials derived from CSD distributions.371 CSD data play an even bigger part in three recent conformer generators: CONFECT,372 BCL::CONF,61 and the “CSD Conformer Generator”.57 These programs share the same basic methodology. The molecule to be processed is divided into fragments. Conformations are built up by joining the fragments together incrementally, using torsion distributions from the CSD to select rotatable-bond geometries with acceptable probabilities of occurrence. Each resulting conformer is scored by combining the individual probabilities together; an atom-clash scoring term or minimization step is also needed to avoid unacceptable nonbonded contacts. The underlying torsion libraries were briefly discussed in section 2.1.3. The scoring function of the CSD Conformer Generator was mentioned in two interesting papers. In one, it was noted that the scores do not correlate well with calculated conformer energies (this is no surprise as, e.g., long-range electrostatic interactions will influence only the latter).52 In the other, the scoring function was one of several used to rank conformers on a variety of different properties, it being argued that this “pluralistic ranking” was more useful than ranking on one criterion alone.373 Other drug-discovery applications in which CSD conformational data have found a use include pharmacophore elucidation374 and protein−ligand docking. A recent example of the latter is WScore, a program distributed by Schrödinger, Inc.375

3.2. Molecular Recognition

Inferring the likely interactions between a putative ligand and the binding site for which it is intended is obviously fundamental to structure-based drug design. As we have seen, CSD analyses have made an enormous contribution to our collective understanding of molecular interactions. This in itself is an important contribution to drug discovery, even if not always consciously recognized. In this section, however, we discuss only overt uses of the CSD and the derived knowledge base IsoStar376 and programs whose implementation has depended on CSD data. We also avoid the topics covered in the excellent “A Medicinal Chemist’s Guide to Molecular Interactions” of Bissantz et al.201

Despite the undoubted value of the CSD, two issues should be borne in mind when close nonbonded contacts in small-molecule structures are used to predict what might occur between proteins and ligands. First, contacts across crystallographic inversion centers are often favored by crystal packing and may not always be good models for interaction likelihoods in chiral environments (section 2.2.3). Second, the ratio of H-bond donors to acceptors is, on average, appreciably higher in proteins than in small-molecule crystal structures, so H-bonds to weak acceptors are more likely to form in protein−ligand complexes than in CSD structures (section 2.2.1).

IsoStar is an intermolecular-contacts knowledge base derived from the CSD. The scatterplots stored therein reveal quickly and easily how common functional groups interact. For example, the left-hand plot in Figure 20 shows the spatial

Figure 20. CSD-based IsoStar plots of (left) phenyl groups around oxazole and (right) N−H and O−H around 1,3,4-thiadiazole. In the right-hand plot, each contact is shown twice, the extra one being generated by reflection in the mirror plane of the thiadiazole moiety perpendicular to the plane of the figure.

distribution of close phenyl···oxazole contacts in the CSD, overlaid so that the oxazole ring is placed in a central reference position. It is clear that the two rings often stack (individual examples could be viewed by hyperlinking to structures contributing to the plot). The right-hand plot is the distribution of OH and NH groups around 1,3,4-thiadiazole. It shows that (a) the nitrogen atoms are good but directional H-bond acceptors: it is rare for donor groups to point to the


Figure 21. Example enzyme with atomic SuperStar propensities (left), propensities weighted by degree of burial (center), and fragment hotspot map (right). Yellow, hydrophobic map; blue, H-bond donor; red, H-bond acceptor; magenta, reference ligand. Reprinted from ref 383. Copyright 2016 American Chemical Society.

Figure 22. Motifs in the crystal structure of a potent GPR119 agonist. The head-to-head motif between methylsulfone groups occurs very frequently in the CSD and was deduced to be a key factor in increasing lattice energy and thereby reducing solubility. Adapted from ref 387. Copyright 2012 American Chemical Society.

profitably be added. SuperStar maps using methyl and hydroxyl probes were used to identify an appropriate position for this functionality. This was followed by docking studies to decide the exact nature of the side chain. Good inhibitors with the desired physicochemical properties were obtained. Other example SuperStar applications include its use to support hypothesized docking poses381 and to identify structural waters in protein.382 SuperStar indicates where single interactions can occur, not where complete fragments can be positioned. A modified algorithm has been developed to address this shortcoming.383 Three SuperStar maps are generated to indicate favorable grid points for carbonyl oxygen, uncharged NH, and aromatic CH. Deeply buried grid points with high SuperStar scores are then examined. The aim is to determine whether there is sufficient room around the point for the binding atom and an attached hydrophobic fragment, and, if so, in which orientation(s) the fragment can lie. In this way, fragment hot-spot maps are produced. As Figure 21 shows, they are much more discriminating than the original SuperStar maps. A recent application of the program was published, directed at inhibition of acetylcholinesterase at allosteric sites.384 Metal coordination is sometimes relevant in drug discovery, and the CSD is the obvious source of information about coordination geometries. Its most common use is to assist in

middle of the N=N bond, and (b) the thiadiazole sulfur scarcely if ever acts as an acceptor (only contacts shorter than the sum of vdw radii minus 0.4 Å are shown in the plot; the complete distribution contains 1316 contacts of which no more than two or three could reasonably be interpreted as possible H-bonds to the sulfur atom). The distributions can be converted to contoured interaction−density plots. The main limitation of IsoStar is its effective restriction to a precalculated set of scatterplots. Customized scatterplots can be added, but the procedure is tedious and doubtless deters most users. Nevertheless, the system has many uses in drug discovery. One particularly novel application occurred in an investigation of HIV-1 integrase mutations that cause resistance to the drug raltegravir.377 Inspection of IsoStar scatterplots helped the authors to conclude that the mutations change the DNA bases recognized by the viral enzyme, which may be relevant because raltegravir might act as an adenine bioisostere.

SuperStar is a program for identifying binding hot-spots in protein cavities by combining relevant interaction−density plots from IsoStar.378,379 An example of its use is given in a paper by Ruf et al.380 They had a crystal structure of cathepsin A complexed with a ligand of reasonable potency but suboptimum physical properties. In order to modify log D but avoid induction of cytochrome P450 3A4, it was decided that a nonpolar side chain bearing a hydroxyl group could

DOI: 10.1021/acs.chemrev.9b00155 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

designed to find new leads for drug discovery projects. 3D searches of the CSD also have the advantage that the molecular conformations therein are proven to exist. There is, however, an attendant disadvantage: a CSD search might produce interesting hits but physical samples are unlikely to be available for testing. Therefore, routine searches aimed at finding compounds to assay are preferentially performed on corporate compound collections. The diversity of the CSD can still be exploited, but only in projects where synthetic effort is available. Despite this limitation, searching the CSD still has many uses, as the following examples will illustrate. The relevant database search techniques go under different names: pharmacophore searching, scaffold hopping, searches for bioisosteres or linker groups, shape matching, and so on. However, these methods overlap, and it can sometimes be difficult to assign a given search to its appropriate classification. Our first example, however, is clearly a 3D pharmacophore search. Fry et al. decided to design and synthesize small compound libraries targeted at protein−protein interfaces (PPIs).390 This was felt necessary because conventional compound libraries had been found to give unusually low hit rates in high-throughput assays. One common PPI involves the interaction between an α-helix in the first protein and three lipophilic pockets in the second. A PPI crystal structure was used to set up a 3-point pharmacophore to represent the latter. The CSD was searched for this pharmacophore. The hits were used to identify structural types that might inhibit PPIs and hence should be included in the libraries. Several libraries were designed in this way, while others were based on in-house expertise gleaned from previous PPI projects. The libraries were used for a variety of PPI assays. While some produced nothing useful, others gave highly diverse hits with activities in the micromolar range. 
There was no systematic difference in the success rates of the two approaches. Our second example is effectively a 2D pharmacophore search, aimed at the de novo design of fragments for binding to hepatitis C virus nonstructural 5B polymerase.391 The starting point was a crystal structure of the enzyme with a bound fragment that had been found by screening. Attempts to improve potency by modifying this fragment failed, but the crystal structure enabled binding points on the enzyme to be identified, including a pair of H-bonding sites. In order to target these sites, the CSD was searched for ring systems containing an NH donor and a N or O acceptor separated by 2, 3, or 4 bonds. The hits were docked, subject to constraints; the results manually examined; and targets identified for synthesis and iterative refinement. This resulted in several hydantoin and pyridone derivatives active at the micromolar level. The CSD can be a useful database for suggesting bioisosteres. For example, a known series of fructose-1,6-biphosphatase inhibitors contained an indazole moiety, and novel analogues were sought. Unspecified CSD searches found a successful bioisostere of the indazole ring system, viz. a urea-substituted pyridine containing an intramolecular H-bond.392 Another example was discussed by Sun et al.393 The 3D structure of a known inhibitor of Bcl-xL (B-cell lymphoma extra large) contained two phenyl rings. In its protein-bound conformation, these rings were stacked with each other, and each was also stacked with an aromatic protein side chain. A 3D search of the CSD found a molecule with a very similar intramolecular phenyl−phenyl stacking arrangement but a

the modeling of metal coordination spheres in protein crystal structures. In a rather unusual example, Brink and Helliwell performed crystallography with tunable synchrotron radiation to investigate ways in which fac-[Re(CO)3(H2O)3]+ can bind to hen egg-white lysozyme.385 The relevance is that rhenium-188/186 tricarbonyl complexes are potential therapeutic and diagnostic radiopharmaceuticals. Binding to several side chains was found to occur (e.g., Asp, Glu, Arg, and His) and in a variety of modes (mono-, bi-, and tridentate). Extensive searching of the CSD was used to assist the interpretation of the crystallographic results (e.g., to identify coordination denticity). The diversity and geometries of the amino-acid binding sites found for the metal ion should assist the development of Re/Tc compounds for site-specific binding to proteins in radiopharmaceutical applications. The CSD played an essential part in the development of an algorithm for predicting water molecules in protein−ligand interfaces, particularly those that bridge between polar groups on the protein and ligand. Its role was to identify appropriate geometrical criteria for the intermolecular interactions of water.386 A promising use of CSD interaction data is to suggest ways in which aqueous solubility can be increased, a recurrent requirement in drug design. A notable example was published by Scott et al. of AstraZeneca.387 They examined the molecular interactions in the crystal structure of a potent but inadequately soluble GPR119 agonist. Observing a head-to-head intermolecular interaction between aryl-bound methylsulfone groups (Figure 22), they searched the CSD and found that this was a very common motif. The obvious conclusion was that this type of packing arrangement contributes substantially to lattice energy, thereby reducing solubility. Replacing aryl methylsulfone with a cyanopyridyl group retained potency and increased solubility by 2 orders of magnitude.
Another interesting solubility-related result emerged from a matched molecular-pair analysis of 1,2,4- and 1,3,4-oxadiazole compounds taken from a corporate compound collection (i.e., comparison of pairs of molecules that differ only in the oxadiazole isomer).388 In the majority of pairs, the log D values differed very significantly, the 1,3,4 isomers being less lipophilic, but the difference in aqueous solubilities was often within experimental error. In other words, the expected inverse relationship between these parameters broke down. It was inferred that the 1,3,4 isomers might pack with greater lattice energies, hence reducing their solubility. In support of this idea, 1,2,4-oxadiazoles are more common than 1,3,4 in both the patent-application and SciFinder databases, but the reverse is very much the case in the CSD. Therefore, 1,3,4 isomers probably find it easier to crystallize, consistent with the idea that they form more stable packing arrangements.

3.3. The CSD as a Diverse Chemical Database

It is very hard to judge reliably the diversity of a chemical database. Molecular similarity is an ill-defined concept,389 so molecular diversity is too. We often feel that the results of database diversity analyses say more about researchers’ criteria than about database content. Nevertheless, we will fall into the same trap and declare that the CSD is very diverse. Our reason is simple: if a synthetic chemist makes an interesting or unusual compound, you can bet that he or she will ask for its crystal structure to be solved. So the CSD is replete with novel molecules and can be an attractive database to use for searches


or both of the CSD and PDB. It offers a particularly extensive range of options, including searching for pharmacophore features, ligand scaffolds, excluded regions of space, and exit vectors. Crucially, features can be in the same or different molecule(s), so searches can include intermolecular contacts, either between small molecules or between ligand and protein.400 Two other interesting software developments, both shape-matching algorithms, were reported by Awale et al. and Spackman et al.401,402

different link between the phenyl groups. Use of this new link led to a novel inhibitor. Further examples of bioisosteric replacements were discussed by Groom et al.394 The SPARK scaffold-hopping program distributed by Cresset can search a CSD-derived fragment library for bioisosteres by matching steric and electrostatic fields.395 Other scaffold-hopping programs that can search fragment libraries derived from the CSD are available from Chemical Computing Group396 and BioSolveIt (ReCore397). Two examples of the use of ReCore were described by Kuhn et al.358 The aim of the first was to find novel inhibitors of β-tryptase. A crystal structure of the enzyme complexed with a known inhibitor indicated that groups occupying the S1 and S4 pockets were linked by a moiety that had little contact with the protein. A search for alternative linking groups was performed. Incorporating one of the groups thereby found into the original inhibitor gave a novel compound with an inhibition constant (Ki) of 50 nM. A subsequent crystal structure confirmed that the S1 and S4 pockets were occupied as before, but the new linker took a different route, enabling it to form an excellent H-bond to a protein side chain. Part design, part luck! In the second example, the researchers wished to replace a meta-substituted phenyl ring in the center of a BACE1 inhibitor with a more polar linking group. The objective was to improve physicochemical properties. A ReCore search found a trans-cyclopropylketone linker in CSD entry FUQGAZ that looked an almost ideal geometric match. This led to a slightly more potent inhibitor with lower log D and improved aqueous solubility (Figure 23). Also connected with scaffold hopping,

4. EMERGING APPLICATIONS

4.1. Challenges in Drug Development

Selection and characterization of the drug solid form to be commercialized are key steps in the pharmaceutical development process. The specific solid form chosen can affect many important physical properties, such as solubility, bioavailability, and stability,403 but the ability to robustly control the solid form, avoiding any phase transformations, is also critical for regulatory approval.404 While theoretical and knowledge-based modeling has been established for many years in the drug discovery field, it is only relatively recently that these approaches have started to become established in drug development.405 Over the past decade the field has developed from simply analyzing observed solid forms to assessing the risk of polymorphism, designing new solid forms, and predicting the macroscopic behavior of solid forms.406 The development of knowledge-based approaches (solid form informatics) has progressed alongside, and been driven forward by, the evolution of pharmaceutical materials science in industry.

4.1.1. Interactions and Packing. In the early years of focused knowledge-based applications in the drug development arena, the starting point was simply to utilize emerging tools for searching and visualizing crystal structure features to better understand each crystal structure. This field was catalyzed in 2002 by the creation of the Pfizer Institute for Pharmaceutical Materials Science (PIPMS),407 a collaboration between Pfizer Global R&D, the University of Cambridge, and the CCDC which, among many other outputs, delivered a range of new knowledge-based software tools. These new approaches, developed within the popular crystallographic visualizer Mercury,408 allowed more sophisticated options for searching motifs and packing features as well as assessing crystal packing similarity.409,410 The first of the new methods to be implemented, MotifSearch, allows fast searching of the CSD for topological patterns of intermolecular connections.
This makes it easy to compile and compare statistics about how common a set of interaction networks, for example H-bonding patterns, are within the database. Haynes et al. illustrated the use of this approach to study the H-bonding of sulfonate salts and identified that there is a particularly robust R(2,2)8 ring motif that occurs with a probability of about 75% in the CSD.411 Fábián et al. similarly used this analysis method to study hydrophobic amino acids and again picked out frequently occurring motifs such as the very common R(6,6)26 ring motif, found in 85% of the structures, and the smaller R(3,4) (14) motif.412 Going beyond searching on connections, which ignores the geometry of the interactions, a further method, Packing Feature Search, was developed to search for and visualize the geometry of features. This approach is extremely flexible and

Figure 23. Original BACE1 inhibitor (19) and a more polar analogue (20) containing a replacement fragment suggested by a CSD entry. Adapted from ref 358. Copyright 2016 American Chemical Society.

two recent CSD analyses examined the exit vectors (i.e., directions of the attachment bonds) of various disubstituted saturated rings. It was discovered that such rings do not necessarily provide an effective way of introducing three-dimensionality into a molecule.398,399 We will finish this section with some recent software developments. CSD-CrossMiner can be used to search either


surprisingly, this study illustrates that it is quite uncommon for sodium and potassium salts of the same compound to be isostructural and, even when the structures are very similar, the physical properties can be very different. It also showed that comparison of 3D packing arrangements is needed to determine similarity or isostructurality; comparison of unit cell dimensions or powder diffraction patterns may be insufficient. Toward the end of the PIPMS collaboration, the baton for driving forward the progress of knowledge-based methods for solid form development was taken on by the newly developed Crystal Form Consortium (CFC), which was established in 2008.425 This highly successful consortium gathers the input and insight of the leading pharmaceutical and agrochemical companies to provide direction for continued development. The CFC has now been running for 10 years and has delivered many of the approaches and research studies discussed in the following subsections.

4.1.2. Risk of Polymorphism. Control of the solid form, as mentioned earlier, is a crucial part of the regulatory process for pharmaceutical products. There are a number of well-known examples of the impact on both patients and drug companies of the surprising appearance of new polymorphs late in the process, including the HIV drug ritonavir9 and the Parkinson’s drug rotigotine.426 Solid form informatics approaches have progressed beyond analysis of individual atom- or functional group-based aspects of a crystal structure, such as measuring individual H-bonds, or looking at how unusual a given torsion angle is, to more complete molecular assessments. The latest approaches now provide assessment of complete molecular geometries,57 binding hot-spots around molecules (full interaction maps, FIMs),427 and analysis of complete H-bonding networks (hydrogen bond propensities, HBPs).108,428 The FIM method is closely analogous to that of SuperStar (section 3.2).
The HBP method uses CSD data to estimate the likelihood of a H-bond between each possible combination of the donor and acceptor groups in one or more target molecules. Hence, it can identify polymorphs that may be metastable because they contain low-probability H-bonds. Originally developed for intermolecular interactions, HBP can, after suitable modification, also be applied to intramolecular H-bonds.429 Most major pharmaceutical companies now apply these sophisticated solid form informatics methods together, on every drug development candidate, to produce a full picture of the risk of polymorphism. Galek et al. in 2012 first illustrated this principle of combining the analysis of all aspects of a crystal structure (including molecular geometry, intermolecular interactions, and supramolecular packing) to give a full picture for the pharmaceutical compound lamotrigine, the half millionth structure added to the CSD.430 For lamotrigine, the overall assessment indicated that the pure solid form of the compound looked stable, but that there was a lot of potential for cocrystallization. This appears to have been a reasonably accurate prediction as there are now 10 cocrystals of lamotrigine in the CSD, half of which were published after the 2012 Galek paper, plus 13 solvates and 59 salts, but still no new pure polymorphs of the compound. In subsequent years, Abramov,431 Ismail et al.,432 and Price et al.433 demonstrated the value of using knowledge-based assessment alongside experimental polymorph screening and/or purely computational approaches like CSP for axitinib and

can be used to search for any 3D arrangement of atoms, whether intramolecular, intermolecular, or across more than two molecules. Pogoda et al. showed how the approach could be applied to intramolecular geometries;413 it helped them distinguish the relative prevalence of syn−anti−anti−anti and syn−anti−syn−syn conformations within a data set of semicarbazones, the latter proving predominant. Utilizing the method for an intermolecular search, Johnston et al. determined the frequency of occurrence of four common stacking interaction motifs within a crystal structure prediction (CSP) landscape414 of chlorothiazide structures.415 Maloney et al. highlighted the ability of the Packing Feature approach to investigate any kind of interaction by comparing methyl−methyl interactions in the structures of the primary amines.416 The ability to easily visualize and determine quantitatively the crystal packing similarity410 of a whole family of crystal structures was first developed to aid filtering of CSP landscapes.417 The similarity method avoids using space group information, relying instead on the comparison of molecular packing environments. The approach was subsequently found to be very helpful in understanding families of polymorphs, cocrystals, solvates, hydrates, and salts of a given compound. Pharmaceutical compounds are prone to generating a very diverse solid form landscape, often including all of the structural types just mentioned, and it can be a daunting task trying to make sense of the structural relationships. Recent packing similarity studies have been performed on carbamazepine structures418 (Figure 24), salt forms of methylephedrine,419 salt forms of tyramine,420 cocrystals of meloxicam,421 solvates of trospium chloride,422 and structures of vitamin D analogues.423 In every study, the approach was easily able to pick out cases of isostructurality, as well as families of structures with closely related packing features that may be related to properties or behavior of the crystals. This kind of packing analysis across families of up to 50 or more different crystal structures would be impossible to perform manually, and the results are typically very enlightening in relation to the solid form behavior. The methods are not restricted simply to multicomponent forms of a given drug molecule, as Wood et al. illustrated in a CSD-wide analysis of isostructurality (or lack of) in Na+ and K+ salts of organic molecules.424 Perhaps

Figure 24. Three molecular pair arrangements common in structures containing carbamazepine and found by use of packing similarity software. Reprinted from ref 418. Copyright 2009 American Chemical Society.


donors and acceptors, as well as the robustness of synthons in a H-bond-focused cocrystal design approach. An orthogonal, and complementary, approach makes predictions of cocrystallization outcomes on the basis of molecular descriptors, rather than H-bonding.439 This method, based on the principle that cocrystallizing molecules tend to have similar molecular descriptors (in particular, shape and polarity), has been applied successfully in the cases of artemisinin440 and diflunisal441 to narrow a large list of possible coformers by removing those unlikely to cocrystallize. A further study by Karki et al. showed how the use of molecular descriptors helped find an appropriate coformer (theophylline) to generate a cocrystal of paracetamol with improved tablet compression properties.442 While being a relatively simple approach scientifically, this method is very quick to run and has been shown to be effective in reducing down a large library of potential coformers by filtering out those that are unlikely to cocrystallize. Because of its speed, the approach lends itself well to use in the initial stages of a multistage coformer screening methodology, with more computer-intensive analyses then being used to further rank the coformers. A H-bonding knowledge-based approach can be applied in cocrystal design either on its own or following a molecular descriptor-based filtering step. All H-bonding cocrystal design studies based on CSD data follow essentially the same basic flow but differ in how much sophistication is used in analyzing the statistics and how much user input is required. 
The functional groups in the target molecule and possible coformers are identified, and the robustness of the possible homomolecular and heteromolecular interactions is determined (from CSD statistics); then coformers are selected based on those which are observed to have more robust interactions with the target than the target has with itself.443 Manual searching and analysis of CSD statistics has been applied in cocrystal design by many groups in recent years. Oswald et al. used H-bond geometry data from the CSD to help rationalize the interactions observed in cocrystals of paracetamol.444 Synthon frequency-of-occurrence statistics obtained from CSD searches using ConQuest445 were used by Moragues-Bartolome et al. to understand cocrystals of cis-carboxamides with carboxylic acids281 and by Wang et al. to design cocrystals of hydrochlorothiazide.446 Mapp et al. also used manual CSD and IsoStar searches in two recent successful attempts to design cocrystals for lonidamine447 and propyphenazone,448 resulting in nine and eight new cocrystals, respectively. In each of these cases, the use of CSD statistics clarified the experimentally observed cocrystallization behavior, but setting up CSD searches and interpreting the results requires significant manual input. It is therefore helpful to automate the approach as well as to utilize more information from every structure rather than just basic frequency-of-occurrence counts. The HBP approach discussed in section 4.1.2 allows sophisticated H-bonding likelihoods to be determined from the CSD, considering detailed per-structure explanatory variables such as competition, steric hindrance around functional groups, and aromaticity of the molecules. Wood et al.
explained how the HBP approach can be applied to cocrystal design via determination of likelihoods of all possible interactions in a theoretical cocrystal between a target molecule and a given coformer.443 They assumed that a H-bonding drive toward a cocrystal exists if the best

crizotinib (Pfizer compounds), GSK269984B (GlaxoSmithKline), and tazofelone (Eli Lilly), respectively. The general conclusions from these studies were that the knowledge-based analysis provided valuable insight into the solid form stability, alongside the experimental and theoretical information already determined. In the case of tazofelone, for example, FIM allowed visualization of the most likely interactions and indicated those interactions that were deviating from their ideal geometry, illustrating the trade-off between ideal interaction geometry and close packing. Some academic research groups have also looked at applying the HBP approach, on its own, to the assessment of the likelihood of polymorphism. Nauha and Bernstein studied the compounds bufexamac and meglumine434 and probenecid435 and noted that different polymorphs may have the same H-bonding networks, in which case they will be rated equally by HBP. Overall, however, these researchers found the method helpful in providing an indication of whether multiple forms were likely or not. A likely stumbling block in their studies appears to be the exclusive reliance on HBP, rather than a more holistic analysis. On its own, the HBP approach has its limitations, but these studies show that, especially when combined with other techniques, it gives a very valuable overview and understanding of the solid form, particularly around management of risk of polymorphism. This knowledge can either encourage more experimental solid form screening to be carried out or provide confidence to continue with a stable form. The solid form informatics studies described by Feeder et al.
of three live Pfizer drug development compounds (maraviroc, a pain candidate, and crizotinib) show the most complete application of these methods in solid-form selection.406 In the case of maraviroc, IsoStar analysis (section 3.2) highlighted an unusual intermolecular interaction geometry; this prompted more crystallization experiments, which uncovered a new polymorph that had the more common interaction geometry. For the pain candidate, analysis of interactions, packing, and molecular geometry provided reassurance that the structure was stable, despite a slightly low-probability conformation. Finally, in the case of crizotinib, the analysis of geometry, interactions, and packing all indicated that the solid form was stable and had a low likelihood of polymorphism; alongside experimental results and CSP analyses, this allowed the drug development to progress with confidence.

4.1.3. Cocrystal Design. There is a long history of cocrystal design based on the predictability of interactions using knowledge from the CSD. There are two key fundamental concepts upon which most knowledge-based cocrystal studies are based. The first was introduced by Etter in 1990 as her third rule for organic compounds: “The best proton donors and acceptors remaining after intramolecular hydrogen bond formation form intermolecular hydrogen bonds to one another.”436 If you can determine which are the “best” proton donors and acceptors in a system, and then engineer these donors and acceptors to be on different molecules, you should, in principle, be able to predict the formation of a cocrystal. Building on this concept, the second key principle was introduced by Desiraju in 1995: the idea of the supramolecular synthon as a robust and predictable intermolecular interaction for crystal engineering (section 2.3.1).272 Subsequent researchers, particularly Aakeröy437 and Zaworotko,438 built on these principles to illustrate how CSD statistics could be used to quantify the relative strength of


Figure 25. (a) Possible outcomes of predicting which of a set of potential coformers will cocrystallize with a target compound. (b) Results from a recent application of multicomponent H-bond propensity analysis: successful cocrystallization was predicted if MCcutoff > 0 (MCcutoff = propensity of best hetero H-bond minus propensity of best homo H-bond). (c) Better results were obtained using MCcutoff > −0.1. Reprinted from ref 451. Copyright 2017 American Chemical Society.

structure of solid forms in relation to drug development. There is, however, also a growing interest in using modeling approaches, including knowledge-based applications, to analyze and predict particle properties like morphology, surface characteristics, and mechanical behavior. The potential to combine prediction of morphology with a knowledge-based assessment of likely interactions (FIMs, section 4.1.2) at surfaces was illustrated by a 2013 study.427 Interaction maps were plotted on the BFDH-predicted morphology458 of cipamfylline to help rationalize the different growth rates at key surfaces. The fast-growing (001) plane is perpendicular to a H-bonding direction, whereas the slow-growing (010) plane shows no strong, directional interactions. Mugheirbi and Tajber459 and then Serrano et al.460 have subsequently used this method for itraconazole (Figure 26) and salbutamol sulfate, respectively. In each case, the use of FIMs alongside the morphology helped rationalize the likely interaction behavior at surfaces and hence the surface characteristics. This and similar methodologies may facilitate rational choices of crystallization solvents461 or doping agents in order to restrict growth at specific surfaces and hence change morphologies. Bryant et al. have recently applied sophisticated, structure-based approaches through the CSD Python Application Programming Interface (API)462 to predict mechanical properties of solid forms.463 The CSD Python API allows users to write Python scripts that will access, search, and analyze CSD data programmatically. Their impressive work combines a three-dimensional assessment of the interdigitation of the layers in a crystal structure together with analysis of H-bonding dimensionality to predict the most plausible slip planes (Figure 27).
When used in conjunction with other structural descriptors, like plane spacing, the approach is shown by the authors to be extremely valuable, correctly predicting the relative tabletabilities measured for multiple drugs. This study is suggestive of the potential scope of the CSD Python API in the future, especially for the programming of knowledge-based

heterointeraction is more likely than the best homointeraction. The example studied was paracetamol, for which 35 potential coformers had been experimentally screened. Significant enrichment was obtained: 9 out of the 17 coformers (53%) that were predicted most likely to form cocrystals with paracetamol (difference in likelihoods between best hetero- and best homointeraction ≥0.10) were shown to cocrystallize experimentally, compared with an overall coformer success rate of 40%. Delori and colleagues have applied these principles in two different cocrystal studies focusing on pyrimethamine.449,450 While the authors did not report extensive experimental screening results for pyrimethamine, they did suggest that the HBP analysis was useful as part of the screening process, alongside consideration of the difference in pKa between the target and coformers. They reported seven and six new adducts (salts or cocrystals) of pyrimethamine in the 2012 and 2013 papers, respectively, all of which were predicted to be likely based on the solid form informatics results. Another recent study by Sandhu et al. applied this multicomponent HBP approach to the prediction of cocrystals for six target compounds against 20 different possible coformers.451 Across the resulting 120 cocrystallization experiments, 41 out of the 47 coformers predicted to be likely to cocrystallize (difference in likelihoods ≥0.00) were shown to form cocrystals, compared to an overall coformer success rate of 52% (Figure 25). While still an evolving research method, this utilization of H-bonding likelihoods does show signs of promise in helping to optimize cocrystal screening experiments.
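The decision rule and the enrichment figures quoted above can be illustrated with a minimal Python sketch. It is not the published HBP implementation: the class, function names, and propensity values are all hypothetical, and only the counts (9 of 17, 41 of 47) and baseline success rates (40%, 52%) come from the text. The score is taken as the best hetero propensity minus the best homo propensity, and the comparison uses "≥" as in the text, whereas the Figure 25 caption writes ">".

```python
from dataclasses import dataclass


@dataclass
class CoformerScreen:
    """One target/coformer pair. Field names are hypothetical."""
    coformer: str
    best_hetero: float  # propensity of the most likely target-coformer H-bond
    best_homo: float    # propensity of the most likely like-with-like H-bond


def multicomponent_score(pair: CoformerScreen) -> float:
    """Best hetero propensity minus best homo propensity."""
    return pair.best_hetero - pair.best_homo


def predicted_coformers(pairs, cutoff=0.0):
    """Names of coformers whose score meets the chosen cutoff."""
    return [p.coformer for p in pairs if multicomponent_score(p) >= cutoff]


def enrichment(hits, selected, baseline_rate):
    """Success rate within the selected set divided by the overall success rate."""
    return (hits / selected) / baseline_rate


# Invented propensities, for illustration only.
screen = [
    CoformerScreen("coformer_A", best_hetero=0.82, best_homo=0.61),
    CoformerScreen("coformer_B", best_hetero=0.45, best_homo=0.70),
    CoformerScreen("coformer_C", best_hetero=0.66, best_homo=0.72),
]
print(predicted_coformers(screen))               # strict cutoff of 0.0
print(predicted_coformers(screen, cutoff=-0.1))  # relaxed cutoff, as in Figure 25c

# Counts and baseline rates quoted in the text.
print(round(enrichment(9, 17, 0.40), 2))   # paracetamol study
print(round(enrichment(41, 47, 0.52), 2))  # six-target study
```

Relaxing the cutoff admits coformers whose best hetero interaction is only marginally less likely than the best homo interaction, which is the trade-off between missed cocrystals and wasted experiments that panels (b) and (c) of Figure 25 compare.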
Of course, other factors are relevant (e.g., conformational preferences, possibility of solvation), and several recent studies using the CSD indicate that prediction of cocrystal formation remains a challenging problem.328,452−454 A further complication is that cocrystals are perfectly capable of exhibiting polymorphism,455−457 raising the concerns discussed in section 4.1.2.

4.1.4. Morphology and Other Physical Properties. We have so far focused entirely on the atom-scale, internal


inter alia, 3D molecular descriptors calculated from CSD structures. Unfortunately, the inclusion of these descriptors had little effect on prediction accuracy, despite the undoubted relevance of crystal structures to solubility; the authors suggested that there were probably limitations in the descriptors used.466

4.2. Other Industrially Relevant Applications

The use of the CSD in the pharmaceutical and agrochemical industries is now well-established; as described above, this began in the 1980s in drug discovery and then developed to encompass solid form research and design. In the past decade, more examples of the use of the CSD in the chemical industry have appeared, including research into new energetic materials, paints, dyes, catalysts, gas storage and separation materials, thin films, and other advanced functional materials.

4.2.1. Energetic Materials. Investigations of existing energetic materials and design of new materials with improved properties are aided by a knowledge of their solid-state properties. The performance of these materials can depend on factors such as crystal density, crystal morphology and chemical stability, all of which can be affected by the specific crystal packing pattern.467 Research groups are using the CSD in developing structure−property relationships to understand density trends,468,469 as well as to find inert density mock (i.e., surrogate) materials for safe research into energetic compounds like HMX.470 Co-crystallization appears to be emerging as a new option for carefully tailoring the properties of energetic materials, such as reducing the chemical instability or impact sensitivity of a material while retaining the detonation power.471,472 The CSD should continue to provide opportunities in this area as approaches in knowledge-based cocrystal design evolve.443 It also seems possible that the CSD could assist the design of explosive sensors by suggesting molecules that form strong π−π and H-bonding interactions with explosives such as picric acid. We are unaware of any such application of the CSD, but a recent publication473 suggests the idea.

4.2.2. Paints, Pigments, and Dyes. In paints, pigments, and dye materials, there are signs that the CSD is now being considered as a source of fundamental knowledge to rationalize structure−property relationships, as well as identify promising materials. A 2012 paper described how the CSD can be used to pick out common structural features in effective Ru- and Fe-based dyes and deliver new insights into the molecular origins of the dye properties.474 That work was focused specifically on dyes for use in dye-sensitized solar cells (DSSC), which are promising contenders for the next generation of photovoltaic technology. However, metal-containing dyes may be too expensive and environmentally unfriendly, so a later study looked for organic alternatives.475 Over 100 000 organic CSD structures were searched for molecules containing (donor)−(π-system)−(acceptor) moieties with appropriate charge separation and good conjugation (measured by the degree of bond-length alternation in the π system). About 500 structures satisfied these conditions. They were further filtered to give a single lead, which was synthesized and found to have encouraging DSSC efficiency, thereby opening up a new class of organic dyes. More recently, Veits et al. used the CSD for the rational design of a gel-based sensor for lead in paint through an understanding of intermolecular interactions and crystal morphologies.476
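The bond-length alternation (BLA) criterion used in the organic-dye search can be illustrated with a small sketch. It assumes one common definition of BLA, namely the mean length of the formally single bonds minus the mean length of the formally double bonds along a conjugated path; the cited study may have used a different variant. The function name and the bond lengths below are invented for illustration and are not taken from any CSD entry.

```python
from statistics import mean


def bond_length_alternation(lengths, orders):
    """Mean single-bond length minus mean double-bond length (in angstroms).

    `lengths` are bond distances along a conjugated path and `orders` are the
    formal bond orders (1 or 2) assigned to the same bonds.
    """
    if len(lengths) != len(orders):
        raise ValueError("lengths and orders must have the same number of items")
    singles = [d for d, o in zip(lengths, orders) if o == 1]
    doubles = [d for d, o in zip(lengths, orders) if o == 2]
    if not singles or not doubles:
        raise ValueError("need at least one single and one double bond")
    return mean(singles) - mean(doubles)


# Invented C-C distances for two hypothetical pi-bridges.
localized = bond_length_alternation([1.34, 1.46, 1.34, 1.46], [2, 1, 2, 1])
delocalized = bond_length_alternation([1.38, 1.41, 1.38, 1.41], [2, 1, 2, 1])

print(f"localized bridge:   BLA = {localized:.2f} A")
print(f"delocalized bridge: BLA = {delocalized:.2f} A")
```

Under this definition, a smaller BLA indicates more even bond lengths and therefore better conjugation through the π system.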

Figure 26. Full interaction maps showing positions on the (010) and (001) faces of itraconazole where H-bond donor, H-bond acceptor, and hydrophobic groups are likely to bind. The (010) face shows more preferred H-bonding sites. Water solvent molecules would therefore be expected to interact more strongly with this face, inhibiting its growth. Reprinted from ref 459. Copyright 2015 American Chemical Society.

Figure 27. Two polymorphs of sulfamerazine: topological analysis finds a slip plane for form I (top, in red) but not for form II, where the flattest layer is slightly interdigitated with the adjacent layer. Reproduced with permission from ref 463. Copyright 2018 Royal Society of Chemistry.

tools applied to research areas that have, until now, been on the periphery of traditional CSD applications. The CSD may have a role in the investigation of several other physical properties. CSD cocrystal structures were recently used by Rama Krishna et al. as a training set to derive neural-network models for prediction of melting point, lattice energy, and crystal density.464 Docherty et al. used CSD structures of drug-like molecules in a project aimed at deconvoluting solubility into solvation and crystal-packing contributions. This might help identify the solubility-limiting features of active but inadequately soluble compounds.465 The relationship between solubility and temperature can be important in process chemistry, and Marchese Robinson et al. tried to derive statistical models for its prediction using,


4.2.3. Organic Semiconductors. Schober et al. used the CSD to develop a diverse library of organic molecular structures; combined with DFT transfer-integral calculations, this identified a number of promising semiconductor materials based on their calculated properties (electronic couplings and intramolecular reorganization energies).477 In a later study, the same group performed similar calculations and then clustered the molecules, first on their molecular scaffolds and then on the side groups with which these scaffolds were functionalized.478 This enabled them to obtain significant structure−property relationships for both scaffolds and side chains, raising the prospect of knowledge-based organic semiconductor design. In addition to being diverse, the compounds in the CSD are proven to be crystallizable, a property that, despite recent advances (section 2.1.6), can be difficult to predict.

4.2.4. Nonlinear Optical Materials. In a further example of virtual screening, Cole and Kreiling used the CSD to identify promising organic nonlinear optical (NLO) materials by searching for compounds structurally related to tetracyanoquinodimethane (TCNQ) and studying their bond length alternations (cf. section 4.2.2).479 NLO materials show potential for tailoring to achieve specific properties using crystal engineering concepts like cocrystallization. Wojnarska et al. used the CCDC application Full Interaction Maps (section 4.1.2), alongside other theoretical approaches, to rationally design cocrystals of sulfanilamide with the desired crystal symmetry and NLO properties.480

4.2.5. Ferroelectricity. Ferroelectric materials have a spontaneous electric polarity that can be reversed by applying an external electric field and have many (opto)electronic and electromechanical applications. Like NLOs, they must crystallize in polar space groups. Shi et al. described how pairs of polymorphs can be found in the CSD, one member of a pair being paraelectric, the other ferroelectric.481 The paraelectric forms exist at higher temperatures and are disordered so that, typically, they occupy centrosymmetric space groups. At lower temperatures, they become ordered and switch to polar space groups. Other workers have searched the CSD to find structures in polar space groups that have switchable H-bond networks, e.g. chains of H-bonds between β-diketone enol moieties (O=C−C=C−OH). Reversal of an electric field causes a concerted switch of O−H···O bonds to O···H−O, with concomitant switching of C−C and C=C bonds, thereby reversing the polarity. Redetermination of some of these structures and/or synthesis of analogues has been used to find promising ferroelectric materials, sometimes with other desirable physical properties (e.g., flexibility).482−485

4.2.6. Magnetic Anisotropy and Single-Molecule Magnets. As we saw earlier (section 2.1.5.8), analysis of metal coordination geometries in the CSD can sometimes give insights into the requirements for magnetic behavior, though the structure−property relationships can be complex. One area of industrial relevance is single-molecule magnets (SMMs), which are potentially useful both for data storage and quantum computing. Gómez-Coca et al. noted that a d8 transition metal with trigonal pyramidal coordination has orbital degeneracy that must be broken by Jahn−Teller distortion.486 This should result in a first excited state very near in energy to the ground state, a situation associated with the high magnetic anisotropy required for SMMs. However, the larger the distortion, the greater the energy gap and the lower the anisotropy. They therefore searched the CSD and selected as synthetic targets a number of Ni(II) complexes that had very small distortions from the ideal polyhedron. As a result, they synthesized a mononuclear nickel complex with a huge magnetic anisotropy, the largest that had been reported at that time (Figure 28). This exemplifies the potential use of the CSD for finding novel magnetic materials.

Figure 28. Top: structure of a Ni(II) complex having high magnetic anisotropy, designed with the help of the CSD. Bottom: d orbitals in (left) perfect and (right) distorted trigonal pyramids. Reprinted from ref 486. Copyright 2014 American Chemical Society.

4.2.7. Catalysts. There is a significant amount of unrealized potential in catalyst design and analysis using CSD knowledge. The CSD contains a wealth of experimentally validated structural data associated with both metal−ligand coordination geometries and the geometries of ligands themselves. Two illustrative papers from the 1990s by Müller and Mingos and by Smith et al. showed how Tolman cone angles or cone angle radial profiles (CARPs) of phosphine and phosphite ligands can be derived directly from CSD data.487,488 Analysis of these cone-angle properties can aid rationalization of reaction rates and chemical equilibria through assessment of steric bulk. A similar study has been performed on structure−property trends in ligand bite angles within the CSD by Dierkes and van Leeuwen, where it was shown that CSD ranges for bite angles correlate very well with the results of theoretical calculations.489 More recent applications of the CSD to catalyst research have exploited other types of geometrical information. Novikov et al., seeking to optimize the stereocontrol of olefin epoxidation, hypothesized a relationship between the degree of enantioselectivity and the dihedral angle between the two aryl rings of the catalyst.490 Although only a few relevant structures could be found in the CSD, they provided enough dihedral information to guide design of new biaryl azepinium catalysts which gave enantiomeric excesses that supported the hypothesis. Kulik et al. devised a computational method for catalyst design, which they used to find 4-coordinate Zn(II)Nsp3 complexes in the CSD whose geometries (specifically, Zn−N bond lengths) were predicted to be optimum for catalyzing the hydration of carbon dioxide.491 CSD data on W−C bond lengths and metal-coordinated isocyanide bond angles helped to clarify the bonding in a key intermediate in the cleaving of carbon−carbon bonds by tungsten insertion.492

4.2.8. Gas Storage and Separation. Research in the field of gas storage and separation has been transformed over the last two decades or so with the popularization of metal−organic frameworks (MOFs) as porous materials for selective binding.493−495 MOFs are almost infinitely tailorable to produce different porosity profiles and pore chemistries, but one of the key challenges is rational design or selection of a framework that is experimentally accessible and stable. A number of research groups have tackled this problem by finding and characterizing known MOFs in the CSD,496−498 culminating in a definitive CSD MOF subset (Figure 29).499 This approach means that MOF materials rationally selected for their properties from the subset are already confirmed to be synthesizable and crystallizable. High-throughput screening of MOFs for gas-storage and -separation applications can be achieved with grand canonical Monte Carlo adsorption simulations. Altintas et al. and Moghadam et al. published examples aimed at hydrogen/methane separation and oxygen storage, respectively.500,501 The latter was particularly impressive because the top-ranked MOF retrieved from a CSD-derived MOF database was resynthesized and shown experimentally to deliver 22.5% more oxygen than the best material known beforehand (Figure 30). Park and co-workers have further extended the analysis of known MOFs in the literature and within the CSD to mine further property information using natural language processing techniques.502 These approaches are not limited to gas molecules, of course; Inokuma et al. have shown that similar principles can be used to mine the CSD for potential crystalline sponges to capture larger guest molecules.503 Nor is interest confined to MOFs: a Python application has recently been written to search structures from the CSD for molecular pores, i.e. cage- and belt-like molecules such as buckyballs and cyclophanes.504

Figure 29. CSD MOF collection allows rapid access to geometric properties of chosen MOF subsets; the properties shown here are for the subset with pores large enough to be accessed by a N2 probe molecule. Reprinted from ref 499. Copyright 2017 American Chemical Society.

Figure 30. (a) Calculated oxygen volumetric and gravimetric deliverable capacities of MOFs in a database derived from the CSD. (b) Crystal structure (supercell 2 × 2 × 1) of the top-ranked material, cavities represented by purple spheres. Reprinted from ref 501 under Creative Commons License (https://creativecommons.org/licenses/by/4.0/). Published by Nature Publishing Group.


4.2.9. Thin Films and Coatings. Seiki et al. illustrated that it is possible to rationally design highly stable, oriented organic thin films using the CSD.505 They theorized that a 2D hexagonal close-packed structure is ideal for stable thin films and then found a family of propeller-shaped, rigid molecules (tripodal triptycenes) in the CSD with the appropriate packing properties. The authors managed to generate well-ordered, multilayer films based on this triptycene base, which packs well in 2D, with long alkyl chains extending perpendicular to the plane of the 2D layers (Figure 31).

a variety of ab initio calculations (e.g., of isomerization enthalpies) to identify a number of possible candidates for energy storage.

4.3. Structure Solution

The CSD can assist the determination of small-molecule structures from single-crystal data, e.g. by providing restraint values for the refinement of disordered groups.508 It can also be helpful in NMR crystallography.509 Its major uses, however, are to assist the determination of macromolecular crystal structures and small-molecule structures from powder diffraction data.

4.3.1. Macromolecular Crystal Structure Determination. The superior precision of the CSD makes it a useful accessory for the solution of macromolecular crystal structures. In its first major application of this type, amino-acid bond-length and -angle restraint values were derived from the CSD for use in protein structure refinement.40 The focus of this work was the refinement of protein backbones, but interest nowadays has turned to restraint values for the refinement of protein-bound ligands, a more difficult problem because of their chemical diversity. Mogul (section 2.1.3) is of major use as its libraries contain CSD-derived distributions of a huge variety of bond lengths, bond angles, and ring conformations. The program grade,46 for example, bases restraint values on Mogul where possible and resorts to quantum-mechanical calculations only when Mogul does not contain sufficient relevant data. Furthermore, the standard deviations of Mogul distributions can be used to determine how tight or loose restraints should be.48 Equally important is the use of Mogul for checking refined ligand geometries, which is one aspect of a current initiative by the crystallographic community aimed at the validation and, if necessary, rerefinement of published protein crystal structures.510−514 Ligand geometries are particularly important because the binding site of a protein−ligand complex is almost always a major point of interest. The significance of the difference between an observed bond length or angle (D) and the mean value of the corresponding Mogul distribution (d) can be measured by Z = (D − d)/σ, where σ is the standard deviation of the distribution. Hence, the conventional measure of the quality of a ligand geometry is RMSZ, the root-mean-square Z-value of the bond lengths and/or angles. Disappointingly, recent surveys of the PDB show that bond-length and -angle RMSZ values have shown little improvement over the years.511,515 Smart et al. observed that RMSZ tends to rise with ligand size. This, they argued, does not necessarily imply that large ligands are “worse” than small ones, because restraints are easier to satisfy for the latter with typical data resolution and electron density.516 On the other hand, it appears to us that gross errors in fitting a ligand to electron density are inevitably more likely when the ligand is large and flexible; there are simply more ways to get it wrong. The CSD is an unrivalled source of information on bond lengths to metals and the geometries of metal coordination spheres, and both of these can be useful in macromolecular structure determination.517−521 Moriarty and Adams used the CSD to set up bond-length and -angle restraints for Fe4S4 clusters and the exocyclic Fe−S bonds to cysteine side chains. Their model contained no right angles, in contrast to previous, idealized coordinates.522 A more unusual application of the database was to investigate the possibility of structural changes due to

Figure 31. Schematic depiction of the nested hexagonal packing of propeller-shaped triptycene molecules to form layers from which thin films can be derived. Adapted with permission from ref 505. Copyright 2015 American Association for the Advancement of Science.
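The Z-score and RMSZ measures defined in section 4.3.1 can be sketched in a few lines. The example is a minimal illustration under stated assumptions: the function names are hypothetical, the bond lengths, means, and standard deviations are invented, and no Mogul or CCDC software is called. In practice, the means and standard deviations would be taken from the corresponding Mogul distributions.

```python
import math


def z_scores(observed, means, sigmas):
    """Z = (D - d) / sigma for each observed geometric parameter."""
    if not (len(observed) == len(means) == len(sigmas)):
        raise ValueError("all three sequences must have the same length")
    if any(s <= 0 for s in sigmas):
        raise ValueError("standard deviations must be positive")
    return [(D - d) / s for D, d, s in zip(observed, means, sigmas)]


def rmsz(observed, means, sigmas):
    """Root-mean-square Z-value over a set of bond lengths and/or angles."""
    z = z_scores(observed, means, sigmas)
    if not z:
        raise ValueError("at least one parameter is required")
    return math.sqrt(sum(v * v for v in z) / len(z))


# Invented ligand bond lengths (angstroms) and reference distributions.
observed = [1.54, 1.36, 1.25]
means = [1.52, 1.34, 1.23]
sigmas = [0.02, 0.02, 0.01]

ligand_rmsz = rmsz(observed, means, sigmas)
print(f"RMSZ = {ligand_rmsz:.3f}")
```

Here the individual Z-values are approximately 1, 1, and 2, so the third bond contributes most to the overall RMSZ.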

Organic thin films can potentially be used in a range of optoelectronic and other devices, the construction of which may require molecular beam epitaxy to deposit the thin films onto a crystalline substrate. Unfortunately, not many suitable substrates are known. β-Alanine crystals are good because their cleavage along (010) planes presents clean surfaces with low roughness. With this in mind, Zolotarev et al. used ToposPro (section 2.3.2) to search the CSD for other structures of amino acids and their derivatives that might constitute suitable, cheap, and nontoxic substrates.506 They looked for nonsolvated structures with 2D H-bonded networks and anisotropic intermolecular-interaction energies. Several structures were found that they predicted would cleave easily. This was confirmed experimentally on 11 structures of amino-acid derivatives. They also discovered that the crystals could be deformed when pressure was applied to the cleavage face but would break if stressed in other directions.

4.2.10. Solar Thermal Fuels. To be suitable as a solar thermal fuel, a substance must be capable of absorbing photons to give rise to a long-lived metastable form that can be transported and, when required, exposed to a trigger (e.g., heat) to initiate relaxation to the ground state. That the CSD may have a role in this area was illustrated by Liu and Grossman.507 They searched the database for molecules containing groups that might photoisomerize and performed


perdeuteration. No significant changes were observed, although proteins “stiffen” in D2O.523

4.3.2. Structure Determination from Powder Diffraction Data. Over the last 20 years, the determination of crystal structures from powder diffraction data has become increasingly viable using laboratory equipment (i.e., without the need for a synchrotron source) and the number of “powder structures” in the CSD has risen sharply.524,525 It can be a very attractive option when it proves difficult or impossible to grow a crystal suitable for single-crystal diffraction. Methods for structure solution from powder data have been summarized by Shankland et al.526 Overlap of reflections and a rapid decline in scattering with increasing diffraction angle lead to relatively low information content in powder diffraction patterns. Application of conventional direct methods to structure solution is therefore very challenging. The alternative is to solve structures by creating a model of the asymmetric unit, the model comprising the position, orientation, and geometry (including conformation) of each symmetry-independent molecule or ion. A global search technique such as simulated annealing is then used to adjust the model until its calculated powder pattern has a good fit with the observed pattern. Rietveld refinement is then used to refine this initial solved structure so as to optimize the fit. Restraints are essential at this stage to maintain chemically sensible molecular dimensions. It will probably be obvious from this description that the CSD and Mogul can aid structure determination in three ways. First, they can provide sensible geometries for the starting model. For example, when Hughes et al. solved the structure of 3′,5′-bis-O-decanoyl-2′-deoxyguanosine, they built their model from CSD guanosine structures together with Mogul average bond lengths and angles for the C10 chains.527 Second, the CSD and Mogul can be used for determining refinement restraints. Third, they can be used to validate the final solution. Published examples are numerous, e.g., Ferreira et al.50 and Ghouili et al.528 A more sophisticated use of Mogul is to reduce the search space that must be covered during structure solution by biasing against, or prohibiting completely, torsion-angle settings that are uncommon in the CSD. This can sometimes backfire: Cole et al. reported that biasing against a minor peak in a Mogul distribution reduced the chance of solving the structure of verapamil hydrochloride, which happens to crystallize with a torsion angle belonging to that peak.525 In general, however, the use of conformational information from the CSD and Mogul considerably increases the chances of structure solution and extends the approach to larger and more flexible molecules. This was emphatically demonstrated in a study of 51 structures by Kabova et al.529 CSD structures can occasionally provide good starting points for refinement. For example, Fernandes et al. solved the structure of a cobalt complex from powder data by taking the CSD structure of a closely analogous copper complex, making appropriate metal and halogen replacements, and using the result as a starting point for Rietveld refinement. This gave a good fit, suggesting that the two complexes were isostructural. This was then confirmed by performing the structure solution in the normal way, but keeping the ligand geometries fixed at those observed in the copper complex.530 Conversely, before solving the structure of anhydrous rifampicin, Ibiapino et al. compared its powder pattern with those of solvated forms calculated from their CSD crystal structures. They thereby established that the new structure was dissimilar.531

5. LESSONS FROM THE PAST AND PROSPECTS FOR THE FUTURE

5.1. Limitations of CSD-Based Research

5.1.1. Chemical Content of the CSD. Although we believe the CSD to be unusually diverse, we must also acknowledge that there are biases in its content, imposed by chemical fashion, synthetic accessibility, and the ease of crystallization and structure solution. Moreover, while a million structures in the CSD represent a notable milestone, they are but a tiny sprinkling in the immensity of chemical space.532 Consequently, it is not uncommon for a CSD search to produce a disappointing number of hits. For example, Kuhn et al. complained about the paucity of heterocyclic systems linked by a single, acyclic bond;358 Główka commented that aromatic rings are usually substituted, making it hard to infer their inherent intermolecular contact preferences.216 Even a few hits can be useful, but they are unlikely to lead to robust conclusions. Furthermore, huge numbers of crystal structures are solved but never published. Direct-deposition schemes such as IUCrData533 and CSD Communications534 offer a relatively easy way for crystallographers to release their results and will hopefully reduce the tendency for structures to lie forgotten in private computer files.

5.1.2. Problems with Crystal Structures. The most common problem with small-molecule crystal structures is disorder, which is found in about a quarter of CSD entries. Usually, only a small part of the structure is affected, such as a solvate molecule or a small group or counterion (e.g., −CF3 and ClO4−). It is commonplace, however, for users searching the database to filter out all disordered structures, even if the fragment they are interested in is unaffected. This is for the entirely defensible reason that the main CSD search program, ConQuest, does not offer a more sophisticated disorder-filtering option. Furthermore, disorder is not always resolved, which can lead to the inclusion of erroneous geometries in retrieved samples, e.g., ostensibly flat tetrahydrofuran rings.
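The difference between entry-level and fragment-level disorder filtering can be shown with a short sketch. This is not ConQuest functionality and it does not use any CCDC API: the `Hit` record, its fields, and the sample data are hypothetical. A hit is kept whenever none of the atoms matched by the search fragment is flagged as disordered, rather than being discarded because disorder occurs somewhere in the entry.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Hit:
    """One substructure match in one database entry (hypothetical record)."""
    entry_id: str
    fragment_atoms: FrozenSet[str]    # atom labels matched by the search fragment
    disordered_atoms: FrozenSet[str]  # atom labels flagged as disordered in the entry


def fragment_is_ordered(hit: Hit) -> bool:
    """True if no atom of the matched fragment is disordered."""
    return hit.fragment_atoms.isdisjoint(hit.disordered_atoms)


fragment = frozenset({"C1", "C2", "O1"})
hits: List[Hit] = [
    Hit("entry_1", fragment, frozenset()),                    # fully ordered
    Hit("entry_2", fragment, frozenset({"F1", "F2", "F3"})),  # disordered CF3 elsewhere
    Hit("entry_3", fragment, frozenset({"C2"})),              # fragment itself disordered
]

# Entry-level filter: reject anything containing disorder.
kept_entry_level = [h.entry_id for h in hits if not h.disordered_atoms]

# Fragment-level filter: reject only hits whose matched atoms are disordered.
kept_fragment_level = [h.entry_id for h in hits if fragment_is_ordered(h)]

print(kept_entry_level)
print(kept_fragment_level)
```

In this made-up sample the fragment-level filter retains one additional usable hit. It cannot, of course, detect disorder that was never resolved by the original authors.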
Other problems include bond lengths shortened by libration, missing or erroneously positioned hydrogen atoms (which sometimes lead to incorrect tautomer assignments101,103), badly misplaced non-hydrogen atoms (which can appear in structures that nevertheless have good R-factors), and incorrect space-group assignments. Some of these problems can be worked around quite easily, e.g. by ignoring gross outliers, or by normalizing the lengths of covalent bonds involving hydrogen to average neutron-diffraction values. They all, however, present potential pitfalls for users lacking crystallographic knowledge.

5.1.3. Accuracy, Presentation, and Extent of CSD Data. The most common type of CSD search is for substructures, and this, of course, relies on the accuracy of the chemical structures assigned to CSD entries. Outright errors are unusual but by no means absent. A more common problem is that structures may be of defensible accuracy but not assigned in the way the user expects. This is particularly an issue for metal and metalloid complexes. The location of formal charges, for example, can be exceptionally difficult; sometimes they are placed almost arbitrarily, e.g. on Keggin structures. This is partly because of a deficiency in CSD


structure representation; only integral formal charges can be assigned, even when distributed partial charges are closer to reality. The CSD supports quadruple bonds but not quintuple (as in, e.g., WISTOJ535) or fractional (e.g., ROZROO536). Some chemical bonds are so long that it is a judgment call whether they are bonds at all. There are the usual issues with differing bond-type representations (e.g., aromatic versus Kekulé-style single and double). While the CSD has conventions for these, they are not always easy for users to anticipate. Some data items are either missing or difficult to use. For example, oxidation states are specified only in compound names, not in the assigned chemical connectivity, though efforts are underway to infer them from metal−ligand bond lengths by use of the bond valence sum method.537 Anisotropic displacement factors were entirely absent from the CSD until a couple of years ago, though they are now available for over 675 000 entries. Bajpai et al. noted that the CSD lacks experimental details about crystallization procedure.333 One such detail, crystallization solvent, is in fact recorded wherever possible, but only as a text item (although it can sometimes also be inferred by the inclusion of solvent in the crystal lattice). Searching for text items can be difficult because a variety of words may be used to describe the same or similar things (e.g., “ethanol”, “ethyl alcohol” and “EtOH”). Having said that, text fields often contain valuable information, such as metal spin states, biological sources, etc. It is quite possible that many users are unaware of this. The fundamental database format of the CSD has been changed recently, which will enable the rearrangement of many of these items into more sensible and accessible formats (e.g., oxidation states on the atom and numeric values like pressure captured as numbers, rather than text). An additional problem was pointed out by a referee of this Review. Some of the information supplied to CCDC for a new crystal structure may relate not to that structure but to a previously determined structure whose file was used as a template for the file of the new structure! The resulting errors may well be completely undetectable.

5.1.4. The Relevance of CSD Data. Extrapolating what is seen in small-molecule crystal structures to solution or in vivo situations is occasionally problematic. Elsewhere in this Review (sections 3.1 and 3.2), we noted three structural features that create difficulties: molecules situated on crystallographic inversion centers, dipole−dipole interactions across inversion centers, and high H-bond donor-to-acceptor ratios in protein structures. The good news is that such problems can often be worked around, and they are, as far as we know, few in number. Proving a negative is notoriously difficult. Nevertheless, there is enough collective experience for us to be confident that small-molecule crystal structures are usually a good guide to conformations and interactions in solution or at a protein binding site.

5.1.5. Problems Endemic to Data Analysis. Any type of data analysis can set traps for the unwary. Differences may be observed between two samples, but they may not be statistically significant. Even if a relationship is significant, it may not be causative, and if it is, the cause may be misinterpreted by the researcher. Another possibility is confirmation bias: researchers finding what they are looking for without seeing it in context. One of our favorite remarks on crystallographic databases dates back to 2005: “With the help of the 3 × 10⁵ organic and organometallic crystal structures in the Cambridge Structural Database...and the 2.5 × 10⁴ structures of biological macromolecules in the Protein Data Bank...one can find examples of just about any proposed intermolecular interaction involving peripheral atoms.”538 Of course, the numbers are much greater now, so it is even easier! Crystallization requires compromises (torsion angles may be nonideal, unfavorable contacts may occur), so the occasional observation of surprising features proves nothing. Above all, it is a question of attitude; the CSD should be used to test a hypothesis, not to prop it up.

5.2. The Keys to Success

A great thing about the physical sciences is that they will not allow a silly idea to survive indefinitely (unlike, for example, politics or economics). There will always be hard facts that can be unearthed to discredit errors in published papers, and someone will eventually find them. That is what makes our profession so satisfying but also so demanding; we have to be very careful not to make egregious errors that will lead to eventual embarrassment. Most scientists learn a hard lesson very quickly: even apparently inconsequential details of research methodology can matter and, if they are not right, lead to incorrect results and conclusions. Hence the overwhelming importance in science of accurately recording details and making it possible for others to find them. In the context of the CSD, much of this work is done by the editors of the database. Here is a small selection of what they do:

(a) Disentangle complex disorder situations. In addition, a severely disordered species may be refined without the use of explicit atom positions (the SQUEEZE procedure539), requiring editors to search the paper or associated files in an attempt to find out what the species is.

(b) Assign bond types and formal charges to each incoming structure, thereby enabling substructure searching of the CSD. A program is used to make the initial assignment and is often but not always correct.540 Circumstances that might trip it up include the presence of radicals, missing or misplaced hydrogen atoms, and the combination of redox-active ligands and metal atoms that can readily adopt more than one oxidation state. Deciding whether metal−metal bonds are present can be troublesome and may require an editorial judgment based on information (possibly conflicting) from the literature. Working out formal charges and hydrogen-atom assignments is often difficult, e.g. for MOFs, where the attention of the crystallographers is sometimes focused more or less entirely on the framework. MOFs based on cubane-type polynuclear clusters are particularly challenging as they contain a mixture of oxo and hydroxo bridges and the hydroxo protons are hardly ever located. Extremely large and complex structures can be challenging. For example, a structure might contain several intricate metal-containing residues that are at first sight chemically identical but cannot all be matched on each other by the CCDC’s graph-matching software. It is then necessary to work out whether and how they are different.

(c) Ensure that each entry has a chemical name. Again, software is an enormous help,541 but efforts are made to make the names as helpful and discoverable as possible. The nomenclature of new types of chemistry (e.g., graphenes) may require pause for thought.

(d) Communicate with authors when errors are found or suspected in structures. This is particularly important for CSD

DOI: 10.1021/acs.chemrev.9b00155 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

On that final point, crystallographers not only solve the structures; they also cooperate fully in the essential task of ensuring that the CSD is kept comprehensive. In addition, many crystallographers have made other crucial contributions. For example, the CCDC would have been unable to handle the increasing numbers of structures had it not been for the work of those who collaborated in the development of the CIF format for transferring crystal-structure results.545 Equally important was the implementation of CIF output in the main programs for crystal-structure refinement, especially the dominant SHELX system.546 Further, many errors have been found in the CSD by crystallographers outside CCDC. Two great examples are Dick Marsh and Ton Spek. The former published so many space-group corrections that his name became a verb. No crystallographer was pleased to be “marshed”, although, amusingly, Marsh once graciously admitted he had been marshed himself.547 Spek has improved the quality of structures appearing in the literature by his work on crystal-structure validation software548,549 and by pointing out some of the errors that nonetheless creep in. Particularly praiseworthy was his exposure of a number of fraudulent crystal structures.550 (Interestingly, CCDC has recently started to investigate new methods for detecting plagiarized or fraudulent data.551) In summary, the CSD has been a success because someone has had the specific accountability and funding for its maintenance, there has been from the start an emphasis on quality, and the publishing and crystallographic communities have wholeheartedly supported the endeavor. There are two other reasons that we have yet to mention: it was established early enough that the backlog of published structures could be properly handled, and crystal structures are a wonderfully rich source of information.

Communications, where the editors effectively play a peer-review role.

(e) Correct old entries that have been found to contain errors, or improve them in other ways (e.g., adding better chemical diagrams, or additional searchable items such as bioactivity, sources of natural products, etc.).

In short, the editors do a job that is unglamorous and low profile, but users of the CSD owe them a great deal. An example problematic structure is shown in Figure 32. Of

Figure 32. An example structure needing significant editorial attention: (a) there is disorder in three places, some of it imposed by symmetry; (b) one of the atoms was assigned an incorrect element type in the input file; (c) charge assignment was nontrivial (the main cation is 9+); (d) there was difficulty confirming that the assigned chemical structure was correct; (e) solvent molecules were fitted by SQUEEZE so their identities had to be determined by reference to the accompanying paper; (f) any 2D chemical diagram would have been indecipherable, so the structure had to be represented schematically. If you cannot understand the structure, that is exactly the position of the editor who first views it; only (s)he has to sort it out!
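The charge bookkeeping behind item (c) of the caption can be illustrated with a very small consistency check: the formal charges assigned to all species in a formula unit should sum to zero. The sketch below is only an illustration under stated assumptions; the `Residue` record, the function names, and the counter-ion counts are hypothetical and do not describe the CCDC's actual editorial software or the real composition of the structure in Figure 32.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    """One chemically distinct species in the crystal (hypothetical record type)."""
    name: str
    formal_charge: int  # net formal charge of a single copy of the species
    count: int          # number of copies per formula unit


def net_charge(residues):
    """Return the summed formal charge over the whole formula unit."""
    return sum(r.formal_charge * r.count for r in residues)


def is_charge_balanced(residues):
    """A chemically sensible assignment should leave no net charge."""
    return net_charge(residues) == 0


# Hypothetical assignment: a 9+ cation with only eight monoanions located.
entry = [
    Residue("complex cation", +9, 1),
    Residue("monoanion", -1, 8),
]

if not is_charge_balanced(entry):
    # A nonzero remainder flags the entry for closer inspection, e.g. a
    # counter-ion hidden in a SQUEEZEd region or a missed hydrogen atom.
    print(f"Unbalanced formula unit, residual charge {net_charge(entry):+d}")
```

A residual charge does not say where the problem lies; it only tells the editor that something in the assignment (a missing ion, a wrong oxidation state, an unlocated proton) still has to be resolved.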

5.3. Future Challenges

5.3.1. The Advent of Big Data. The CSD was once one of the few; now it is one of the many: there are scientific databases all over the place. Furthermore, collecting data used to be an unfashionable and boring activity but has now become exciting and à la mode. Online information can be gathered together by spiders and crowdsourcing and web indexed; an obvious example is ChemSpider.552 Add in artificial intelligence and cognitive augmentation, with neural nets combined together into deep-learning algorithms, and big data is upon us. There seems no limit to what it can do. If it is any comfort, the human brain has 10¹¹ neurons and 10¹⁴ synapses, so we might still have a role to play.

Having become a small fish in a big pond (well, “medium-sized fish” is probably a more apposite metaphor), the first requirement for the CSD of the future is that it should be accessible to client applications as well as to humans. Its value will be fully realized only if it can be linked to other data compilations. CCDC has taken the critical step that will enable this goal to be achieved by releasing the CSD Python API (section 4.1.4). It is likely to be directly used (as opposed to indirectly) by only a minority of the CSD user community but nonetheless is of primary strategic importance.

5.3.2. The Continued Evolution of Crystallography. We said in the Introduction that a small-molecule crystal structure can now be determined in a few hours. This is by using standard equipment. Advanced instrumentation with new hybrid pixel detectors enables data to be collected in a few minutes. Furthermore, a new software feature (Rigaku’s “What

course, more and more of the editors’ work has benefited from increasingly powerful software. On the other hand, the structures get more complex and aspirations for database content and functionality continually rise. It is a case of running to stand still. Also important is the provision of software for searching and analyzing the CSD. We will not dwell on this as the relevant programs are described elsewhere.43,376,408,445,542,543 However, we wish to emphasize that software changes have occurred frequently over the years in response to new user requirements and computing environments. Associated changes in database content have also occurred, such as the addition of Digital Object Identifiers (DOIs) to CSD deposited data sets and links from CSD data sets to original publications. This essential task of ensuring that the database and its software are of high quality (a trusted data source) is likely to be achieved only if there are motivated people who have that specific accountability, who have reliable, long-term funding,544 and who can rely on the support and involvement of scientific publishers and the crystallographic community.
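To make the value of DOI links concrete, here is a minimal sketch of how a persistent identifier lets records from two independent compilations be joined. Everything in it is an assumption made for illustration: the record layout, the placeholder refcodes and DOIs, and the `melting_point_K` field are invented, and the code does not use or describe the CSD software itself. The only fact relied on is that DOI names are case-insensitive and are often written with a resolver prefix.

```python
def normalise_doi(doi: str) -> str:
    """Lower-case a DOI and strip a common resolver or scheme prefix."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def link_by_doi(structure_records, external_records):
    """Join two lists of dict records on their (normalised) DOI."""
    external_by_doi = {normalise_doi(r["doi"]): r for r in external_records}
    linked = []
    for rec in structure_records:
        key = normalise_doi(rec["doi"])
        match = external_by_doi.get(key)
        if match is not None:
            # Fields from the structure record take precedence on a clash.
            linked.append({**match, **rec, "doi": key})
    return linked


# Placeholder data: these refcodes, DOIs, and property values are invented.
structures = [
    {"refcode": "XXXXXX01", "doi": "10.0000/Example.1"},
    {"refcode": "XXXXXX02", "doi": "10.0000/example.2"},
]
properties = [
    {"doi": "https://doi.org/10.0000/EXAMPLE.1", "melting_point_K": 400.0},
]

for row in link_by_doi(structures, properties):
    print(row["refcode"], row["doi"], row["melting_point_K"])
```

Normalising before matching matters in practice because the same DOI is often written with different capitalisation or with a resolver URL in front of it.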


is this?”553) will try to solve the structure as the data collection proceeds, enabling a basic atomic coordinate set to be obtained in less than 2 min. Add to this the increasing number of 3D structures solved by powder diffraction and the continued development of techniques like NMR crystallography, microelectron diffraction, and cryo-electron microscopy, and the mind boggles at how many structures appropriate for inclusion in the CSD will be produced annually in the years to come. Also, crystal structure prediction may become sufficiently reliable that at least some of its results will be deemed suitable for inclusion in the CSD (albeit flagged as “theoretical”).554−556 We may assume that the exponential rise in crystallographic output must eventually flatten off, but there is no reason to expect it any time soon. It is nevertheless important that the current focus on CSD data quality is maintained. It might be thought that, with so many structures, a pool of “broken structures” (incorrect chemistry assignments, etc.) can be tolerated. The problem is that there will still be many searches that produce few hits; multiply the number of structures in the CSD a hundred-fold and they will still represent a small fraction of chemical space. Further, using the CSD in big-data analyses, possibly with sophisticated machine-learning techniques, is likely to bury the effects of database errors so deeply that they will become impossible to detect.

5.3.3. The Continued Evolution of Chemistry. A proportion of the CSD input comprises very complex molecules, frequently large, often polymeric, and sometimes with unusual bonding or exotic topologies (e.g., Figure 33).
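The data-quality argument of section 5.3.2 (that searches returning few hits are the ones most exposed to a "broken" entry) can be illustrated with simple arithmetic. The sketch below is an assumption-laden toy model rather than an analysis of real CSD data: it supposes that every correct hit reports the same geometric parameter and that exactly one hit carries a wrong value, in which case the mean shifts by (bad − true)/n. The function name and the numerical values are invented for illustration.

```python
from statistics import mean


def mean_shift_from_one_error(n_hits, true_value, bad_value):
    """Shift in the sample mean when one of n_hits otherwise identical
    values is replaced by an erroneous one (toy model)."""
    if n_hits < 1:
        raise ValueError("n_hits must be at least 1")
    values = [true_value] * (n_hits - 1) + [bad_value]
    return mean(values) - true_value


# Hypothetical bond length of 1.50 angstrom, with one entry wrongly giving 1.20.
for n_hits in (5, 50, 5000):
    shift = mean_shift_from_one_error(n_hits, true_value=1.50, bad_value=1.20)
    print(f"{n_hits:>5} hits: mean shifts by {shift:+.5f} angstrom")
```

Under these assumptions the distortion falls off as 1/n, so a single bad entry that is invisible in a search with thousands of hits can dominate the result of a search with only a handful.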

must remember that meeting them will greatly increase the value of the CSD. Einstein, as usual, hit the nail on the head: in the middle of difficulty lies opportunity.

6. CONCLUDING REMARKS

Like Escher’s Stone Age man, we are amazed by crystals: not by their sparkling regularity, which we understand, but by the infinite variety of their contents and the knowledge that can be accrued by collecting together and analyzing the details of these contents. Our Review has surely shown that the popularity and success of this technique owes much to the sagacity of the founders of the CSD. And to others, too: the oldest structure in the CSD was published in 1924, so the database is the product of almost a hundred years of effort involving hundreds of thousands of scientists. It never does to be complacent, but the extended crystallographic community can pat itself on the back. Are we bright enough to see into the future? To some extent, yes: the major research applications of the CSD described in sections 2 and 3 show no signs of diminishing; its uses in drug development (section 4.1) will surely increase, and at least a proportion of the speculative applications discussed in section 4.2 should grow in importance. However, we do not have the vision to see with clarity the bigger picture. How will the CSD be used in combination with new and complementary databases, on emerging areas of materials science and nanotechnology, and with results from cutting-edge techniques of structural analysis? We can only wait and see but are confident that the CSD will inexorably grow from strength to strength.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.chemrev.9b00155. Table of CSD refcodes with publication references and diagrams (PDF)

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

ORCID

Figure 33. Example of an esoteric 12+ cation taken from the CSD structure OFATAT. The ligand is a complex molecular knot, having a continuous 324-membered ring with nine alternating crossings; the compound’s chemical name requires 471 characters. The ring includes 18 2,2′-bipyridyl groups; three are coordinated to each of the six Fe2+ ions. The cation lies on a 3-fold rotation axis in the space group R3̅.

Robin Taylor: 0000-0002-0391-2609
Peter A. Wood: 0000-0002-5239-2160

Notes

The authors declare no competing financial interest.

Biographies

Representing these chemistries in an accurate, searchable form is already challenging and will only become more so. There is perhaps a perception that the molecules in the CSD are simple, small, and much easier for database builders to deal with than biological macromolecules. The ingenuity of synthetic chemists has changed that. The solutions to the problems outlined in this and the previous subsections must primarily lie in improvements to the software infrastructure around the CSD, for both building and searching the database. This will be a difficult undertaking. Nevertheless, it is appropriate for us to introduce a positive note. We have talked about the challenges of the future but

Robin Taylor obtained a BA in chemistry and a Ph.D. in crystallography from Oxford and Cambridge Universities, respectively. His career was split between academia, industry, a London hospital, self-employment, and, primarily, the not-for-profit company CCDC. He did research into crystallography, structural chemistry, molecular recognition, chemical software, and computer-aided agrochemical design, with a soupçon of medical statistics thrown in. He frequently reflects on his immense good fortune in earning his living doing something that now, in his retirement, he does for fun.

Peter Wood is a Senior Scientist and the Product Manager for CSD-System and CSD-Materials at the Cambridge Crystallographic Data


(10) Schneider, H. J. Limitations and Extensions of the Lock-and-Key Principle: Differences between Gas State, Solution and Solid State Structures. Int. J. Mol. Sci. 2015, 16, 6694−6717.
(11) Bondi, A. Van der Waals Volumes and Radii. J. Phys. Chem. 1964, 68, 441−451.
(12) Alvarez, S. A Cartography of the van der Waals Territories. Dalton Trans. 2013, 42, 8617−8636.
(13) Rowland, R. S.; Taylor, R. Intermolecular Nonbonded Contact Distances in Organic Crystal Structures: Comparison with Distances Expected from van der Waals Radii. J. Phys. Chem. 1996, 100, 7384−7391.
(14) Nyburg, S. C.; Faerman, C. H. A Revision of van der Waals Atomic Radii for Molecular Crystals: N, O, F, S, Cl, Se, Br and I Bonded to Carbon. Acta Crystallogr., Sect. B: Struct. Sci. 1985, B41, 274−279.
(15) Eramian, H.; Tian, Y.-H.; Fox, Z.; Beneberu, H. Z.; Kertesz, M. On the Anisotropy of van der Waals Atomic Radii of O, S, Se, F, Cl, Br, and I. J. Phys. Chem. A 2013, 117, 14184−14190.
(16) Hu, S. Z.; Zhou, Z. H.; Xie, Z. X.; Robertson, B. E. A Comparative Study of Crystallographic van der Waals Radii. Z. Kristallogr. - Cryst. Mater. 2014, 229, 517−523.
(17) Taylor, R. Short Nonbonded Contact Distances in Organic Molecules and Their Use as Atom-Clash Criteria in Conformer Validation and Searching. J. Chem. Inf. Model. 2011, 51, 897−908.
(18) Wood, P. A.; McKinnon, J. J.; Parsons, S.; Pidcock, E.; Spackman, M. A. Analysis of the Compression of Molecular Crystal Structures Using Hirshfeld Surfaces. CrystEngComm 2008, 10, 368−376.
(19) Cordero, B.; Gómez, V.; Platero-Prats, A. E.; Revés, M.; Echeverría, J.; Cremades, E.; Barragán, F.; Alvarez, S. Covalent Radii Revisited. Dalton Trans. 2008, 2832−2838.
(20) Khorasani, S.; Fernandes, M. A.; Perry, C. B. Do 12-Membered Cycloalkane Rings Only Exist as One Conformation in the Solid-State? A Detailed Solid-State Analysis Involving Polymorphs of N,N′-Biscyclododecyl Pyromellitic Diimide. Cryst. Growth Des. 2012, 12, 5908−5916.
(21) Pérez, J.; García, L.; Carrascosa, R.; Pérez, E.; Serrano, J. L. Solid State Conformational Preferences in Transition Metal Complexes Double Bridged by Phosphate and Related Ligands. Polyhedron 2008, 27, 2487−2493.
(22) Claramunt, R. M.; Alkorta, I.; Elguero, J. A Theoretical Study of the Conformation and Dynamic Properties of 1,5-Benzodiazepines and Their Derivatives. Comput. Theor. Chem. 2013, 1019, 108−115.
(23) Majerz, I.; Dziembowska, T. Aromaticity of Benzene Derivatives: An Exploration of the Cambridge Structural Database. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2018, 74, 148−151.
(24) Takahashi, O.; Yamasaki, K.; Kohno, Y.; Kurihara, Y.; Ueda, K.; Umezawa, Y.; Suezawa, H.; Nishio, M. The Conformation of Alkyl Cyclohexanones and Terpenic Ketones. Interpretation for the “Alkylketone Effect” Based on the CH/π(C=O) Hydrogen Bond. Tetrahedron 2008, 64, 2433−2440.
(25) Takahashi, O.; Kohno, Y.; Nishio, M. Relevance of Weak Hydrogen Bonds in the Conformation of Organic Compounds and Bioconjugates: Evidence from Recent Experimental Data and High-Level Ab Initio MO Calculations. Chem. Rev. 2010, 110, 6049−6076.
(26) Reid, R. C.; Yau, M. K.; Singh, R.; Lim, J.; Fairlie, D. P. Stereoelectronic Effects Dictate Molecular Conformation and Biological Function of Heterocyclic Amides. J. Am. Chem. Soc. 2014, 136, 11914−11917.
(27) Galek, P. T. A.; Fábián, L.; Allen, F. H. Universal Prediction of Intramolecular Hydrogen Bonds in Organic Crystals. Acta Crystallogr., Sect. B: Struct. Sci. 2010, 66, 237−252.
(28) Kuhn, B.; Mohr, P.; Stahl, M. Intramolecular Hydrogen Bonding in Medicinal Chemistry. J. Med. Chem. 2010, 53, 2601−2611.
(29) Cruz-Cabeza, A. J.; Allen, F. H. Conformation and Geometry of Cyclopropane Rings Having π-Acceptor Substituents: A Theoretical

Centre, UK. He graduated from the University of Edinburgh with a Masters in Chemical Physics in 2004. He then stayed in Edinburgh to do a Ph.D. in X-ray Crystallography under the supervision of Professor Simon Parsons and Dr. Elna Pidcock (CCDC), studying the effect of high pressure on the topological properties of molecular crystal structures using multiple structural analysis techniques. Peter then joined the CCDC in 2007 as a Research and Applications Scientist, continuing to focus on the connection between database studies, experimental results, and computational calculations. In 2011, he was awarded the BCA/CCG Chemical Crystallography Prize for Younger Scientists for his early career research contributions in this field. Over the past decade, Peter has guided the development of key scientific software components in this area, including Mercury, ConQuest, WebCSD, Mogul, IsoStar, DASH, and the CSD Python API. His research interests center around the scientific applications of the structural knowledge stored in the Cambridge Structural Database, particularly in structural chemistry, structure−property relationships, crystal engineering, and drug development and formulation.

ACKNOWLEDGMENTS

We are very grateful to Stephen Holgate for his extensive guidance on editorial procedures at CCDC. We also extend our thanks to Simon Coles for updating us on the capabilities of contemporary structure-solution equipment and software; Jason Cole for making insightful comments on the manuscript; Santiago Alvarez for supplying us with Figure 1; and Andrew Bond, Carol Brock, Andrew Maloney, and Patrick McCabe for help with other figures. One of the referees is thanked for suggesting an improved caption for Figure 33, which we have used verbatim. R.T. thanks CCDC for their kind award of an Emeritus Research Fellowship.

REFERENCES

(1) Groom, C. R.; Bruno, I. J.; Lightfoot, M. P.; Ward, S. C. The Cambridge Structural Database. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2016, 72, 171−179.
(2) Jeffrey, G. A. Molecular Structures and Dimensions: Guide to the Literature, 1935−76: Organic and Organometallic Crystal Structures Edited by O. Kennard, F. H. Allen and D. G. Watson. Acta Crystallogr., Sect. B: Struct. Crystallogr. Cryst. Chem. 1978, 34, 3847.
(3) Kennard, O. From Private Data to Public Knowledge. In The impact of electronic publishing on the academic community; Butterworth, I., Ed.; Portland Press: London, 1997; pp 159−166.
(4) Moore’s law: the number of transistors on a microchip doubles every two years while the cost of computers is halved; in other words, computers ineluctably become more powerful.
(5) We are unable to comment on the accuracy of this claim, which apparently refers to a story Twain wrote called “From the ‘London Times’ of 1904”.
(6) Allen, F. H.; Motherwell, W. D. S. Applications of the Cambridge Structural Database in Organic Chemistry and Crystal Chemistry. Acta Crystallogr., Sect. B: Struct. Sci. 2002, 58, 407−422.
(7) Orpen, A. G. Applications of the Cambridge Structural Database to Molecular Inorganic Chemistry. Acta Crystallogr., Sect. B: Struct. Sci. 2002, 58, 398−406.
(8) Taylor, R. Life-Science Applications of the Cambridge Structural Database. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2002, 58, 879−888.
(9) Bauer, J.; Spanton, S.; Henry, R.; Quick, J.; Dziki, W.; Porter, W.; Morris, J. Ritonavir: An Extraordinary Example of Conformational Polymorphism. Pharm. Res. 2001, 18, 859−866.


and Database Study. Acta Crystallogr., Sect. B: Struct. Sci. 2011, 67, 94−102.
(30) Cruz-Cabeza, A. J.; Allen, F. H. Geometry and Conformation of Cyclopropane Derivatives Having σ-Acceptor and σ-Donor Substituents: A Theoretical and Crystal Structure Database Study. Acta Crystallogr., Sect. B: Struct. Sci. 2012, 68, 182−188.
(31) Berman, H.; Henrick, K.; Nakamura, H. Announcing the Worldwide Protein Data Bank. Nat. Struct. Mol. Biol. 2003, 10, 980.
(32) Raghavender, U. S. Analysis of Residue Conformations in Peptides in Cambridge Structural Database and Protein-Peptide Structural Complexes. Chem. Biol. Drug Des. 2017, 89, 428−442.
(33) Murray-Rust, P.; Motherwell, S. Computer Retrieval and Analysis of Molecular Geometry. III. Geometry of the β-1′-Aminofuranoside Fragment. Acta Crystallogr., Sect. B: Struct. Crystallogr. Cryst. Chem. 1978, 34, 2534−2546.
(34) Taylor, R. The Cambridge Structural Database in Molecular Graphics: Techniques for the Rapid Identification of Conformational Minima. J. Mol. Graphics 1986, 4, 123−131.
(35) Parkin, A. Uses of the dSNAP Cluster Analysis Software for Studying Geometric Information Extracted from the Cambridge Structural Database. Crystallogr. Rev. 2008, 14, 117−141.
(36) Parkin, A.; Collins, A.; Gilmore, C. J.; Wilson, C. C. Using Small Molecule Crystal Structure Data to Obtain Information about Sulfonamide Conformation. Acta Crystallogr., Sect. B: Struct. Sci. 2008, 64, 66−71.
(37) Bürgi, H. B.; Dunitz, J. D. Can Statistical Analysis of Structural Parameters from Different Crystal Environments Lead to Quantitative Energy Relationships? Acta Crystallogr., Sect. B: Struct. Sci. 1988, 44, 445−448.
(38) Allen, F. H.; Kennard, O.; Watson, D. G.; Brammer, L.; Orpen, A. G.; Taylor, R. Tables of Bond Lengths Determined by X-Ray and Neutron Diffraction. Part 1. Bond Lengths in Organic Compounds. J. Chem. Soc., Perkin Trans. 2 1987, S1−S19.
(39) Orpen, A. G.; Brammer, L.; Allen, F. H.; Kennard, O.; Watson, D. G.; Taylor, R. Tables of Bond Lengths Determined by X-Ray and Neutron Diffraction. Part 2. Organometallic Compounds and Coordination Complexes of the d- and f-Block Metals. J. Chem. Soc., Dalton Trans. 1989, S1−S83.
(40) Engh, R. A.; Huber, R. Accurate Bond and Angle Parameters for X-ray Protein Structure Refinement. Acta Crystallogr., Sect. A: Found. Crystallogr. 1991, 47, 392−400.
(41) Allen, F. H.; Bruno, I. J. Bond Lengths in Organic and Metal-Organic Compounds Revisited: X-H Bond Lengths from Neutron Diffraction Data. Acta Crystallogr., Sect. B: Struct. Sci. 2010, 66, 380−386.
(42) Arnautova, Y. A.; Abagyan, R.; Totrov, M. All-Atom Internal Coordinate Mechanics (ICM) Force Field for Hexopyranoses and Glycoproteins. J. Chem. Theory Comput. 2015, 11, 2167−2186.
(43) Bruno, I. J.; Cole, J. C.; Kessler, M.; Luo, J.; Motherwell, W. D. S.; Purkis, L. H.; Smith, B. R.; Taylor, R.; Cooper, R. I.; Harris, S. E.; et al. Retrieval of Crystallographically-Derived Molecular Geometry Information. J. Chem. Inf. Comput. Sci. 2004, 44, 2133−2144.
(44) Cottrell, S. J.; Olsson, T. S. G.; Taylor, R.; Cole, J. C.; Liebeschuetz, J. W. Validating and Understanding Ring Conformations Using Small Molecule Crystallographic Data. J. Chem. Inf. Model. 2012, 52, 956−962.
(45) Agirre, J. Strategies for Carbohydrate Model Building, Refinement and Validation. Acta Crystallogr., Sect. D: Struct. Biol. 2017, 73, 171−186.
(46) Smart, O. S.; Holstein, J.; Womack, T. Grade Documentation, Version 1.2.9. https://www.globalphasing.com/buster/manual/grade/manual/index.html (accessed Dec 10, 2018).
(47) Sen, S.; Young, J.; Berrisford, J. M.; Chen, M.; Conroy, M. J.; Dutta, S.; Di Costanzo, L.; Gao, G.; Ghosh, S.; Hudson, B. P. Small Molecule Annotation for the Protein Data Bank. Database 2014, 2014, bau006.
(48) Steiner, R. A.; Tucker, J. A. Keep It Together: Restraints in Crystallographic Refinement of Macromolecule-Ligand Complexes. Acta Crystallogr., Sect. D: Struct. Biol. 2017, 73, 93−102.

(49) Florence, A. J.; Bardin, J.; Johnston, B.; Shankland, N.; Griffin, T. A. N.; Shankland, K. Structure Determination from Powder Data: Mogul and CASTEP. Z. Krist. Suppl. 2009, 2009, 215−220.
(50) Ferreira, F. F.; Trindade, A. C.; Antonio, S. G.; de Oliveira Paiva-Santos, C. Crystal Structure of Propylthiouracil Determined Using High-Resolution Synchrotron X-Ray Powder Diffraction. CrystEngComm 2011, 13, 5474−5479.
(51) Safina, B. S.; Elliott, R. L.; Forrest, A. K.; Heald, R. A.; Murray, J. M.; Nonomiya, J.; Pang, J.; Salphati, L.; Seward, E. M.; Staben, S. T.; et al. Design of Selective Benzoxazepin PI3Kδ Inhibitors Through Control of Dihedral Angles. ACS Med. Chem. Lett. 2017, 8, 936−940.
(52) Iuzzolino, L.; Reilly, A. M.; McCabe, P.; Price, S. L. Use of Crystal Structure Informatics for Defining the Conformational Space Needed for Predicting Crystal Structures of Pharmaceutical Molecules. J. Chem. Theory Comput. 2017, 13, 5163−5171.
(53) Bax, B.; Chung, C. W.; Edge, C. Getting the Chemistry Right: Protonation, Tautomers and the Importance of H Atoms in Biological Chemistry. Acta Crystallogr., Sect. D: Struct. Biol. 2017, 73, 131−140.
(54) McCabe, P.; Korb, O.; Cole, J. Kernel Density Estimation Applied to Bond Length, Bond Angle, and Torsion Angle Distributions. J. Chem. Inf. Model. 2014, 54, 1284−1288.
(55) Cole, J. C.; Groom, C. R.; Korb, O.; McCabe, P.; Shields, G. P. Knowledge-Based Optimization of Molecular Geometries Using Crystal Structures. J. Chem. Inf. Model. 2016, 56, 652−661.
(56) Taylor, R.; Cole, J.; Korb, O.; McCabe, P. Knowledge-Based Libraries for Predicting the Geometric Preferences of Druglike Molecules. J. Chem. Inf. Model. 2014, 54, 2500−2514.
(57) Cole, J. C.; Korb, O.; McCabe, P.; Read, M. G.; Taylor, R. Knowledge-Based Conformer Generation Using the Cambridge Structural Database. J. Chem. Inf. Model. 2018, 58, 615−629.
(58) Schärfer, C.; Schulz-Gasch, T.; Ehrlich, H. C.; Guba, W.; Rarey, M.; Stahl, M. Torsion Angle Preferences in Druglike Chemical Space: A Comprehensive Guide. J. Med. Chem. 2013, 56, 2016−2028.
(59) Guba, W.; Meyder, A.; Rarey, M.; Hert, J. Torsion Library Reloaded: A New Version of Expert-Derived SMARTS Rules for Assessing Conformations of Small Molecules. J. Chem. Inf. Model. 2016, 56, 1−5.
(60) Sadowski, J.; Boström, J. MIMUMBA Revisited: Torsion Angle Rules for Conformer Generation Derived from X-Ray Structures. J. Chem. Inf. Model. 2006, 46, 2305−2309.
(61) Kothiwale, S.; Mendenhall, J. L.; Meiler, J. BCL::CONF: Small Molecule Conformational Sampling Using a Knowledge Based Rotamer Library. J. Cheminf. 2015, 7, 47.
(62) Weng, Z. F.; Motherwell, W. D. S.; Allen, F. H.; Cole, J. M. Conformational Variability of Molecules in Different Crystal Environments: A Database Study. Acta Crystallogr., Sect. B: Struct. Sci. 2008, 64, 348−362.
(63) Thompson, H. P. G.; Day, G. M. Which Conformations Make Stable Crystal Structures? Mapping Crystalline Molecular Geometries to the Conformational Energy Landscape. Chem. Sci. 2014, 5, 3173−3182.
(64) Cruz-Cabeza, A. J.; Liebeschuetz, J. W.; Allen, F. H. Systematic Conformational Bias in Small-Molecule Crystal Structures Is Rare and Explicable. CrystEngComm 2012, 14, 6797−6811.
(65) Brock, C. P.; Minton, R. P. Systematic Effects of Crystal-Packing Forces: Biphenyl Fragments with Hydrogen Atoms in All Four Ortho Positions. J. Am. Chem. Soc. 1989, 111, 4586−4593.
(66) Pascal, R. A., Jr.; Wang, C. M.; Wang, G. C.; Koplitz, L. V. Ideal Molecular Conformation versus Crystal Site Symmetry. Cryst. Growth Des. 2012, 12, 4367−4376.
(67) Back, K. R.; Davey, R. J.; Grecu, T.; Hunter, C. A.; Taylor, L. S. Molecular Conformation and Crystallization: The Case of Ethenzamide. Cryst. Growth Des. 2012, 12, 6110−6117.
(68) Wriedt, M.; Näther, C. Synthesis, Crystal Structure, Thermal, Spectroscopic and Magnetic Properties of a 2D Grid-like Copper(II) μ-1,1 Isocyanato Coordination Polymer with Pyrazine Bridges. Z. Anorg. Allg. Chem. 2009, 635, 1115−1122.


(69) Dudev, M.; Wang, J.; Dudev, T.; Lim, C. Factors Governing the Metal Coordination Number in Metal Complexes from Cambridge Structural Database Analyses. J. Phys. Chem. B 2006, 110, 1889−1895.
(70) Tunell, I.; Lim, C. Factors Governing the Metal Coordination Number in Isolated Group IA and IIA Metal Hydrates. Inorg. Chem. 2006, 45, 4811−4819.
(71) Kuppuraj, G.; Dudev, M.; Lim, C. Factors Governing Metal-Ligand Distances and Coordination Geometries of Metal Complexes. J. Phys. Chem. B 2009, 113, 2952−2960.
(72) See, R. F.; Kozina, D. Quantification of the Trans Influence in d8 Square Planar and d6 Octahedral Complexes: A Database Study. J. Coord. Chem. 2013, 66, 490−500.
(73) Nimmermark, A.; Öhrström, L.; Reedijk, J. Metal-Ligand Bond Lengths and Strengths: Are They Correlated? A Detailed CSD Analysis. Z. Kristallogr. - Cryst. Mater. 2013, 228, 311−317.
(74) Holland, P. L. Metal-Dioxygen and Metal-Dinitrogen Complexes: Where Are the Electrons? Dalton Trans. 2010, 39, 5415−5425.
(75) Alvarez, S.; Alemany, P.; Casanova, D.; Cirera, J.; Llunell, M.; Avnir, D. Shape Maps and Polyhedral Interconversion Paths in Transition Metal Chemistry. Coord. Chem. Rev. 2005, 249, 1693−1708.
(76) Ruiz-Martínez, A.; Casanova, D.; Alvarez, S. Polyhedral Structures with an Odd Number of Vertices: Nine-Coordinate Metal Compounds. Chem. - Eur. J. 2008, 14, 1291−1303.
(77) Echeverría, J.; Cremades, E.; Amoroso, A. J.; Alvarez, S. Jahn−Teller Distortions of Six-Coordinate CuII Compounds: Cis or Trans? Chem. Commun. 2009, 4242−4244.
(78) Ruiz-Martínez, A.; Alvarez, S. Stereochemistry of Compounds with Coordination Number Ten. Chem. - Eur. J. 2009, 15, 7470−7480.
(79) Alvarez, S.; Menjón, B.; Falceto, A.; Casanova, D.; Alemany, P. Stereochemistry of Complexes with Double and Triple Metal-Ligand Bonds: A Continuous Shape Measures Analysis. Inorg. Chem. 2014, 53, 12151−12163.
(80) Alvarez, S. Distortion Pathways of Transition Metal Coordination Polyhedra Induced by Chelating Topology. Chem. Rev. 2015, 115, 13447−13483.
(81) Zabrodsky, H.; Peleg, S.; Avnir, D. Continuous Symmetry Measures. 2. Symmetry Groups and the Tetrahedron. J. Am. Chem. Soc. 1993, 115, 8278−8289.
(82) Davis, T. L.; Watts, J. L.; Brown, K. J.; Hewage, J. S.; Treleven, A. R.; Lindeman, S. V.; Gardinier, J. R. Structural Classification of Metal Complexes with Three-Coordinate Centres. Dalton Trans. 2015, 44, 15408−15412.
(83) Shatruk, M.; Phan, H.; Chrisostomo, B. A.; Suleimenova, A. Symmetry-Breaking Structural Phase Transitions in Spin Crossover Complexes. Coord. Chem. Rev. 2015, 289, 62−73.
(84) Niksch, T.; Görls, H.; Weigand, W. The Extension of the Solid-Angle Concept to Bidentate Ligands. Eur. J. Inorg. Chem. 2010, 2010, 95−105.
(85) Nesterov, D. S.; Kokozay, V. N.; Skelton, B. W. A Pentanuclear Cu/Co/Ni Complex with 2-(Dimethylamino)ethanol - Observation of a Rare Molecular Structure Type and Its Place in General Structural Types: An Analysis of the Cambridge Structural Database. Eur. J. Inorg. Chem. 2009, 2009, 5469−5473.
(86) Sarma, J. A. R. P.; Nangia, A.; Desiraju, G. R.; Zass, E.; Dunitz, J. D. Even-Odd Carbon Atom Disparity. Nature 1996, 384, 320.
(87) Radhakrishnan, T. P. Is the Dominance of Even Carbon Atom Molecules Odd? J. Chem. Inf. Comput. Sci. 2000, 40, 40−43.
(88) Novikov, A. S. Strong Metallophilic Interactions in Nickel Coordination Compounds. Inorg. Chim. Acta 2018, 483, 21−25.
(89) Isele, K.; Gigon, F.; Williams, A. F.; Bernardinelli, G.; Franz, P.; Decurtins, S. Synthesis, Structure and Properties of {M4O4} Cubanes Containing Nickel(II) and Cobalt(II). Dalton Trans. 2007, 332−341.
(90) Parmelee, S. R.; Mankad, N. P. A Data-Intensive Re-evaluation of Semibridging Carbonyl Ligands. Dalton Trans. 2015, 44, 17007−17014.

(91) Venegas-Yazigi, D.; Aravena, D.; Spodine, E.; Ruiz, E.; Alvarez, S. Structural and Electronic Effects on the Exchange Interactions in Dinuclear Bis(phenoxo)-Bridged Copper(II) Complexes. Coord. Chem. Rev. 2010, 254, 2086−2095. (92) Simas, A. M.; Freire, R. O.; Rocha, G. B. Lanthanide Coordination Compounds Modeling: Sparkle/PM3 Parameters for Dysprosium (III), Holmium (III) and Erbium (III). J. Organomet. Chem. 2008, 693, 1952−1956. (93) Zheng, H.; Langner, K. M.; Shields, G. P.; Hou, J.; Kowiel, M.; Allen, F. H.; Murshudov, G.; Minor, W. Data Mining of Iron(II) and Iron(III) Bond-Valence Parameters, and Their Relevance for Macromolecular Crystallography. Acta Crystallogr. Sect. D Struct. Biol. 2017, 73, 316−325. (94) Yee, T. A.; Suescun, L.; Rabuffetti, F. A. Bond Valence Parameters for Alkali- and Alkaline-Earth- Oxygen Pairs: Derivation and Application to Metal-Organic Compounds. J. Solid State Chem. 2019, 270, 242−246. (95) Maloney, A. G. P.; Wood, P. A.; Parsons, S. Intermolecular Interaction Energies in Transition Metal Coordination Compounds. CrystEngComm 2015, 17, 9300−9310. (96) Gavezzotti, A. Calculations of Lattice Energies of Organic Crystals: The PIXEL Integration Method in Comparison with More Traditional Methods. Z. Kristallogr. - Cryst. Mater. 2005, 220, 499− 510. (97) Foscato, M.; Occhipinti, G.; Venkatraman, V.; Alsberg, B. K.; Jensen, V. R. Automated Design of Realistic Organometallic Molecules from Fragments. J. Chem. Inf. Model. 2014, 54, 767−780. (98) Wicker, J. G. P.; Cooper, R. I. Will It Crystallise? Predicting Crystallinity of Molecular Materials. CrystEngComm 2015, 17, 1927− 1934. (99) Wicker, J. G. P.; Cooper, R. I. Beyond Rotatable Bond Counts: Capturing 3D Conformational Flexibility in a Single Descriptor. J. Chem. Inf. Model. 2016, 56, 2347−2352. (100) Henry, R. F. The Effects of Tautomerism on the Nature of Molecules in the Solid State. J. Comput.-Aided Mol. Des. 2010, 24, 587−590. 
(101) Babu Nanubolu, J.; Sridhar, B.; Ravikumar, K. Understanding the Amino ↔ Imino Tautomeric Preference in (Imidazole)imidazolidine-N-aryl(alkyl) Systems: A Case Study of Moxonidine Drug and Insights from the Cambridge Structural Database (CSD). CrystEngComm 2014, 16, 10602−10617. (102) Cruz-Cabeza, A. J.; Schreyer, A.; Pitt, W. R. Annular Tautomerism: Experimental Observations and Quantum Mechanics Calculations. J. Comput.-Aided Mol. Des. 2010, 24, 575−586. (103) Cruz-Cabeza, A. J.; Groom, C. R. Identification, Classification and Relative Stability of Tautomers in the Cambridge Structural Database. CrystEngComm 2011, 13, 93−98. (104) Milletti, F.; Vulpetti, A. Tautomer Preference in PDB Complexes and Its Impact on Structure-Based Drug Discovery. J. Chem. Inf. Model. 2010, 50, 1062−1074. (105) Cruz-Cabeza, A. J. Acid-Base Crystalline Complexes and the pKa Rule. CrystEngComm 2012, 14, 6362−6365. (106) Mukherjee, A.; Desiraju, G. R. Combinatorial Exploration of the Structural Landscape of Acid-Pyridine Cocrystals. Cryst. Growth Des. 2014, 14, 1375−1385. (107) Bacchi, A. The Use of Databases in the Study of Intermolecular Interactions. In Intermolecular Interactions in Crystals: Fundamentals of Crystal Engineering; Novoa, J. J., Ed.; Royal Society of Chemistry: London, 2017; pp 350−373. (108) Galek, P. T. A.; Chisholm, J. A.; Pidcock, E.; Wood, P. A. Hydrogen-Bond Coordination in Organic Crystal Structures: Statistics, Predictions and Applications. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2014, 70, 91−105. (109) Wood, P. A.; Galek, P. T. A. The Impact of Accessible Surface on Hydrogen Bond Formation. CrystEngComm 2010, 12, 2485−2491. (110) Wood, P. A.; Pidcock, E.; Allen, F. H. Interaction Geometries and Energies of Hydrogen Bonds to C=O and C=S Acceptors: A Comparative Study. Acta Crystallogr., Sect. B: Struct. Sci. 2008, 64, 491−496.


(111) Lenthall, J. T.; Foster, J. A.; Anderson, K. M.; Probert, M. R.; Howard, J. A. K.; Steed, J. W. Hydrogen Bonding Interactions with the Thiocarbonyl π-System. CrystEngComm 2011, 13, 3202−3212. (112) Corpinot, M. K.; Guo, R.; Tocher, D. A.; Buanz, A. B. M.; Gaisford, S.; Price, S. L.; Bučar, D. K. Are Oxygen and Sulfur Atoms Structurally Equivalent in Organic Crystals? Cryst. Growth Des. 2017, 17, 827−833. (113) Bibelayi, D.; Lundemba, A. S.; Allen, F. H.; Galek, P. T. A.; Pradon, J.; Reilly, A. M.; Groom, C. R.; Yav, Z. G. Hydrogen Bonding at C=Se Acceptors in Selenoureas, Selenoamides and Selones. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2016, 72, 317−325. (114) Andrić, J. M.; Janjić, G. V.; Ninković, D. B.; Zarić, S. D. The Influence of Water Molecule Coordination to a Metal Ion on Water Hydrogen Bonds. Phys. Chem. Chem. Phys. 2012, 14, 10896−10898. (115) Andrić, J. M.; Misini-Ignjatović, M. Z.; Murray, J. S.; Politzer, P.; Zarić, S. D. Hydrogen Bonding between Metal-Ion Complexes and Noncoordinated Water: Electrostatic Potentials and Interaction Energies. ChemPhysChem 2016, 17, 2035−2042. (116) Dunitz, J. D. Organic Fluorine: Odd Man Out. ChemBioChem 2004, 5, 614−621. (117) Taylor, R. The Hydrogen Bond between N-H or O-H and Organic Fluorine: Favourable Yes, Competitive No. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2017, 73, 474−488. (118) Dalvit, C.; Vulpetti, A. Intermolecular and Intramolecular Hydrogen Bonds Involving Fluorine Atoms: Implications for Recognition, Selectivity, and Chemical Properties. ChemMedChem 2012, 7, 262−272. (119) Arunan, E.; Desiraju, G. R.; Klein, R. A.; Sadlej, J.; Scheiner, S.; Alkorta, I.; Clary, D. C.; Crabtree, R. H.; Dannenberg, J. J.; Hobza, P.; et al. Definition of the Hydrogen Bond (IUPAC Recommendations 2011). Pure Appl. Chem. 2011, 83, 1637−1641. (120) Thalladi, V. R.; Weiss, H. C.; Bläser, D.; Boese, R.; Nangia, A.; Desiraju, G. R. C-H···F Interactions in the Crystal Structures of Some Fluorobenzenes. J. Am. Chem. Soc. 1998, 120, 8702−8710.
(121) Thakur, T. S.; Kirchner, M. T.; Bläser, D.; Boese, R.; Desiraju, G. R. C-H···F-C Hydrogen Bonding in 1,2,3,5-Tetrafluorobenzene and Other Fluoroaromatic Compounds and the Crystal Structure of Alloxan Revisited. CrystEngComm 2010, 12, 2079−2085. (122) Panini, P.; Chopra, D. Role of Intermolecular Interactions Involving Organic Fluorine in Trifluoromethylated Benzanilides. CrystEngComm 2012, 14, 1972−1989. (123) D’Oria, E.; Novoa, J. J. On the Hydrogen Bond Nature of the C-H···F Interactions in Molecular Crystals. An Exhaustive Investigation Combining a Crystallographic Database Search and Ab Initio Theoretical Calculations. CrystEngComm 2008, 10, 423−436. (124) Shukla, R.; Chopra, D. Crystallographic and Computational Investigation of Intermolecular Interactions Involving Organic Fluorine with Relevance to the Hybridization of the Carbon Atom. CrystEngComm 2015, 17, 3596−3609. (125) Gavezzotti, A.; Lo Presti, L. Building Blocks of Crystal Engineering: A Large-Database Study of the Intermolecular Approach between C-H Donor Groups and O, N, Cl, or F Acceptors in Organic Crystals. Cryst. Growth Des. 2016, 16, 2952−2962. (126) Taylor, R. It Isn’t, It Is: The C-H···X (X = O, N, F, Cl) Interaction Really Is Significant in Crystal Packing. Cryst. Growth Des. 2016, 16, 4165−4168. (127) Lo Presti, L. On the Significance of Weak Hydrogen Bonds in Crystal Packing: A Large Databank Comparison of Polymorphic Structures. CrystEngComm 2018, 20, 5976−5989. (128) McKenzie, J.; Feeder, N.; Hunter, C. A. H-Bond Competition Experiments in Solution and the Solid State. CrystEngComm 2016, 18, 394−397. (129) McKenzie, J.; Hunter, C. A. Competitor Analysis of Functional Group H-Bond Donor and Acceptor Properties Using the Cambridge Structural Database. Phys. Chem. Chem. Phys. 2018, 20, 25324−25334. (130) Allen, F. H.; Wood, P. A.; Galek, P. T. A. The Versatile Role of the Ethynyl Group in Crystal Packing: An Interaction Propensity Study. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2013, 69, 281−287.

(131) Łukomska, M.; Rybarczyk-Pirek, A. J.; Jabłoński, M.; Palusiak, M. The Nature of NO-Bonding in N-Oxide Group. Phys. Chem. Chem. Phys. 2015, 17, 16375−16387. (132) Alkorta, I.; Elguero, J.; Elguero, E. Nitroxide Stable Radicals Interacting as Lewis Bases in Hydrogen Bonds: A Search in the Cambridge Structural Data Base for Intermolecular Contacts. J. Mol. Struct. 2017, 1148, 150−161. (133) Wood, P. A.; Allen, F. H.; Pidcock, E. Hydrogen-Bond Directionality at the Donor H Atom - Analysis of Interaction Energies and Database Statistics. CrystEngComm 2009, 11, 1563−1571. (134) Veljković, D. Ž.; Janjić, G. V.; Zarić, S. D. Are C-H···O Interactions Linear? The Case of Aromatic CH Donors. CrystEngComm 2011, 13, 5005−5010. (135) Dragelj, J. L.; Janjić, G. V.; Veljković, D. Ž.; Zarić, S. D. Crystallographic and Ab Initio Study of Pyridine CH-O Interactions: Linearity of the Interactions and Influence of Pyridine Classical Hydrogen Bonds. CrystEngComm 2013, 15, 10481−10489. (136) Murray-Rust, P.; Motherwell, W. D. S. Computer Retrieval and Analysis of Molecular Geometry. 4. Intermolecular Interactions. J. Am. Chem. Soc. 1979, 101, 4374−4376. (137) Ramasubbu, N.; Parthasarathy, R.; Murray-Rust, P. Angular Preferences of Intermolecular Forces around Halogen Centers: Preferred Directions of Approach of Electrophiles and Nucleophiles around Carbon-Halogen Bond. J. Am. Chem. Soc. 1986, 108, 4308−4314. (138) Metrangolo, P.; Resnati, G. Halogen Bonding: A Paradigm in Supramolecular Chemistry. Chem. - Eur. J. 2001, 7, 2511−2519. (139) Metrangolo, P.; Neukirch, H.; Pilati, T.; Resnati, G. Halogen Bonding Based Recognition Processes: A World Parallel to Hydrogen Bonding. Acc. Chem. Res. 2005, 38, 386−395. (140) Cavallo, G.; Metrangolo, P.; Milani, R.; Pilati, T.; Priimagi, A.; Resnati, G.; Terraneo, G. The Halogen Bond. Chem. Rev. 2016, 116, 2478−2601.
(141) Clark, T.; Hennemann, M.; Murray, J. S.; Politzer, P. Halogen Bonding: The σ-Hole. J. Mol. Model. 2007, 13, 291−296. (142) Desiraju, G. R.; Parthasarathy, R. The Nature of Halogen···Halogen Interactions: Are Short Halogen Contacts Due to Specific Attractive Forces or Due to Close Packing of Nonspherical Atoms? J. Am. Chem. Soc. 1989, 111, 8725−8726. (143) Pedireddi, V. R.; Reddy, D. S.; Goud, B. S.; Craig, D. C.; Rae, D.; Desiraju, G. R. The Nature of Halogen···Halogen Interactions and the Crystal Structure of 1,3,5,7-Tetraiodoadamantane. J. Chem. Soc., Perkin Trans. 2 1994, 2353−2360. (144) Price, S. L.; Stone, A. J.; Lucas, J.; Rowland, R. S.; Thornley, A. E. The Nature of -Cl···Cl- Intermolecular Interactions. J. Am. Chem. Soc. 1994, 116, 4910−4918. (145) Capdevila-Cortada, M.; Novoa, J. J. The Nature of the C-Br···Br-C Intermolecular Interactions Found in Molecular Crystals: A General Theoretical-Database Study. CrystEngComm 2015, 17, 3354−3365. (146) Mukherjee, A.; Tothadi, S.; Desiraju, G. R. Halogen Bonds in Crystal Engineering: Like Hydrogen Bonds yet Different. Acc. Chem. Res. 2014, 47, 2514−2524. (147) Saha, B. K.; Rather, S. A.; Saha, A. Interhalogen Interactions in the Light of Geometrical Correction. Cryst. Growth Des. 2016, 16, 3059−3062. (148) Mooibroek, T. J.; Gamez, P. Halogen Bonding versus Hydrogen Bonding: What Does the Cambridge Database Reveal? CrystEngComm 2013, 15, 4565−4570. (149) Kroon, J.; Kanters, J. A. Non-Linearity of Hydrogen Bonds in Molecular Crystals. Nature 1974, 248, 667−669. (150) Reichenbächer, K.; Süss, H. I.; Hulliger, J. Fluorine in Crystal Engineering - “The Little Atom That Could”. Chem. Soc. Rev. 2005, 34, 22−30. (151) Merz, K.; Vasylyeva, V. Development and Boundaries in the Field of Supramolecular Synthons. CrystEngComm 2010, 12, 3989−4002.


(152) Esterhuysen, C.; Heßelmann, A.; Clark, T. Trifluoromethyl: An Amphiphilic Noncovalent Bonding Partner. ChemPhysChem 2017, 18, 772−784. (153) Metrangolo, P.; Murray, J. S.; Pilati, T.; Politzer, P.; Resnati, G.; Terraneo, G. Fluorine-Centered Halogen Bonding: A Factor in Recognition Phenomena and Reactivity. Cryst. Growth Des. 2011, 11, 4238−4246. (154) Metrangolo, P.; Murray, J. S.; Pilati, T.; Politzer, P.; Resnati, G.; Terraneo, G. The Fluorine Atom as a Halogen Bond Donor, viz. a Positive Site. CrystEngComm 2011, 13, 6593−6596. (155) Le Questel, J. Y.; Laurence, C.; Graton, J. Halogen-Bond Interactions: A Crystallographic Basicity Scale towards Iodoorganic Compounds. CrystEngComm 2013, 15, 3212−3221. (156) Perkins, C.; Libri, S.; Adams, H.; Brammer, L. Diiodoacetylene: Compact, Strong Ditopic Halogen Bond Donor. CrystEngComm 2012, 14, 3033−3038. (157) Bauzá, A.; Quiñonero, D.; Deyà, P. M.; Frontera, A. Halogen Bonding versus Chalcogen and Pnicogen Bonding: A Combined Cambridge Structural Database and Theoretical Study. CrystEngComm 2013, 15, 3137−3144. (158) Ji, B.; Zhang, Y.; Deng, D.; Wang, W. Improper Halogen Bond in the Crystal Structure. CrystEngComm 2013, 15, 3093−3096. (159) Troff, R. W.; Mäkelä, T.; Topić, F.; Valkonen, A.; Raatikainen, K.; Rissanen, K. Alternative Motifs for Halogen Bonding. Eur. J. Org. Chem. 2013, 2013, 1617−1637. (160) Cinčić, D.; Friščić, T.; Jones, W. Experimental and Database Studies of Three-Centered Halogen Bonds with Bifurcated Acceptors Present in Molecular Crystals, Cocrystals and Salts. CrystEngComm 2011, 13, 3224−3231. (161) Ji, B.; Wang, W.; Deng, D.; Zhang, Y. Symmetrical Bifurcated Halogen Bond: Design and Synthesis. Cryst. Growth Des. 2011, 11, 3622−3628. (162) Brammer, L.; Espallargas, G. M.; Libri, S. Combining Metals with Halogen Bonds. CrystEngComm 2008, 10, 1712−1727.
(163) Wang, Y.; Wu, W.; Liu, Y.; Lu, Y. Influence of Transition Metal Coordination on Halogen Bonding: CSD Survey and Theoretical Study. Chem. Phys. Lett. 2013, 578, 38−42. (164) Wang, W. Halogen Bond Involving Hypervalent Halogen: CSD Search and Theoretical Study. J. Phys. Chem. A 2011, 115, 9294−9299. (165) Grabowski, S. J. New Type of Halogen Bond: Multivalent Halogen Interacting with π- and σ-Electrons. Molecules 2017, 22, 2150. (166) Bauzá, A.; Quiñonero, D.; Frontera, A. Substituent Effects in Multivalent Halogen Bonding Complexes: A Combined Theoretical and Crystallographic Study. Molecules 2018, 23, 18. (167) Gavezzotti, A. Non-Conventional Bonding between Organic Molecules. The “Halogen Bond” in Crystalline Systems. Mol. Phys. 2008, 106, 1473−1485. (168) Carlucci, L.; Gavezzotti, A. A Quantitative Measure of Halogen Bond Activation in Cocrystallization. Phys. Chem. Chem. Phys. 2017, 19, 18383−18388. (169) Taylor, R. Which Intermolecular Interactions Have a Significant Influence on Crystal Packing? CrystEngComm 2014, 16, 6852−6865. (170) Rosenfield, R. E., Jr.; Parthasarathy, R.; Dunitz, J. D. Directional Preferences of Nonbonded Atomic Contacts with Divalent Sulfur. 1. Electrophiles and Nucleophiles. J. Am. Chem. Soc. 1977, 99, 4860−4862. (171) Iwaoka, M.; Takemoto, S.; Tomoda, S. Statistical and Theoretical Investigations on the Directionality of Nonbonded S···O Interactions. Implications for Molecular Design and Protein Engineering. J. Am. Chem. Soc. 2002, 124, 10613−10620. (172) Bauzá, A.; Mooibroek, T. J.; Frontera, A. The Bright Future of Unconventional σ/π-Hole Interactions. ChemPhysChem 2015, 16, 2496−2517. (173) Thomas, S. P.; Veccham, S. P. K. P.; Farrugia, L. J.; Guru Row, T. N. “Conformational Simulation” of Sulfamethizole by Molecular Complexation and Insights from Charge Density Analysis: Role of Intramolecular S···O Chalcogen Bonding. Cryst. Growth Des. 2015, 15, 2110−2118.

(174) Beno, B. R.; Yeung, K. S.; Bartberger, M. D.; Pennington, L. D.; Meanwell, N. A. A Survey of the Role of Noncovalent Sulfur Interactions in Drug Design. J. Med. Chem. 2015, 58, 4383−4438. (175) Taylor, R. Progress in the Understanding of Traditional and Nontraditional Molecular Interactions. In Comprehensive Medicinal Chemistry III; Chackalamannil, S., Rotella, D., Ward, S., Eds.; Elsevier Inc.: Amsterdam, 2017; Vol. 3, pp 67−100. (176) Nayak, S. K.; Kumar, V.; Murray, J. S.; Politzer, P.; Terraneo, G.; Pilati, T.; Metrangolo, P.; Resnati, G. Fluorination Promotes Chalcogen Bonding in Crystalline Solids. CrystEngComm 2017, 19, 4955−4959. (177) Cozzolino, A. F.; Elder, P. J. W.; Vargas-Baca, I. A Survey of Tellurium-Centered Secondary-Bonding Supramolecular Synthons. Coord. Chem. Rev. 2011, 255, 1426−1438. (178) Shukla, R.; Chopra, D. “Pnicogen Bonds” or “Chalcogen Bonds”: Exploiting the Effect of Substitution on the Formation of P···Se Noncovalent Bonds. Phys. Chem. Chem. Phys. 2016, 18, 13820−13829. (179) Politzer, P.; Murray, J. S.; Janjić, G. V.; Zarić, S. D. σ-Hole Interactions of Covalently-Bonded Nitrogen, Phosphorus and Arsenic: A Survey of Crystal Structures. Crystals 2014, 4, 12−31. (180) Starbuck, J.; Norman, N. C.; Orpen, A. G. Secondary Bonding as a Potential Design Element for Crystal Engineering. New J. Chem. 1999, 23, 969−972. (181) Cangelosi, V. M.; Pitt, M. A.; Vickaryous, W. J.; Allen, C. A.; Zakharov, L. N.; Johnson, D. W. Design Considerations for the Group 15 Elements: The Pnictogen···π Interaction as a Complementary Component in Supramolecular Assembly Design. Cryst. Growth Des. 2010, 10, 3531−3536. (182) Sarkar, S.; Pavan, M. S.; Guru Row, T. N. Experimental Validation of “Pnicogen Bonding” in Nitrogen by Charge Density Analysis. Phys. Chem. Chem. Phys. 2015, 17, 2330−2334. (183) Politzer, P.; Murray, J. S.; Clark, T. Halogen Bonding and Other σ-Hole Interactions: A Perspective. Phys. Chem. Chem. Phys. 2013, 15, 11178−11189.
(184) Bürgi, H. B.; Dunitz, J. D. From Crystal Statics to Chemical Dynamics. Acc. Chem. Res. 1983, 16, 153−161. (185) Thomas, S. P.; Pavan, M. S.; Guru Row, T. N. Experimental Evidence for “Carbon Bonding” in the Solid State from Charge Density Analysis. Chem. Commun. 2014, 50, 49−51. (186) Bauzá, A.; Mooibroek, T. J.; Frontera, A. Influence of Ring Size on the Strength of Carbon Bonding Complexes between Anions and Perfluorocycloalkanes. Phys. Chem. Chem. Phys. 2014, 16, 19192−19197. (187) Bauzá, A.; Mooibroek, T. J.; Frontera, A. Small Cycloalkane (CN)2C-C(CN)2 Structures Are Highly Directional Non-Covalent Carbon-Bond Donors. Chem. - Eur. J. 2014, 20, 10245−10248. (188) Quiñonero, D. Sigma-Hole Carbon-Bonding Interactions in Carbon-Carbon Double Bonds: An Unnoticed Contact. Phys. Chem. Chem. Phys. 2017, 19, 15530−15540. (189) Servati Gargari, M.; Stilinović, V.; Bauzá, A.; Frontera, A.; McArdle, P.; Van Derveer, D.; Ng, S. W.; Mahmoudi, G. Design of Lead(II) Metal-Organic Frameworks Based on Covalent and Tetrel Bonding. Chem. - Eur. J. 2015, 21, 17951−17958. (190) Liu, M.; Li, Q.; Li, W.; Cheng, J. Tetrel Bonds between PySiX3 and Some Nitrogenated Bases: Hybridization, Substitution, and Cooperativity. J. Mol. Graphics Modell. 2016, 65, 35−42. (191) Grabowski, S. Lewis Acid Properties of Tetrel Tetrafluorides - The Coincidence of the σ-Hole Concept with the QTAIM Approach. Crystals 2017, 7, 43. (192) Scilabra, P.; Kumar, V.; Ursini, M.; Resnati, G. Close Contacts Involving Germanium and Tin in Crystal Structures: Experimental Evidence of Tetrel Bonds. J. Mol. Model. 2018, 24, 37. (193) Bauzá, A.; Mooibroek, T. J.; Frontera, A. Tetrel Bonding Interactions. Chem. Rec. 2016, 16, 473−487. (194) Bauzá, A.; Frontera, A. Aerogen Bonding Interaction: A New Supramolecular Force? Angew. Chem., Int. Ed. 2015, 54, 7340−7343.


(195) Allen, F. H.; Baalham, C. A.; Lommerse, J. P. M.; Raithby, P. R. Carbonyl-Carbonyl Interactions Can Be Competitive with Hydrogen Bonds. Acta Crystallogr., Sect. B: Struct. Sci. 1998, 54, 320−329. (196) Sahariah, B.; Sarma, B. K. Relative Orientation of the Carbonyl Groups Determines the Nature of Orbital Interactions in Carbonyl-Carbonyl Short Contacts. Chem. Sci. 2019, 10, 909−917. (197) Wood, P. A.; Borwick, S. J.; Watkin, D. J.; Motherwell, W. D. S.; Allen, F. H. Dipolar C≡N···C≡N Interactions in Organic Crystal Structures: Database Analysis and Calculation of Interaction Energies. Acta Crystallogr., Sect. B: Struct. Sci. 2008, 64, 393−396. (198) Sparkes, H. A.; Raithby, P. R.; Clot, E.; Shields, G. P.; Chisholm, J. A.; Allen, F. H. Carbonyl···Carbonyl Interactions in First-Row Transition Metal Complexes. CrystEngComm 2006, 8, 563−570. (199) Lee, S.; Mallik, A. B.; Fredrickson, D. C. Dipolar-Dipolar Interactions and the Crystal Packing of Nitriles, Ketones, Aldehydes, and (Csp2)-F Groups. Cryst. Growth Des. 2004, 4, 279−290. (200) Paulini, R.; Müller, K.; Diederich, F. Orthogonal Multipolar Interactions in Structural Chemistry and Biology. Angew. Chem., Int. Ed. 2005, 44, 1788−1805. (201) Bissantz, C.; Kuhn, B.; Stahl, M. A Medicinal Chemist’s Guide to Molecular Interactions. J. Med. Chem. 2010, 53, 5061−5084. (202) Kamer, K. J.; Choudhary, A.; Raines, R. T. Intimate Interactions with Carbonyl Groups: Dipole-Dipole or n → π*? J. Org. Chem. 2013, 78, 2099−2103. (203) Bürgi, H. B.; Dunitz, J. D.; Lehn, J. M.; Wipff, G. Stereochemistry of Reaction Paths at Carbonyl Centres. Tetrahedron 1974, 30, 1563−1572. (204) Singh, S. K.; Das, A. The n → π* Interaction: A Rapidly Emerging Non-Covalent Interaction. Phys. Chem. Chem. Phys. 2015, 17, 9596−9612. (205) Newberry, R. W.; Raines, R. T. The n→π* Interaction. Acc. Chem. Res. 2017, 50, 1838−1846. (206) Echeverría, J. The n → π* Interaction in Metal Complexes. Chem. Commun. 2018, 54, 3061−3064.
(207) Rahim, A.; Saha, P.; Jha, K. K.; Sukumar, N.; Sarma, B. K. Reciprocal Carbonyl-Carbonyl Interactions in Small Molecules and Proteins. Nat. Commun. 2017, 8, 78. (208) Bauzá, A.; Ramis, R.; Frontera, A. A Combined Theoretical and Cambridge Structural Database Study of π-Hole Pnicogen Bonding Complexes between Electron Rich Molecules and Both Nitro Compounds and Inorganic Bromides (YO2Br, Y = N, P, and As). J. Phys. Chem. A 2014, 118, 2827−2834. (209) Báuza, A.; Frontera, A.; Mooibroek, T. J. π-Hole Interactions Involving Nitro Compounds: Directionality of Nitrate Esters. Cryst. Growth Des. 2016, 16, 5520−5524. (210) Bauzá, A.; Mooibroek, T. J.; Frontera, A. Directionality of πHoles in Nitro Compounds. Chem. Commun. 2015, 51, 1491−1493. (211) Bauzá, A.; Frontera, A.; Mooibroek, T. J. NO3− Anions Can Act as Lewis Acid in the Solid State. Nat. Commun. 2017, 8, 14522. (212) Mooibroek, T. J. Coordinated Nitrate Anions Can Be Directional π-Hole Donors in the Solid State: A CSD Study. CrystEngComm 2017, 19, 4485−4488. (213) Sánchez-Sanz, G.; Trujillo, C.; Solimannejad, M.; Alkorta, I.; Elguero, J. Orthogonal Interactions between Nitryl Derivatives and Electron Donors: Pnictogen Bonds. Phys. Chem. Chem. Phys. 2013, 15, 14310−14318. (214) Bauzá, A.; García-Llinás, X.; Frontera, A. Charge-Assisted Triel Bonding Interactions in Solid State Chemistry: A Combined Computational and Crystallographic Study. Chem. Phys. Lett. 2016, 666, 73−78. (215) Frontera, A.; Bauzá, A. Concurrent Aerogen Bonding and Lone Pair/Anion-π Interactions in the Stability of Organoxenon Derivatives: A Combined CSD and Ab Initio Study. Phys. Chem. Chem. Phys. 2017, 19, 30063−30068. (216) Główka, M. L.; Martynowski, D.; Kozłowska, K. Stacking of Six-Membered Aromatic Rings in Crystals. J. Mol. Struct. 1999, 474, 81−89.

(217) Choudhury, R. R.; Chitra, R. Stacking Interaction between Homostacks of Simple Aromatics and the Factors Influencing These Interactions. CrystEngComm 2010, 12, 2113−2121. (218) Ninković, D. B.; Janjić, G. V.; Zarić, S. D. Crystallographic and Ab Initio Study of Pyridine Stacking Interactions. Local Nature of Hydrogen Bond Effect in Stacking Interactions. Cryst. Growth Des. 2012, 12, 1060−1063. (219) Janjić, G. V.; Ninković, D. B.; Zarić, S. D. Influence of Supramolecular Structures in Crystals on Parallel Stacking Interactions between Pyridine Molecules. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2013, 69, 389−394. (220) Geronimo, I.; Singh, N. J.; Kim, K. S. Nature of Anion-Templated π+-π+ Interactions. Phys. Chem. Chem. Phys. 2011, 13, 11841−11845. (221) Malenov, D. P.; Dragelj, J. L.; Janjić, G. V.; Zarić, S. D. Coordinating Benzenes Stack Stronger than Noncoordinating Benzenes, Even at Large Horizontal Displacements. Cryst. Growth Des. 2016, 16, 4169−4172. (222) Janiak, C. A Critical Account on π-π Stacking in Metal Complexes with Aromatic Nitrogen-Containing Ligands. J. Chem. Soc. Dalt. Trans. 2000, 3885−3896. (223) Semeniuc, R. F.; Reamer, T. J.; Smith, M. D. 8-Quinoline Based Ligands and Their Metallic Derivatives: A Structural and Statistical Investigation of Quinoline π-π Stacking Interactions. New J. Chem. 2010, 34, 439−452. (224) Janjić, G.; Andrić, J.; Kapor, A.; Bugarčić, I. D.; Zarić, S. D. Classification of Stacking Interaction Geometries of Terpyridyl Square-Planar Complexes in Crystal Structures. CrystEngComm 2010, 12, 3773−3779. (225) Sredojević, D. N.; Tomić, Z. D.; Zarić, S. D. Evidence of Chelate-Chelate Stacking Interactions in Crystal Structures of Transition-Metal Complexes. Cryst. Growth Des. 2010, 10, 3901−3908. (226) Sredojević, D. N.; Vojislavljević, D. Z.; Tomić, Z. D.; Zarić, S. D. Parallel Stacking Interactions in Square-Planar Transition-Metal Complexes Containing Fused Chelate and C6-Aromatic Rings. Acta Crystallogr., Sect. B: Struct. Sci. 2012, 68, 261−265.
(227) Karabıyık, H.; Karabıyık, H.; İskeleli, N. O. Hydrogen-Bridged Chelate Ring-Assisted π-Stacking Interactions. Acta Crystallogr., Sect. B: Struct. Sci. 2012, 68, 71−79. (228) Petrović, P. V.; Janjić, G. V.; Zarić, S. D. Stacking Interactions between Square-Planar Metal Complexes with 2,2’-Bipyridine Ligands. Analysis of Crystal Structures and Quantum Chemical Calculations. Cryst. Growth Des. 2014, 14, 3880−3889. (229) Blagojević, J. P.; Zarić, S. D. Stacking Interactions of Hydrogen-Bridged Rings - Stronger than the Stacking of Benzene Molecules. Chem. Commun. 2015, 51, 12989−12991. (230) Malenov, D. P.; Janjić, G. V.; Medaković, V. B.; Hall, M. B.; Zarić, S. D. Noncovalent Bonding: Stacking Interactions of Chelate Rings of Transition Metal Complexes. Coord. Chem. Rev. 2017, 345, 318−341. (231) Antonijević, I. S.; Malenov, D. P.; Hall, M. B.; Zarić, S. D. Study of Stacking Interactions between Two Neutral Tetrathiafulvalene Molecules in Cambridge Structural Database Crystal Structures and by Quantum Chemical Calculations. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2019, 75, 1−7. (232) Wheeler, S. E.; Houk, K. N. Substituent Effects in the Benzene Dimer Are Due to Direct Interactions of the Substituents with the Unsubstituted Benzene. J. Am. Chem. Soc. 2008, 130, 10854−10855. (233) Wheeler, S. E. Understanding Substituent Effects in Noncovalent Interactions Involving Aromatic Rings. Acc. Chem. Res. 2013, 46, 1029−1038. (234) Wheeler, S. E.; Bloom, J. W. G. Toward a More Complete Understanding of Noncovalent Interactions Involving Aromatic Rings. J. Phys. Chem. A 2014, 118, 6133−6147. (235) Nishio, M.; Umezawa, Y.; Honda, K.; Tsuboyama, S.; Suezawa, H. CH/π Hydrogen Bonds in Organic and Organometallic Chemistry. CrystEngComm 2009, 11, 1757−1788.


(236) Nishio, M. The CH/π Hydrogen Bond: Implication in Chemistry. J. Mol. Struct. 2012, 1018, 2−7. (237) Mooibroek, T. J.; Gamez, P. How Directional Are D-H··· phenyl Interactions in the Solid State (D = C, N, O)? CrystEngComm 2012, 14, 8462−8467. (238) Escudero, D.; Estarellas, C.; Frontera, A.; Quiñonero, D.; Deyà, P. M. Theoretical and Crystallographic Study of Edge-to-Face Aromatic Interactions between Pyridine Moieties and Benzene. Chem. Phys. Lett. 2009, 468, 280−285. (239) Ostojić, B. D.; Janjić, G. V.; Zarić, S. D. Parallel Alignment of Water and Aryl Rings - Crystallographic and Theoretical Evidence for the Interaction. Chem. Commun. 2008, 6546−6548. (240) Janjić, G. V.; Malkov, S. N.; Ž ivković, M. V.; Zarić, S. D. What Are Preferred Water−aromatic Interactions in Proteins and Crystal Structures of Small Molecules? Phys. Chem. Chem. Phys. 2014, 16, 23549−23553. (241) Swierczynski, D.; Luboradzki, R.; Dolgonos, G.; Lipkowski, J.; Schneider, H. J. Non-Covalent Interactions of Organic Halogen Compounds with Aromatic Systems - Analyses of Crystal Structure Data. Eur. J. Org. Chem. 2005, 2005, 1172−1177. (242) Mooibroek, T. J.; Gamez, P. Halogen···phenyl Supramolecular Interactions in the Solid State: Hydrogen versus Halogen Bonding and Directionality. CrystEngComm 2013, 15, 1802−1805. (243) Matter, H.; Nazaré, M.; Güssregen, S.; Will, D. W.; Schreuder, H.; Bauer, A.; Urmann, M.; Ritter, K.; Wagner, M.; Wehner, V. Evidence for C-Cl/C-Br···π Interactions as an Important Contribution to Protein-Ligand Binding Affinity. Angew. Chem., Int. Ed. 2009, 48, 2911−2916. (244) Alkorta, I.; Rozas, I.; Elguero, J. An Attractive Interaction between the π-Cloud of C6F6 and Electron-Donor Atoms. J. Org. Chem. 1997, 62, 4687−4691. (245) Quiñonero, D.; Garau, C.; Rotger, C.; Frontera, A.; Ballester, P.; Costa, A.; Deyà, P. M. Anion - π Interactions : Do They Exist ? Angew. Chem. 2002, 114, 3539−3542. 
(246) Quiñonero, D.; Garau, C.; Frontera, A.; Ballester, P.; Costa, A.; Deyà, P. M. Counterintuitive Interaction of Anions with Benzene Derivatives. Chem. Phys. Lett. 2002, 359, 486−492. (247) Hay, B. P.; Bryantsev, V. S. Anion-Arene Adducts: C-H Hydrogen Bonding, Anion-π Interaction, and Carbon Bonding Motifs. Chem. Commun. 2008, 2417−2428. (248) Mooibroek, T. J.; Black, C. A.; Gamez, P.; Reedijk, J. What’s New in the Realm of Anion-π Binding Interactions? Putting the Anion-π Interaction in Perspective. Cryst. Growth Des. 2008, 8, 1082− 1093. (249) Hay, B. P.; Custelcean, R. Anion - π Interactions in Crystal Structures: Commonplace or Extraordinary? Cryst. Growth Des. 2009, 9, 2539−2545. (250) Estarellas, C.; Bauzá, A.; Frontera, A.; Quiñonero, D.; Deyà, P. M. On the Directionality of Anion-π Interactions. Phys. Chem. Chem. Phys. 2011, 13, 5696−5702. (251) Wheeler, S. E.; Houk, K. N. Are Anion/π Interactions Actually a Case of Simple Charge-Dipole Interactions? J. Phys. Chem. A 2010, 114, 8658−8664. (252) Frontera, A.; Gamez, P.; Mascal, M.; Mooibroek, T. J.; Reedijk, J. Putting Anion-π Interactions into Perspective. Angew. Chem., Int. Ed. 2011, 50, 9564−9583. (253) Mooibroek, T. J.; Gamez, P. Anion-Arene and Lone PairArene Interactions Are Directional. CrystEngComm 2012, 14, 1027− 1030. (254) Caracelli, I.; Haiduc, I.; Zukerman-Schpector, J.; Tiekink, E. R. T. Delocalised Antimony(Lone Pair)- and Bismuth(Lone Pair)···π(Arene) Interactions: Supramolecular Assembly and Other Considerations. Coord. Chem. Rev. 2013, 257, 2863−2879. (255) Caracelli, I.; Haiduc, I.; Zukerman-Schpector, J.; Tiekink, E. R. T. M. π(Arene) Interactions for M = Gallium, Indium and Thallium: Influence upon Supramolecular Self-Assembly and Prevalence in Some Proteins. Coord. Chem. Rev. 2014, 281, 50−63. (256) Abraham, S. A.; Jose, D.; Datta, A. Do Cation···π Interactions Always Need to Be 1:1? ChemPhysChem 2012, 13, 695−698.

(257) Estarellas, C.; Frontera, A.; Quiñonero, D.; Deyà, P. M. Can Lone Pair-π and Cation-π Interactions Coexist? A Theoretical Study. Cent. Eur. J. Chem. 2011, 9, 25−34. (258) Gavezzotti, A. The Lines-of-Force Landscape of Interactions between Molecules in Crystals; Cohesive versus Tolerant and ‘Collateral Damage’ Contact. Acta Crystallogr., Sect. B: Struct. Sci. 2010, 66, 396−406. (259) Jelsch, C.; Ejsmont, K.; Huder, L. The Enrichment Ratio of Atomic Contacts in Crystals, an Indicator Derived from the Hirshfeld Surface Analysis. IUCrJ 2014, 1, 119−128. (260) Jelsch, C.; Soudani, S.; Ben Nasr, C. Likelihood of Atom-Atom Contacts in Crystal Structures of Halogenated Organic Compounds. IUCrJ 2015, 2, 327−340. (261) Jelsch, C.; Bisseyou, Y. B. M. Atom Interaction Propensities of Oxygenated Chemical Functions in Crystal Packings. IUCrJ 2017, 4, 158−174. (262) Lecomte, C.; Espinosa, E.; Matta, C. F. On Atom-Atom ‘Short Contact’ Bonding Interactions in Crystals. IUCrJ 2015, 2, 161−163. (263) Lane, J. R.; Contreras-García, J.; Piquemal, J. P.; Miller, B. J.; Kjaergaard, H. G. Are Bond Critical Points Really Critical for Hydrogen Bonding? J. Chem. Theory Comput. 2013, 9, 3263−3266. (264) Foroutan-Nejad, C.; Shahbazian, S.; Marek, R. Toward a Consistent Interpretation of the QTAIM: Tortuous Link between Chemical Bonds, Interactions, and Bond/Line Paths. Chem. - Eur. J. 2014, 20, 10140−10152. (265) Dunitz, J. D. Intermolecular Atom-Atom Bonds in Crystals? IUCrJ 2015, 2, 157−158. (266) Spackman, M. A. How Reliable Are Intermolecular Interaction Energies Estimated from Topological Analysis of Experimental Electron Densities? Cryst. Growth Des. 2015, 15, 5624−5628. (267) Dittrich, B. Is There a Future for Topological Analysis in Experimental Charge-Density Research? Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2017, 73, 325−329. (268) Alhameedi, K.; Karton, A.; Jayatilaka, D.; Thomas, S. P.
Bond Orders for Intermolecular Interactions in Crystals: Charge Transfer, Ionicity and the Effect on Intramolecular Bonds. IUCrJ 2018, 5, 635− 646. (269) Mackenzie, C. F.; Spackman, P. R.; Jayatilaka, D.; Spackman, M. A. CrystalExplorer Model Energies and Energy Frameworks: Extension to Metal Coordination Compounds, Organic Salts, Solvates and Open-Shell Systems. IUCrJ 2017, 4, 575−587. (270) Leiserowitz, L. Molecular Packing Modes. Carboxylic Acids. Acta Crystallogr., Sect. B: Struct. Crystallogr. Cryst. Chem. 1976, 32, 775−802. (271) Dance, I.; Scudder, M. The Sextuple Phenyl Embrace, a Ubiquitous Concerted Supramolecular Motif. J. Chem. Soc., Chem. Commun. 1995, 1039−1040. (272) Desiraju, G. R. Supramolecular Synthons in Crystal Engineering - a New Organic Synthesis. Angew. Chem., Int. Ed. Engl. 1995, 34, 2311−2327. (273) Desiraju, G. R.; Vittal, J. J.; Ramanan, A. Crystal Engineering: A Textbook; World Scientific: Singapore, 2011. (274) Desiraju, G. R. Crystal Engineering: From Molecule to Crystal. J. Am. Chem. Soc. 2013, 135, 9952−9967. (275) Corpinot, M. K.; Bučar, D.-K. A Practical Guide to the Design of Molecular Crystals. Cryst. Growth Des. 2019, 19, 1426−1453. (276) D’Ascenzo, L.; Auffinger, P. A Comprehensive Classification and Nomenclature of Carboxyl-Carboxyl(ate) Supramolecular Motifs and Related Catemers: Implications for Biomolecular Systems. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2015, 71, 164−175. (277) Shattock, T. R.; Arora, K. K.; Vishweshwar, P.; Zaworotko, M. J. Hierarchy of Supramolecular Synthons: Persistent Carboxylic Acid · · · Pyridine Hydrogen Bonds in Cocrystals That Also Contain a Hydroxyl Moiety. Cryst. Growth Des. 2008, 8, 4533−4545. (278) Kavuru, P.; Aboarayes, D.; Arora, K. K.; Clarke, H. D.; Kennedy, A.; Marshall, L.; Ong, T. T.; Perman, J.; Pujari, T.; Wojtas, Ł.; et al. 
Hierarchy of Supramolecular Synthons: Persistent Hydrogen Bonds between Carboxylates and Weakly Acidic Hydroxyl Moieties in Cocrystals of Zwitterions. Cryst. Growth Des. 2010, 10, 3568−3584.

DOI: 10.1021/acs.chemrev.9b00155 Chem. Rev. XXXX, XXX, XXX−XXX

(279) Duggirala, N. K.; Wood, G. P. F.; Fischer, A.; Wojtas, Ł.; Perry, M. L.; Zaworotko, M. J. Hydrogen Bond Hierarchy: Persistent Phenol···Chloride Hydrogen Bonds in the Presence of Carboxylic Acid Moieties. Cryst. Growth Des. 2015, 15, 4341−4354. (280) Aakeröy, C. B.; Epa, K.; Forbes, S.; Schultheiss, N.; Desper, J. Ranking Relative Hydrogen-Bond Strengths in Hydroxybenzoic Acids for Crystal-Engineering Purposes. Chem. - Eur. J. 2013, 19, 14998− 15003. (281) Moragues-Bartolome, A. M.; Jones, W.; Cruz-Cabeza, A. J. Synthon Preferences in Cocrystals of Cis-Carboxamides:Carboxylic Acids. CrystEngComm 2012, 14, 2552−2559. (282) Custelcean, R. Crystal Engineering with Urea and Thiourea Hydrogen-Bonding Groups. Chem. Commun. 2008, 295−307. (283) Wawrzycka-Gorczyca, I. N-H···S Hydrogen Bonding Motifs in Crystalline Solids of 1,2,4-Triazole-5-Thiones: Application of the Cambridge Structural Database. R 2 2 (8) Ring Motif. J. Struct. Chem. 2014, 55, 520−524. (284) Bauzá, A.; Mooibroek, T. J.; Frontera, A. Towards Design Strategies for Anion−π Interactions in Crystal Engineering. CrystEngComm 2016, 18, 10−23. (285) Brondel, N.; Moynihan, E. J. A.; Lehane, K. N.; Eccles, K. S.; Elcoate, C. J.; Coles, S. J.; Lawrence, S. E.; Maguire, A. R. Does Intermolecular SO···H-C-SO Hydrogen Bonding in Sulfoxides and Sulfones Provide a Robust Supramolecular Synthon in the Solid State? CrystEngComm 2010, 12, 2910−2927. (286) Siddiqui, K. A.; Tiekink, E. R. T. A Supramolecular Synthon Approach to Aid the Discovery of Architectures Sustained by C-H···M Hydrogen Bonds. Chem. Commun. 2013, 49, 8501−8503. (287) Sander, J. R. G.; Bučar, D. K.; Henry, R. F.; Giangiorgi, B. N.; Zhang, G. G. Z.; MacGillivray, L. R. “Masked Synthons” in Crystal Engineering: Insulated Components in Acetaminophen Cocrystal Hydrates. CrystEngComm 2013, 15, 4816−4822. (288) Baburin, I. A.; Blatov, V. A.; Carlucci, L.; Ciani, G.; Proserpio, D. M. 
Interpenetrated Three-Dimensional Networks of Hydrogen-Bonded Organic Species: A Systematic Analysis of the Cambridge Structural Database. Cryst. Growth Des. 2008, 8, 519−539. (289) Alexandrov, E. V.; Blatov, V. A.; Kochetkov, A. V.; Proserpio, D. M. Underlying Nets in Three-Periodic Coordination Polymers: Topology, Taxonomy and Prediction from a Computer-Aided Analysis of the Cambridge Structural Database. CrystEngComm 2011, 13, 3947−3958. (290) Blatov, V. A.; Shevchenko, A. P.; Proserpio, D. M. Applied Topological Analysis of Crystal Structures with the Program Package ToposPro. Cryst. Growth Des. 2014, 14, 3576−3586. (291) Carlucci, L.; Ciani, G.; Proserpio, D. M.; Mitina, T. G.; Blatov, V. A. Entangled Two-Dimensional Coordination Networks: A General Survey. Chem. Rev. 2014, 114, 7557−7580. (292) Sasaki, T.; Miyata, M.; Sato, H. Helicity and Topological Chirality in Hydrogen-Bonded Supermolecules Characterized by Advanced Graph Set Analysis and Solid-State Vibrational Circular Dichroism Spectroscopy. Cryst. Growth Des. 2018, 18, 4621−4627. (293) Bonneau, C.; O’Keeffe, M.; Proserpio, D. M.; Blatov, V. A.; Batten, S. R.; Bourne, S. A.; Lah, M. S.; Eon, J.-G.; Hyde, S. T.; Wiggin, S. B.; et al. Deconstruction of Crystalline Networks into Underlying Nets: Relevance for Terminology Guidelines and Crystallographic Databases. Cryst. Growth Des. 2018, 18, 3411−3418. (294) Motherwell, W. D. S. Architecture of Packing in Molecular Crystals. CrystEngComm 2017, 19, 6869−6882. (295) Galek, P. T. A. Novel Comparison of Crystal Packing by Moments of Inertia. CrystEngComm 2011, 13, 841−849. (296) Spackman, M. A.; McKinnon, J. J.; Jayatilaka, D. Electrostatic Potentials Mapped on Hirshfeld Surfaces Provide Direct Insight into Intermolecular Interactions in Crystals. CrystEngComm 2008, 10, 377−388. (297) Collins, A.; Wilson, C. C.; Gilmore, C. J. Comparing Entire Crystal Structures Using Cluster Analysis and Fingerprint Plots. CrystEngComm 2010, 12, 801−809. (298) Motherwell, W.
D. S. Molecular Shape and Crystal Packing: A Database Study. CrystEngComm 2010, 12, 3554−3570.

(299) Kaźmierczak, M.; Katrusiak, A. The Most Loose Crystals of Organic Compounds. J. Phys. Chem. C 2013, 117, 1441−1446. (300) Giangreco, I.; Cole, J. C.; Thomas, E. Mining the Cambridge Structural Database for Matched Molecular Crystal Structures: A Systematic Exploration of Isostructurality. Cryst. Growth Des. 2017, 17, 3192−3208. (301) Görbitz, C. H. Hydrophobic Dipeptides: The Final Piece in the Puzzle. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2018, 74, 311−318. (302) Filippini, G.; Gavezzotti, A. A Quantitative Analysis of the Relative Importance of Symmetry Operators in Organic Molecular Crystals. Acta Crystallogr., Sect. B: Struct. Sci. 1992, 48, 230−234. (303) Brock, C. P.; Dunitz, J. D. Towards a Grammar of Crystal Packing. Chem. Mater. 1994, 6, 1118−1127. (304) Yao, J. W.; Cole, J. C.; Pidcock, E.; Allen, F. H.; Howard, J. A. K.; Motherwell, W. D. S. CSDSymmetry: The Definitive Database of Point-Group and Space-Group Symmetry Relationships in Small-Molecule Crystal Structures. Acta Crystallogr., Sect. B: Struct. Sci. 2002, 58, 640−646. (305) Brock, C. P.; Duncan, L. L. Anomalous Space-Group Frequencies for Monoalcohols CnHmOH. Chem. Mater. 1994, 6, 1307−1312. (306) Eppel, S.; Bernstein, J. Statistical Survey of Hydrogen-Bond Motifs in Crystallographic Special Symmetry Positions, and the Influence of Chirality of Molecules in the Crystal on the Formation of Hydrogen-Bond Ring Motifs. Acta Crystallogr., Sect. B: Struct. Sci. 2008, 64, 50−56. (307) Dey, A.; Pidcock, E. The Relevance of Chirality in Space Group Analysis: A Database Study of Common Hydrogen-Bonding Motifs and Their Symmetry Preferences. CrystEngComm 2008, 10, 1258−1264. (308) Taylor, R.; Allen, F. H.; Cole, J. C. Quantifying the Symmetry Preferences of Intermolecular Interactions in Organic Crystal Structures. CrystEngComm 2015, 17, 2651−2666. (309) Fábián, L.; Brock, C. P. A List of Organic Kryptoracemates. Acta Crystallogr., Sect. B: Struct. Sci. 2010, 66, 94−103.
(310) Pidcock, E. Achiral Molecules in Non-Centrosymmetric Space Groups. Chem. Commun. 2005, 3457−3459. (311) Dunitz, J. D.; Gavezzotti, A. Proteogenic Amino Acids: Chiral and Racemic Crystal Packings and Stabilities. J. Phys. Chem. B 2012, 116, 6740−6750. (312) Gavezzotti, A.; Rizzato, S. Are Racemic Crystals Favored over Homochiral Crystals by Higher Stability or by Kinetics? Insights from Comparative Studies of Crystalline Stereoisomers. J. Org. Chem. 2014, 79, 4809−4816. (313) Otero-de-la-Roza, A.; Hein, J. E.; Johnson, E. R. Reevaluating the Stability and Prevalence of Conglomerates: Implications for Preferential Crystallization. Cryst. Growth Des. 2016, 16, 6055−6059. (314) Steed, K. M.; Steed, J. W. Packing Problems: High Z′ Crystal Structures and Their Relationship to Cocrystals, Inclusion Compounds, and Polymorphism. Chem. Rev. 2015, 115, 2895−2933. (315) Anderson, K. M.; Probert, M. R.; Goeta, A. E.; Steed, J. W. Size Does Matter - The Contribution of Molecular Volume, Shape and Flexibility to the Formation of Co-Crystals and Structures with Z′ > 1. CrystEngComm 2011, 13, 83−87. (316) Gavezzotti, A. Structure and Energy in Organic Crystals with Two Molecules in the Asymmetric Unit: Causality or Chance? CrystEngComm 2008, 10, 389−398. (317) Anderson, K. M.; Probert, M. R.; Whiteley, C. N.; Rowland, A. M.; Goeta, A. E.; Steed, J. W. Designing Co-Crystals of Pharmaceutically Relevant Compounds That Crystallize with Z′ > 1. Cryst. Growth Des. 2009, 9, 1082−1087. (318) Anderson, K. M.; Goeta, A. E.; Steed, J. W. Supramolecular Synthon Frustration Leads to Crystal Structures with Z′ > 1. Cryst. Growth Des. 2008, 8, 2517−2524. (319) Taylor, R.; Cole, J. C.; Groom, C. R. Molecular Interactions in Crystal Structures with Z′ > 1. Cryst. Growth Des. 2016, 16, 2988−3001.

(320) Brock, C. P. High-Z′ Structures of Organic Molecules: Their Diversity and Organizing Principles. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2016, 72, 807−821. (321) Bond, A. D. Why Do Crystal Structures Waste Molecular Inversion Symmetry? CrystEngComm 2010, 12, 2492−2500. (322) Cruz-Cabeza, A. J.; Reutzel-Edens, S. M.; Bernstein, J. Facts and Fictions about Polymorphism. Chem. Soc. Rev. 2015, 44, 8619− 8635. (323) Cruz-Cabeza, A. J.; Bernstein, J. Conformational Polymorphism. Chem. Rev. 2014, 114, 2170−2191. (324) Elguero, J. Polymorphism and Desmotropy in Heterocyclic Crystal Structures. Cryst. Growth Des. 2011, 11, 4731−4738. (325) Caira, M. R. Polymorphs of Molecular Crystals. In Comprehensive Supramolecular Chemistry II; Atwood, J. L., Gokel, G. W., Barbour, L., Eds.; Elsevier Inc.: Amsterdam, 2017; Vol. 7, pp 127−160. (326) Galek, P. T. A.; Fábián, L.; Allen, F. H. Persistent Hydrogen Bonding in Polymorphic Crystal Structures. Acta Crystallogr., Sect. B: Struct. Sci. 2009, 65, 68−85. (327) Kersten, K.; Kaur, R.; Matzger, A. Survey and Analysis of Crystal Polymorphism in Organic Structures. IUCrJ 2018, 5, 124− 129. (328) Taylor, C. R.; Day, G. M. Evaluating the Energetic Driving Force for Cocrystal Formation. Cryst. Growth Des. 2018, 18, 892− 904. (329) Gavezzotti, A.; Colombo, V.; Lo Presti, L. Facts and Factors in the Formation and Stability of Binary Crystals. Cryst. Growth Des. 2016, 16, 6095−6104. (330) Colombo, V.; Lo Presti, L.; Gavezzotti, A. Two-Component Organic Crystals without Hydrogen Bonding: Structure and Intermolecular Interactions in Bimolecular Stacking. CrystEngComm 2017, 19, 2413−2423. (331) Kelley, S. P.; Fábián, L.; Brock, C. P. Failures of Fractional Crystallization: Ordered Co-Crystals of Isomers and near Isomers. Acta Crystallogr., Sect. B: Struct. Sci. 2011, 67, 79−93. (332) Clarke, H. D.; Arora, K. K.; Bass, H.; Kavuru, P.; Ong, T. T.; Pujari, T.; Wojtas, L.; Zaworotko, M. J. 
Structure-Stability Relationships in Cocrystal Hydrates: Does the Promiscuity of Water Make Crystalline Hydrates the Nemesis of Crystal Engineering? Cryst. Growth Des. 2010, 10, 2152−2167. (333) Bajpai, A.; Scott, H. S.; Pham, T.; Chen, K. J.; Space, B.; Lusi, M.; Perry, M. L.; Zaworotko, M. J. Towards an Understanding of the Propensity for Crystalline Hydrate Formation by Molecular Compounds. IUCrJ 2016, 3, 430−439. (334) Siddiqui, K. A. Structural Diversity of Metal-Organic Hydrates: A Crystallographic Structural Database Study. J. Struct. Chem. 2018, 59, 106−113. (335) Mohamed, S.; Li, L. From Serendipity to Supramolecular Design: Assessing the Utility of Computed Crystal Form Landscapes in Inferring the Risks of Crystal Hydration in Carboxylic Acids. CrystEngComm 2018, 20, 6026−6039. (336) Skyner, R. E.; Mitchell, J. B. O.; Groom, C. R. Probing the Average Distribution of Water in Organic Hydrate Crystal Structures with Radial Distribution Functions (RDFs). CrystEngComm 2017, 19, 641−652. (337) Healy, A. M.; Worku, Z. A.; Kumar, D.; Madi, A. M. Pharmaceutical Solvates, Hydrates and Amorphous Forms: A Special Emphasis on Cocrystals. Adv. Drug Delivery Rev. 2017, 117, 25−46. (338) Infantes, L.; Fábián, L.; Motherwell, W. D. S. Organic Crystal Hydrates: What Are the Important Factors for Formation. CrystEngComm 2007, 9, 65−71. (339) Bērziņš, A.; Zvaniņa, D.; Trimdale, A. Detailed Analysis of Packing Efficiency Allows Rationalization of Solvate Formation Propensity for Selected Structurally Similar Organic Molecules. Cryst. Growth Des. 2018, 18, 2040−2045. (340) Spiteri, L.; Baisch, U.; Vella-Zarb, L. Correlations and Statistical Analysis of Solvent Molecule Hydrogen Bonding - a Case Study of Dimethyl Sulfoxide (DMSO). CrystEngComm 2018, 20, 1291−1303.

(341) Brychczynska, M.; Davey, R. J.; Pidcock, E. A Study of Dimethylsulfoxide Solvates Using the Cambridge Structural Database (CSD). CrystEngComm 2012, 14, 1479−1484. (342) Brychczynska, M.; Davey, R. J.; Pidcock, E. A Study of Methanol Solvates Using the Cambridge Structural Database. New J. Chem. 2008, 32, 1754−1760. (343) Allen, F. H.; Wood, P. A.; Galek, P. T. A. Role of Chloroform and Dichloromethane Solvent Molecules in Crystal Packing: An Interaction Propensity Study. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2013, 69, 379−388. (344) Takieddin, K.; Khimyak, Y. Z.; Fábián, L. Prediction of Hydrate and Solvate Formation Using Statistical Models. Cryst. Growth Des. 2016, 16, 70−81. (345) Xin, D.; Gonnella, N. C.; He, X.; Horspool, K. Solvate Prediction for Pharmaceutical Organic Molecules with Machine Learning. Cryst. Growth Des. 2019, 19, 1903−1911. (346) Böhm, H.-J. The Computer Program LUDI: A New Method for the de Novo Design of Enzyme Inhibitors. J. Comput.-Aided Mol. Des. 1992, 6, 61−78. (347) Kuntz, I. D. Structure-Based Strategies for Drug Design and Discovery. Science 1992, 257, 1078−1082. (348) Diller, D. J.; Merz, K. M., Jr. Can We Separate Active from Inactive Conformations? J. Comput.-Aided Mol. Des. 2002, 16, 105− 112. (349) Chen, I. J.; Foloppe, N. Drug-like Bioactive Structures and Conformational Coverage with the LigPrep/ConfGen Suite: Comparison to Programs MOE and Catalyst. J. Chem. Inf. Model. 2010, 50, 822−839. (350) Brameld, K. A.; Kuhn, B.; Reuter, D. C.; Stahl, M. Small Molecule Conformational Preferences Derived from Crystal Structure Data. A Medicinal Chemistry Focused Analysis. J. Chem. Inf. Model. 2008, 48, 1−24. (351) Hawkins, P. C. D.; Nicholls, A. Conformer Generation with OMEGA: Learning from the Data Set and the Analysis of Failures. J. Chem. Inf. Model. 2012, 52, 2919−2936. (352) Liebeschuetz, J.; Hennemann, J.; Olsson, T.; Groom, C. R. 
The Good, the Bad and the Twisted: A Survey of Ligand Geometry in Protein Crystal Structures. J. Comput.-Aided Mol. Des. 2012, 26, 169−183. (353) Groom, C. R.; Cole, J. C. The Use of Small Molecule Structures to Complement Protein-Ligand Crystal Structures in Drug Discovery. Acta Crystallogr. Sect. D Struct. Biol. 2017, 73, 240−245. (354) Zheng, Y.; Tice, C. M.; Singh, S. B. Conformational Control in Structure-Based Drug Design. Bioorg. Med. Chem. Lett. 2017, 27, 2825−2837. (355) Water−octanol partition coefficient for compounds that might be ionizable. (356) Ito, M.; Tanaka, T.; Toita, A.; Uchiyama, N.; Kokubo, H.; Morishita, N.; Klein, M. G.; Zou, H.; Murakami, M.; Kondo, M.; et al. Discovery of 3-Benzyl-1-(trans-4-((5-Cyanopyridin-2-yl)Amino)Cyclohexyl)-1-Arylurea Derivatives as Novel and Selective Cyclin-Dependent Kinase 12 (CDK12) Inhibitors. J. Med. Chem. 2018, 61, 7710−7728. (357) Wuitschik, G.; Carreira, E. M.; Wagner, B.; Fischer, H.; Parrilla, I.; Schuler, F.; Rogers-Evans, M.; Müller, K. Oxetanes in Drug Discovery: Structural and Synthetic Insights. J. Med. Chem. 2010, 53, 3227−3246. (358) Kuhn, B.; Guba, W.; Hert, J.; Banner, D.; Bissantz, C.; Ceccarelli, S.; Haap, W.; Körner, M.; Kuglstatter, A.; Lerner, C.; et al. A Real-World Perspective on Molecular Design. J. Med. Chem. 2016, 59, 4087−4102. (359) Crowley, P. J.; Berry, E. A.; Cromartie, T.; Daldal, F.; Godfrey, C. R. A.; Lee, D. W.; Phillips, J. E.; Taylor, A.; Viner, R. The Role of Molecular Modeling in the Design of Analogues of the Fungicidal Natural Products Crocacins A and D. Bioorg. Med. Chem. 2008, 16, 10345−10355. (360) Tatum, N. J.; Liebeschuetz, J. W.; Cole, J. C.; Frita, R.; Herledan, A.; Baulard, A. R.; Willand, N.; Pohl, E. New Active Leads
for Tuberculosis Booster Drugs by Structure-Based Drug Discovery. Org. Biomol. Chem. 2017, 15, 10245−10255. (361) Furet, P.; Caravatti, G.; Guagnano, V.; Lang, M.; Meyer, T.; Schoepfer, J. Entry into a New Class of Protein Kinase Inhibitors by Pseudo Ring Design. Bioorg. Med. Chem. Lett. 2008, 18, 897−900. (362) Loeffler, J. R.; Ehmki, E. S. R.; Fuchs, J. E.; Liedl, K. R. Kinetic Barriers in the Isomerization of Substituted Ureas: Implications for Computer-Aided Drug Design. J. Comput.-Aided Mol. Des. 2016, 30, 391−400. (363) Halgren, T. A. Merck Molecular Force Field. V. Extension of MMFF94 Using Experimental Data, Additional Computational Data, and Empirical Rules. J. Comput. Chem. 1996, 17, 616−641. (364) Lupyan, D.; Abramov, Y. A.; Sherman, W. Close Intramolecular Sulfur-Oxygen Contacts: Modified Force Field Parameters for Improved Conformation Generation. J. Comput.-Aided Mol. Des. 2012, 26, 1195−1205. (365) Sun, H.; Jin, Z.; Yang, C.; Akkermans, R. L. C.; Robertson, S. H.; Spenley, N. A.; Miller, S.; Todd, S. M. COMPASS II: Extended Coverage for Polymer and Drug-like Molecule Databases. J. Mol. Model. 2016, 22, 47. (366) Vermaas, J. V.; Petridis, L.; Ralph, J.; Crowley, M. F.; Beckham, G. T. Systematic Parameterization of Lignin for the CHARMM Force Field. Green Chem. 2019, 21, 109−122. (367) Sellers, B. D.; James, N. C.; Gobbi, A. A Comparison of Quantum and Molecular Mechanical Methods to Estimate Strain Energy in Druglike Fragments. J. Chem. Inf. Model. 2017, 57, 1265− 1275. (368) Griewel, A.; Kayser, O.; Schlosser, J.; Rarey, M. Conformational Sampling for Large-Scale Virtual Screening: Accuracy versus Ensemble Size. J. Chem. Inf. Model. 2009, 49, 2303−2311. (369) Lagorce, D.; Pencheva, T.; Villoutreix, B. O.; Miteva, M. A. DG-AMMOS: A New Tool to Generate 3D Conformation of Small Molecules Using Distance Geometry and Automated Molecular Mechanics Optimization for in Silico Screening. BMC Chem. Biol. 2009, 9, 6. (370) Hawkins, P. C. D.; Skillman, A. 
G.; Warren, G. L.; Ellingson, B. A.; Stahl, M. T. Conformer Generation with OMEGA: Algorithm and Validation Using High Quality Structures from the Protein Databank and Cambridge Structural Database. J. Chem. Inf. Model. 2010, 50, 572−584. (371) Riniker, S.; Landrum, G. A. Better Informed Distance Geometry: Using What We Know to Improve Conformation Generation. J. Chem. Inf. Model. 2015, 55, 2562−2574. (372) Schärfer, C.; Schulz-Gasch, T.; Hert, J.; Heinzerling, L.; Schulz, B.; Inhester, T.; Stahl, M.; Rarey, M. CONFECT: Conformations from an Expert Collection of Torsion Patterns. ChemMedChem 2013, 8, 1690−1700. (373) Habgood, M. Bioactive Focus in Conformational Ensembles: A Pluralistic Approach. J. Comput.-Aided Mol. Des. 2017, 31, 1073−1083. (374) Gardiner, E. J.; Cosgrove, D. A.; Taylor, R.; Gillet, V. J. Multiobjective Optimization of Pharmacophore Hypotheses: Bias toward Low-Energy Conformations. J. Chem. Inf. Model. 2009, 49, 2761−2773. (375) Murphy, R. B.; Repasky, M. P.; Greenwood, J. R.; Tubert-Brohman, I.; Jerome, S.; Annabhimoju, R.; Boyles, N. A.; Schmitz, C. D.; Abel, R.; Farid, R.; et al. WScore: A Flexible and Accurate Treatment of Explicit Water Molecules in Ligand-Receptor Docking. J. Med. Chem. 2016, 59, 4364−4384. (376) Bruno, I. J.; Cole, J. C.; Lommerse, J. P. M.; Rowland, R. S.; Taylor, R.; Verdonk, M. L. IsoStar: A Library of Information about Nonbonded Interactions. J. Comput.-Aided Mol. Des. 1997, 11, 525−537. (377) Mouscadet, J. F.; Arora, R.; André, J.; Lambry, J. C.; Delelis, O.; Malet, I.; Marcelin, A. G.; Calvez, V.; Tchertanov, L. HIV-1 IN Alternative Molecular Recognition of DNA Induced by Raltegravir Resistance Mutations. J. Mol. Recognit. 2009, 22, 480−494.

(378) Verdonk, M. L.; Cole, J. C.; Taylor, R. SuperStar: A Knowledge-Based Approach for Identifying Interaction Sites in Proteins. J. Mol. Biol. 1999, 289, 1093−1108. (379) Nissink, J. W. M.; Taylor, R. Combined Use of Physicochemical Data and Small-Molecule Crystallographic Contact Propensities to Predict Interactions in Protein Binding Sites. Org. Biomol. Chem. 2004, 2, 3238−3249. (380) Ruf, S.; Buning, C.; Schreuder, H.; Horstick, G.; Linz, W.; Olpp, T.; Pernerstorfer, J.; Hiss, K.; Kroll, K.; Kannt, A.; et al. Novel β-Amino Acid Derivatives as Inhibitors of Cathepsin A. J. Med. Chem. 2012, 55, 7636−7649. (381) Schmidt, M. F.; Korb, O.; Howard, N. I.; Dias, M. V. B.; Blundell, T. L.; Abell, C. Discovery of Schaeffer’s Acid Analogues as Lead Structures of Mycobacterium Tuberculosis Type II Dehydroquinase Using a Rational Drug Design Approach. ChemMedChem 2013, 8, 54−58. (382) Ismail, M. A. H.; Abou El Ella, D. A.; Abouzid, K. A. M.; Mahmoud, A. H. Integrated Structure-Based Activity Prediction Model of Benzothiadiazines on Various Genotypes of HCV NS5b Polymerase (1a, 1b and 4) and Its Application in the Discovery of New Derivatives. Bioorg. Med. Chem. 2012, 20, 2455−2478. (383) Radoux, C. J.; Olsson, T. S. G.; Pitt, W. R.; Groom, C. R.; Blundell, T. L. Identifying Interactions That Determine Fragment Binding at Protein Hotspots. J. Med. Chem. 2016, 59, 4314−4325. (384) Roca, C.; Requena, C.; Sebastián-Pérez, V.; Malhotra, S.; Radoux, C.; Pérez, C.; Martinez, A.; Páez, J. A.; Blundell, T. L.; Campillo, N. E. Identification of New Allosteric Sites and Modulators of AChE through Computational and Experimental Tools. J. Enzyme Inhib. Med. Chem. 2018, 33, 1034−1047. (385) Brink, A.; Helliwell, J. R. New Leads for Fragment-Based Design of Rhenium/Technetium Radiopharmaceutical Agents. IUCrJ 2017, 4, 283−290. (386) Rossato, G.; Ernst, B.; Vedani, A.; Smieško, M. AcquaAlta: A Directional Approach to the Solvation of Ligand-Protein Complexes. J. Chem. Inf. Model.
2011, 51, 1867−1881. (387) Scott, J. S.; Birch, A. M.; Brocklehurst, K. J.; Broo, A.; Brown, H. S.; Butlin, R. J.; Clarke, D. S.; Davidsson, Ö.; Ertan, A.; Goldberg, K.; et al. Use of Small-Molecule Crystal Structures to Address Solubility in a Novel Series of G Protein Coupled Receptor 119 Agonists: Optimization of a Lead and in Vivo Evaluation. J. Med. Chem. 2012, 55, 5361−5379. (388) Boström, J.; Hogner, A.; Llinàs, A.; Wellner, E.; Plowright, A. T. Oxadiazoles in Medicinal Chemistry. J. Med. Chem. 2012, 55, 1817−1830. (389) Bender, A.; Jenkins, J. L.; Scheiber, J.; Sukuru, S. C. K.; Glick, M.; Davies, J. W. How Similar Are Similarity Searching Methods? A Principal Component Analysis of Molecular Descriptor Space. J. Chem. Inf. Model. 2009, 49, 108−119. (390) Fry, D.; Huang, K. S.; DiLello, P.; Mohr, P.; Müller, K.; So, S. S.; Harada, T.; Stahl, M.; Vu, B.; Mauser, H. Design of Libraries Targeting Protein-Protein Interfaces. ChemMedChem 2013, 8, 726−732. (391) Talamas, F. X.; Ao-Ieong, G.; Brameld, K. A.; Chin, E.; de Vicente, J.; Dunn, J. P.; Ghate, M.; Giannetti, A. M.; Harris, S. F.; Labadie, S. S.; et al. De Novo Fragment Design: A Medicinal Chemistry Approach to Fragment-Based Lead Generation. J. Med. Chem. 2013, 56, 3115−3119. (392) Hebeisen, P.; Haap, W.; Kuhn, B.; Mohr, P.; Wessel, H. P.; Zutter, U.; Kirchner, S.; Ruf, A.; Benz, J.; Joseph, C.; et al. Orally Active Aminopyridines as Inhibitors of Tetrameric Fructose-1,6-Bisphosphatase. Bioorg. Med. Chem. Lett. 2011, 21, 3237−3242. (393) Sun, H.; Tawa, G.; Wallqvist, A. Classification of Scaffold-Hopping Approaches. Drug Discovery Today 2012, 17, 310−324. (394) Groom, C. R.; Olsson, T. S. G.; Liebeschuetz, J. W.; Bardwell, D. A.; Bruno, I. J.; Allen, F. H. Mining the Cambridge Structural Database for Bioisosteres. In Bioisosteres in Medicinal Chemistry; Brown, N., Ed.; Wiley-VCH, 2012; pp 75−101. (395) Peffer, S.
Fragments and conformations from the CCDC’s Cambridge Structural Database accessible through Cresset’s Spark.
https://www.cresset-group.com/2015/07/fragments-and-conformations-from-the-ccdcs-cambridge-structural-database-accessible-through-cressets-spark/ (accessed Jan 22, 2019). (396) Chemical Computing Group. MOE Molecular Operating Environment: Structure-Based Design. https://www.chemcomp.com/MOE-Structure_Based_Design.htm (accessed Jan 22, 2019). (397) Maass, P.; Schulz-Gasch, T.; Stahl, M.; Rarey, M. Recore: A Fast and Versatile Method for Scaffold Hopping Based on Small Molecule Crystal Structure Conformations. J. Chem. Inf. Model. 2007, 47, 390−399. (398) Grygorenko, O. O.; Babenko, P.; Volochnyuk, D. M.; Raievskyi, O.; Komarov, I. V. Following Ramachandran: Exit Vector Plots (EVP) as a Tool to Navigate Chemical Space Covered by 3D Bifunctional Scaffolds. The Case of Cycloalkanes. RSC Adv. 2016, 6, 17595−17605. (399) Grygorenko, O. O.; Demenko, D.; Volochnyuk, D. M.; Komarov, I. V. Following Ramachandran 2: Exit Vector Plot (EVP) Analysis of Disubstituted Saturated Rings. New J. Chem. 2018, 42, 8355−8365. (400) Korb, O.; Kuhn, B.; Hert, J.; Taylor, N.; Cole, J.; Groom, C.; Stahl, M. Interactive and Versatile Navigation of Structural Databases. J. Med. Chem. 2016, 59, 4257−4266. (401) Awale, M.; Jin, X.; Reymond, J. L. Stereoselective Virtual Screening of the ZINC Database Using Atom Pair 3D-Fingerprints. J. Cheminf. 2015, 7, 3. (402) Spackman, P. R.; Thomas, S. P.; Jayatilaka, D. High Throughput Profiling of Molecular Shapes in Crystals. Sci. Rep. 2016, 6, 22204. (403) Huang, L.-F.; Tong, W.-Q. T. Impact of Solid State Properties on Developability Assessment of Drug Candidates. Adv. Drug Delivery Rev. 2004, 56, 321−334. (404) Chemburkar, S. R.; Bauer, J.; Deming, K.; Spiwek, H.; Patel, K.; Morris, J.; Henry, R.; Spanton, S.; Dziki, W.; Porter, W.; et al. Dealing with the Impact of Ritonavir Polymorphs on the Late Stages of Bulk Drug Process Development. Org. Process Res. Dev. 2000, 4, 413−417. (405) Ticehurst, M.; Docherty, R.
From Molecules to Pharmaceutical Products − The Drug Substance/Drug Product Interface. Am. Pharm. Rev. 2006, 9, 32−36. (406) Feeder, N.; Pidcock, E.; Reilly, A. M.; Sadiq, G.; Doherty, C. L.; Back, K. R.; Meenan, P.; Docherty, R. The Integration of Solid-Form Informatics into Solid-Form Selection. J. Pharm. Pharmacol. 2015, 67, 857−868. (407) Pfizer backs UK pharma materials institute. https://www.thepharmaletter.com/article/pfizer-backs-uk-pharma-materials-institute (accessed Jan 22, 2019). (408) Macrae, C. F.; Bruno, I. J.; Chisholm, J. A.; Edgington, P. R.; McCabe, P.; Pidcock, E.; Rodriguez-Monge, L.; Taylor, R.; van de Streek, J.; Wood, P. A. Mercury CSD 2.0 − New Features for the Visualization and Investigation of Crystal Structures. J. Appl. Crystallogr. 2008, 41, 466−470. (409) Chisholm, J. A.; Motherwell, S. A New Algorithm for Performing Three-Dimensional Searches of the Cambridge Structural Database. J. Appl. Crystallogr. 2004, 37, 331−334. (410) Chisholm, J. A.; Motherwell, S. COMPACK: A Program for Identifying Crystal Structure Similarity Using Distances. J. Appl. Crystallogr. 2005, 38, 228−231. (411) Haynes, D. A.; Chisholm, J. A.; Jones, W.; Motherwell, W. D. S. Supramolecular Synthon Competition in Organic Sulfonates: A CSD Survey. CrystEngComm 2004, 6, 584−588. (412) Fábián, L.; Chisholm, J. A.; Galek, P. T. A.; Motherwell, W. D. S.; Feeder, N. Hydrogen-Bond Motifs in the Crystals of Hydrophobic Amino Acids. Acta Crystallogr., Sect. B: Struct. Sci. 2008, 64, 504−514. (413) Pogoda, D.; Janczak, J.; Videnova-Adrabinska, V. New Polymorphs of an Old Drug: Conformational and Synthon Polymorphism of 5-Nitrofurazone. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2016, 72, 263−273. (414) The ensemble of possible structures found in a crystal structure prediction exercise.

(415) Johnston, A.; Bardin, J.; Johnston, B. F.; Fernandes, P.; Kennedy, A. R.; Price, S. L.; Florence, A. J. Experimental and Predicted Crystal Energy Landscapes of Chlorothiazide. Cryst. Growth Des. 2011, 11, 405−413. (416) Maloney, A. G. P.; Wood, P. A.; Parsons, S. Competition between Hydrogen Bonding and Dispersion Interactions in the Crystal Structures of the Primary Amines. CrystEngComm 2014, 16, 3867−3882. (417) Day, G. M.; Motherwell, W. D. S.; Ammon, H. L.; Boerrigter, S. X. M.; Della Valle, R. G.; Venuti, E.; Dzyabchenko, A.; Dunitz, J. D.; Schweizer, B.; van Eijck, B. P.; et al. A Third Blind Test of Crystal Structure Prediction. Acta Crystallogr., Sect. B: Struct. Sci. 2005, 61, 511−527. (418) Childs, S. L.; Wood, P. A.; Rodriguez-Hornedo, N.; Reddy, L. S.; Hardcastle, K. I. Analysis of 50 Crystal Structures Containing Carbamazepine Using the Materials Module of Mercury CSD. Cryst. Growth Des. 2009, 9, 1869−1888. (419) Kennedy, A. R.; Morrison, C. A.; Briggs, N. E. B.; Arbuckle, W. Density and Stability Differences Between Enantiopure and Racemic Salts: Construction and Structural Analysis of a Systematic Series of Crystalline Salt Forms of Methylephedrine. Cryst. Growth Des. 2011, 11, 1821−1834. (420) Briggs, N. E. B.; Kennedy, A. R.; Morrison, C. A. 42 Salt Forms of Tyramine: Structural Comparison and the Occurrence of Hydrate Formation. Acta Crystallogr., Sect. B: Struct. Sci. 2012, 68, 453−464. (421) Tumanov, N. A.; Myz, S. A.; Shakhtshneider, T. P.; Boldyreva, E. V. Are Meloxicam Dimers Really the Structure-Forming Units in the ‘Meloxicam−Carboxylic Acid’ Co-Crystals Family? Relation between Crystal Structures and Dissolution Behaviour. CrystEngComm 2012, 14, 305−313. (422) Sládková, V.; Skalická, T.; Skořepová, E.; Čejka, J.; Eigner, V.; Kratochvíl, B. Systematic Solvate Screening of Trospium Chloride: Discovering Hydrates of a Long-Established Pharmaceutical. CrystEngComm 2015, 17, 4712−4721.
(423) Wanat, M.; Malinska, M.; Kutner, A.; Wozniak, K. Effect of Vitamin D Conformation on Interactions and Packing in the Crystal Lattice. Cryst. Growth Des. 2018, 18, 3385−3396. (424) Wood, P. A.; Oliveira, M. A.; Zink, A.; Hickey, M. B. Isostructurality in Pharmaceutical Salts: How Often and How Similar? CrystEngComm 2012, 14, 2413−2421. (425) The Crystal Form Consortium - The Cambridge Crystallographic Data Centre (CCDC). https://www.ccdc.cam.ac.uk/Community/crystalformconsortium/ (accessed Jan 22, 2019). (426) Rietveld, I. B.; Céolin, R. Rotigotine: Unexpected Polymorphism with Predictable Overall Monotropic Behavior. J. Pharm. Sci. 2015, 104, 4117−4122. (427) Wood, P. A.; Olsson, T. S. G.; Cole, J. C.; Cottrell, S. J.; Feeder, N.; Galek, P. T. A.; Groom, C. R.; Pidcock, E. Evaluation of Molecular Crystal Structures Using Full Interaction Maps. CrystEngComm 2013, 15, 65−72. (428) Galek, P. T. A.; Allen, F. H.; Fábián, L.; Feeder, N. Knowledge-Based H-Bond Prediction to Aid Experimental Polymorph Screening. CrystEngComm 2009, 11, 2634−2639. (429) Galek, P. T. A.; Fábián, L.; Allen, F. H. Truly Prospective Prediction: Inter- and Intramolecular Hydrogen Bonding. CrystEngComm 2010, 12, 2091−2099. (430) Galek, P. T. A.; Pidcock, E.; Wood, P. A.; Bruno, I. J.; Groom, C. R. One in Half a Million: A Solid Form Informatics Study of a Pharmaceutical Crystal Structure. CrystEngComm 2012, 14, 2391−2403. (431) Abramov, Y. A. Current Computational Approaches to Support Pharmaceutical Solid Form Selection. Org. Process Res. Dev. 2013, 17, 472−485. (432) Ismail, S. Z.; Anderton, C. L.; Copley, R. C. B.; Price, L. S.; Price, S. L. Evaluating a Crystal Energy Landscape in the Context of Industrial Polymorph Screening. Cryst. Growth Des. 2013, 13, 2396−2406.

(433) Price, L. S.; McMahon, J. A.; Lingireddy, S. R.; Lau, S.-F.; Diseroad, B. A.; Price, S. L.; Reutzel-Edens, S. M. A Molecular Picture of the Problems in Ensuring Structural Purity of Tazofelone. J. Mol. Struct. 2014, 1078, 26−42. (434) Nauha, E.; Bernstein, J. “Predicting” Crystal Forms of Pharmaceuticals Using Hydrogen Bond Propensities: Two Test Cases. Cryst. Growth Des. 2014, 14, 4364−4370. (435) Nauha, E.; Bernstein, J. “Predicting” Polymorphs of Pharmaceuticals Using Hydrogen Bond Propensities: Probenecid and Its Two Single-Crystal-to-Single-Crystal Phase Transitions. J. Pharm. Sci. 2015, 104, 2056−2061. (436) Etter, M. C. Encoding and Decoding Hydrogen-Bond Patterns of Organic Compounds. Acc. Chem. Res. 1990, 23, 120−126. (437) Aakeröy, C. B.; Salmon, D. J. Building Co-Crystals with Molecular Sense and Supramolecular Sensibility. CrystEngComm 2005, 7, 439−448. (438) Almarsson, Ö.; Zaworotko, M. J. Crystal Engineering of the Composition of Pharmaceutical Phases. Do Pharmaceutical Co-Crystals Represent a New Path to Improved Medicines? Chem. Commun. 2004, 1889−1896. (439) Fábián, L. Cambridge Structural Database Analysis of Molecular Complementarity in Cocrystals. Cryst. Growth Des. 2009, 9, 1436−1443. (440) Karki, S.; Friščić, T.; Fábián, L.; Jones, W. New Solid Forms of Artemisinin Obtained through Cocrystallisation. CrystEngComm 2010, 12, 4038−4041. (441) Pallipurath, A. R.; Civati, F.; Eziashi, M.; Omar, E.; McArdle, P.; Erxleben, A. Tailoring Cocrystal and Salt Formation and Controlling the Crystal Habit of Diflunisal. Cryst. Growth Des. 2016, 16, 6468−6478. (442) Karki, S.; Friščić, T.; Fábián, L.; Laity, P. R.; Day, G. M.; Jones, W. Improving Mechanical Properties of Crystalline Solids by Cocrystal Formation: New Compressible Forms of Paracetamol. Adv. Mater. 2009, 21, 3905−3909. (443) Wood, P. A.; Feeder, N.; Furlow, M.; Galek, P. T. A.; Groom, C. R.; Pidcock, E. Knowledge-Based Approaches to Co-Crystal Design. CrystEngComm 2014, 16, 5839−5848.
(444) Oswald, I. D. H.; Motherwell, W. D. S.; Parsons, S.; Pidcock, E.; Pulham, C. R. Rationalisation of Co-Crystal Formation through Knowledge-Mining. Crystallogr. Rev. 2004, 10, 57−66. (445) Bruno, I. J.; Cole, J. C.; Edgington, P. R.; Kessler, M.; Macrae, C. F.; McCabe, P.; Pearson, J.; Taylor, R. New Software for Searching the Cambridge Structural Database and Visualising Crystal Structures. Acta Crystallogr., Sect. B: Struct. Sci. 2002, B58, 389−397. (446) Wang, J.-R.; Ye, C.; Mei, X. Structural and Physicochemical Aspects of Hydrochlorothiazide Co-Crystals. CrystEngComm 2014, 16, 6996−7003. (447) Mapp, L. K.; Coles, S. J.; Aitipamula, S. Novel Solid Forms of Lonidamine: Crystal Structures and Physicochemical Properties. CrystEngComm 2017, 19, 2925−2935. (448) Mapp, L. K.; Coles, S. J.; Aitipamula, S. Design of Cocrystals for Molecules with Limited Hydrogen Bonding Functionalities: Propyphenazone as a Model System. Cryst. Growth Des. 2017, 17, 163−174. (449) Delori, A.; Galek, P. T. A.; Pidcock, E.; Jones, W. Quantifying Homo- and Heteromolecular Hydrogen Bonds as a Guide for Adduct Formation. Chem. - Eur. J. 2012, 18, 6835−6846. (450) Delori, A.; Galek, P. T. A.; Pidcock, E.; Patni, M.; Jones, W. Knowledge-Based Hydrogen Bond Prediction and the Synthesis of Salts and Cocrystals of the Anti-Malarial Drug Pyrimethamine with Various Drug and GRAS Molecules. CrystEngComm 2013, 15, 2916− 2928. (451) Sandhu, B.; McLean, A.; Sinha, A. S.; Desper, J.; Sarjeant, A. A.; Vyas, S.; Reutzel-Edens, S. M.; Aakeröy, C. B. Evaluating Competing Intermolecular Interactions through Molecular Electrostatic Potentials and Hydrogen-Bond Propensities. Cryst. Growth Des. 2018, 18, 466−478.

(452) Delori, A.; Suresh, E.; Pedireddi, V. R. Influence of Molecular Shape on the Design and Synthesis of Supramolecular Assemblies. CrystEngComm 2013, 15, 4811−4815. (453) Sarma, B.; Saikia, B. Hydrogen Bond Synthon Competition in the Stabilization of Theophylline Cocrystals. CrystEngComm 2014, 16, 4753−4765. (454) Manin, A. N.; Drozd, K. V.; Churakov, A. V.; Perlovich, G. L. Hydrogen Bond Donor/Acceptor Ratios of the Coformers: Do They Really Matter for the Prediction of Molecular Packing in Cocrystals? The Case of Benzamide Derivatives with Dicarboxylic Acids. Cryst. Growth Des. 2018, 18, 5254−5269. (455) Aitipamula, S.; Chow, P. S.; Tan, R. B. H. Polymorphs and Solvates of a Cocrystal Involving an Analgesic Drug, Ethenzamide, and 3,5-Dinitrobenzoic Acid. Cryst. Growth Des. 2010, 10, 2229− 2238. (456) Mnguni, M. J.; Michael, J. P.; Lemmerer, A. Binary Polymorphic Cocrystals: An Update on the Available Literature in the Cambridge Structural Database, Including a New Polymorph of the Pharmaceutical 1:1 Cocrystal Theophylline−3, 4-Dihydroxybenzoic Acid. Acta Crystallogr., Sect. C: Struct. Chem. 2018, 74, 715−720. (457) Gonnade, R. G.; Sangtani, E. Polymorphs and Cocrystals: A Comparative Analysis. J. Indian Inst. Sci. 2017, 97, 193−226. (458) Donnay, J. D. H.; Harker, D. A New Law of Crystal Morphology Extending the Law of Bravias. Am. Mineral. 1937, 22, 446−467. (459) Mugheirbi, N. A.; Tajber, L. Crystal Habits of Itraconazole Microcrystals: Unusual Isomorphic Intergrowths Induced via Tuning Recrystallization Conditions. Mol. Pharmaceutics 2015, 12, 3468− 3478. (460) Serrano, D. R.; Mugheirbi, N. A.; O’Connell, P.; Leddy, N.; Healy, A. M.; Tajber, L. Impact of Substrate Properties on the Formation of Spherulitic Films: A Case Study of Salbutamol Sulfate. Cryst. Growth Des. 2016, 16, 3853−3858. (461) Rosbottom, I.; Ma, C. Y.; Turner, T. D.; O’Connell, R. A.; Loughrey, J.; Sadiq, G.; Davey, R. J.; Roberts, K. J. 
Influence of Solvent Composition on the Crystal Morphology and Structure of PAminobenzoic Acid Crystallized from Mixed Ethanol and Nitromethane Solutions. Cryst. Growth Des. 2017, 17, 4151−4161. (462) Cambridge Crystallographic Data Centre. CSD Python API. https://www.ccdc.cam.ac.uk/solutions/csd-system/components/csdpython-api/ (accessed Jan 22, 2019). (463) Bryant, M. J.; Maloney, A. G. P.; Sykes, R. A. Predicting Mechanical Properties of Crystalline Materials through Topological Analysis. CrystEngComm 2018, 20, 2698−2704. (464) Rama Krishna, G.; Ukrainczyk, M.; Zeglinski, J.; Rasmuson, Å. C. Prediction of Solid State Properties of Cocrystals Using Artificial Neural Network Modeling. Cryst. Growth Des. 2018, 18, 133−144. (465) Docherty, R.; Pencheva, K.; Abramov, Y. A. Low Solubility in Drug Development: De-Convoluting the Relative Importance of Solvation and Crystal Packing. J. Pharm. Pharmacol. 2015, 67, 847− 856. (466) Marchese Robinson, R. L.; Roberts, K. J.; Martin, E. B. The Influence of Solid State Information and Descriptor Selection on Statistical Models of Temperature Dependent Aqueous Solubility. J. Cheminf. 2018, 10, 44. (467) Millar, D. I. A.; Marshall, W. G.; Oswald, I. D. H.; Pulham, C. R. High-Pressure Structural Studies of Energetic Materials. Crystallogr. Rev. 2010, 16, 115−132. (468) Aakeröy, C. B.; Wijethunga, T. K.; Desper, J. Crystal Engineering of Energetic Materials: Co-Crystals of Ethylenedinitramine (EDNA) with Modified Performance and Improved Chemical Stability. Chem. - Eur. J. 2015, 21, 11029−11037. (469) Zhang, J.; Zhang, Q.; Vo, T. T.; Parrish, D. A.; Shreeve, J. M. Energetic Salts with π-Stacking and Hydrogen-Bonding Interactions Lead the Way to Future Energetic Materials. J. Am. Chem. Soc. 2015, 137, 1697−1704. (470) Yeager, J. D.; Higginbotham Duque, A. L.; Shorty, M.; Bowden, P. R.; Stull, J. A. Development of Inert Density Mock Materials for HMX. J. Energ. Mater. 2018, 36, 253−265. AV

DOI: 10.1021/acs.chemrev.9b00155 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

(471) Landenberger, K. B.; Bolton, O.; Matzger, A. J. Energetic− Energetic Cocrystals of Diacetone Diperoxide (DADP): Dramatic and Divergent Sensitivity Modifications via Cocrystallization. J. Am. Chem. Soc. 2015, 137, 5074−5079. (472) Zhang, J.; Shreeve, J. M. Time for Pairing: Cocrystals as Advanced Energetic Materials. CrystEngComm 2016, 18, 6124−6133. (473) Sodkhomkhum, R.; Masik, M.; Watchasit, S.; Suksai, C.; Boonmak, J.; Youngme, S.; Wanichacheva, N.; Ervithayasuporn, V. Imidazolylmethylpyrene Sensor for Dual Optical Detection of Explosive Chemical: 2,4,6-Trinitrophenol. Sens. Actuators, B 2017, 245, 665−673. (474) Low, K. S.; Cole, J. M.; Zhou, X.; Yufa, N. Rationalizing the Molecular Origins of Ru- and Fe-Based Dyes for Dye-Sensitized Solar Cells. Acta Crystallogr., Sect. B: Struct. Sci. 2012, 68, 137−149. (475) Cole, J. M.; Low, K. S.; Ozoe, H.; Stathi, P.; Kitamura, C.; Kurata, H.; Rudolf, P.; Kawase, T. Data Mining with Molecular Design Rules Identifies New Class of Dyes for Dye-Sensitised Solar Cells. Phys. Chem. Chem. Phys. 2014, 16, 26684−26690. (476) Veits, G. K.; Carter, K. K.; Cox, S. J.; McNeil, A. J. Developing a Gel-Based Sensor Using Crystal Morphology Prediction. J. Am. Chem. Soc. 2016, 138, 12228−12233. (477) Schober, C.; Reuter, K.; Oberhofer, H. Virtual Screening for High Carrier Mobility in Organic Semiconductors. J. Phys. Chem. Lett. 2016, 7, 3973−3977. (478) Kunkel, C.; Schober, C.; Margraf, J. T.; Reuter, K.; Oberhofer, H. Finding the Right Bricks for Molecular Legos: A Data Mining Approach to Organic Semiconductor Design. Chem. Mater. 2019, 31, 969−978. (479) Cole, J. M.; Kreiling, S. Exploiting Structure/Property Relationships in Organic Non-Linear Optical Materials: Developing Strategies to Realize the Potential of TCNQ Derivatives. CrystEngComm 2002, 4, 232−238. (480) Wojnarska, J.; Gryl, M.; Seidler, T.; Stadnicka, K. M. 
Crystal Engineering, Optical Properties and Electron Density Distribution of Polar Multicomponent Materials Containing Sulfanilamide. CrystEngComm 2018, 20, 3638−3646. (481) Shi, P.-P.; Tang, Y.-Y.; Li, P.-F.; Liao, W.-Q.; Wang, Z.-X.; Ye, Q.; Xiong, R.-G. Symmetry Breaking in Molecular Ferroelectrics. Chem. Soc. Rev. 2016, 45, 3811−3827. (482) Horiuchi, S.; Kumai, R.; Tokura, Y. Hydrogen-Bonding Molecular Chains for High-Temperature Ferroelectricity. Adv. Mater. 2011, 23, 2098−2103. (483) Horiuchi, S.; Kagawa, F.; Hatahara, K.; Kobayashi, K.; Kumai, R.; Murakami, Y.; Tokura, Y. Above-Room-Temperature Ferroelectricity and Antiferroelectricity in Benzimidazoles. Nat. Commun. 2012, 3, 1308. (484) Owczarek, M.; Hujsak, K. A.; Ferris, D. P.; Prokofjevs, A.; Majerz, I.; Szklarz, P.; Zhang, H.; Sarjeant, A. A.; Stern, C. L.; Jakubas, R.; et al. Flexible Ferroelectric Organic Crystals. Nat. Commun. 2016, 7, 13108. (485) Tayi, A. S.; Kaeser, A.; Matsumoto, M.; Aida, T.; Stupp, S. I. Supramolecular Ferroelectrics. Nat. Chem. 2015, 7, 281−294. (486) Gómez-Coca, S.; Cremades, E.; Aliaga-Alcalde, N.; Ruiz, E. Huge Magnetic Anisotropy in a Trigonal-Pyramidal Nickel(II) Complex. Inorg. Chem. 2014, 53, 676−678. (487) Müller, T. E.; Mingos, D. M. P. Determination of the Tolman Cone Angle from Crystallographic Parameters and a Statistical Analysis Using the Crystallographic Database. Transition Met. Chem. 1995, 20, 533−539. (488) Smith, J. M.; Taverner, B. C.; Coville, N. J. Cone Angle Radial Profiles. J. Organomet. Chem. 1997, 530, 131−140. (489) Dierkes, P.; van Leeuwen, P. W. N. M. The Bite Angle Makes the Difference: A Practical Ligand Parameter for Diphosphine Ligands. J. Chem. Soc., Dalton Trans. 1999, 1519−1529. (490) Novikov, R.; Bernardinelli, G.; Lacour, J. Enantioselective Olefin Epoxidation Using Axially Chiral Biaryl Azepinium Salts as Catalysts. Rapid in-Situ Screening and Origin of the Stereocontrol. Adv. Synth. Catal. 2008, 350, 1113−1124.

(491) Kulik, H. J.; Wong, S. E.; Baker, S. E.; Valdez, C. A.; Satcher, J. H., Jr.; Aines, R. D.; Lightstone, F. C. Developing an Approach for First-Principles Catalyst Design: Application to Carbon-Capture Catalysis. Acta Crystallogr., Sect. C: Struct. Chem. 2014, 70, 123−131. (492) Sattler, A.; Parkin, G. Cleaving Carbon-Carbon Bonds by Inserting Tungsten into Unstrained Aromatic Rings. Nature 2010, 463, 523−526. (493) Li, H.; Eddaoudi, M.; O’Keeffe, M.; Yaghi, O. M. Design and Synthesis of an Exceptionally Stable and Highly Porous MetalOrganic Framework. Nature 1999, 402, 276−279. (494) Long, J. R.; Yaghi, O. M. The Pervasive Chemistry of Metal− Organic Frameworks. Chem. Soc. Rev. 2009, 38, 1213−1214. (495) Farha, O. K.; Hupp, J. T. Rational Design, Synthesis, Purification, and Activation of Metal−Organic Framework Materials. Acc. Chem. Res. 2010, 43, 1166−1175. (496) Goldsmith, J.; Wong-Foy, A. G.; Cafarella, M. J.; Siegel, D. J. Theoretical Limits of Hydrogen Storage in Metal−Organic Frameworks: Opportunities and Trade-Offs. Chem. Mater. 2013, 25, 3373− 3382. (497) Chung, Y. G.; Camp, J.; Haranczyk, M.; Sikora, B. J.; Bury, W.; Krungleviciute, V.; Yildirim, T.; Farha, O. K.; Sholl, D. S.; Snurr, R. Q. Computation-Ready, Experimental Metal−Organic Frameworks: A Tool To Enable High-Throughput Screening of Nanoporous Crystals. Chem. Mater. 2014, 26, 6185−6192. (498) Nazarian, D.; Camp, J. S.; Sholl, D. S. A Comprehensive Set of High-Quality Point Charges for Simulations of Metal − Organic Frameworks. Chem. Mater. 2016, 28, 785−793. (499) Moghadam, P. Z.; Li, A.; Wiggin, S. B.; Tao, A.; Maloney, A. G. P.; Wood, P. A.; Ward, S. C.; Fairen-Jimenez, D. Development of a Cambridge Structural Database Subset: A Collection of Metal− Organic Frameworks for Past, Present, and Future. Chem. Mater. 2017, 29, 2618−2625. (500) Altintas, C.; Erucar, I.; Keskin, S. High-Throughput Computational Screening of the Metal Organic Framework Database for CH4/H2 Separations. ACS Appl. 
Mater. Interfaces 2018, 10, 3668− 3679. (501) Moghadam, P. Z.; Islamoglu, T.; Goswami, S.; Exley, J.; Fantham, M.; Kaminski, C. F.; Snurr, R. Q.; Farha, O. K.; FairenJimenez, D. Computer-Aided Discovery of a Metal-Organic Framework with Superior Oxygen Uptake. Nat. Commun. 2018, 9, 1378. (502) Park, S.; Kim, B.; Choi, S.; Boyd, P. G.; Smit, B.; Kim, J. Text Mining Metal−Organic Framework Papers. J. Chem. Inf. Model. 2018, 58, 244−251. (503) Inokuma, Y.; Matsumura, K.; Yoshioka, S.; Fujita, M. Finding a New Crystalline Sponge from a Crystallographic Database. Chem. Asian J. 2017, 12, 208−211. (504) Miklitz, M.; Jelfs, K. E. pywindow : Automated Structural Analysis of Molecular Pores. J. Chem. Inf. Model. 2018, 58, 2387− 2391. (505) Seiki, N.; Shoji, Y.; Kajitani, T.; Ishiwari, F.; Kosaka, A.; Hikima, T.; Takata, M.; Someya, T.; Fukushima, T. Rational Synthesis of Organic Thin Films with Exceptional Long-Range Structural Integrity. Science 2015, 348, 1122−1126. (506) Zolotarev, P. N.; Moret, M.; Rizzato, S.; Proserpio, D. M. Searching New Crystalline Substrates for OMBE: Topological and Energetic Aspects of Cleavable Organic Crystals. Cryst. Growth Des. 2016, 16, 1572−1582. (507) Liu, Y.; Grossman, J. C. Accelerating the Design of Solar Thermal Fuel Materials through High Throughput Simulations. Nano Lett. 2014, 14, 7046−7050. (508) Kratzert, D.; Holstein, J. J.; Krossing, I. DSR : Enhanced Modelling and Refinement of Disordered Structures with SHELXL. J. Appl. Crystallogr. 2015, 48, 933−938. (509) Holmes, S. T.; Wang, W. D.; Hou, G.; Dybowski, C.; Wang, W.; Bai, S. A New NMR Crystallographic Approach to Reveal the Calcium Local Structure of Atorvastatin Calcium. Phys. Chem. Chem. Phys. 2019, 21, 6319−6326. (510) Adams, P. D.; Aertgeerts, K.; Bauer, C.; Bell, J. A.; Berman, H. M.; Bhat, T. N.; Blaney, J. M.; Bolton, E.; Bricogne, G.; Brown, D.; AW

DOI: 10.1021/acs.chemrev.9b00155 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

et al. Outcome of the First wwPDB/CCDC/D3R Ligand Validation Workshop. Structure 2016, 24, 502−508. (511) Gore, S.; Sanz García, E.; Hendrickx, P. M. S.; Gutmanas, A.; Westbrook, J. D.; Yang, H.; Feng, Z.; Baskaran, K.; Berrisford, J. M.; Hudson, B. P.; et al. Validation of Structures in the Protein Data Bank. Structure 2017, 25, 1916−1927. (512) Young, J. Y.; Westbrook, J. D.; Feng, Z.; Sala, R.; Peisach, E.; Oldfield, T. J.; Sen, S.; Gutmanas, A.; Armstrong, D. R.; Berrisford, J. M.; et al. OneDep: Unified wwPDB System for Deposition, Biocuration, and Validation of Macromolecular Structures in the PDB Archive. Structure 2017, 25, 536−545. (513) Shabalin, I. G.; Porebski, P. J.; Minor, W. Refining the Macromolecular Model − Achieving the Best Agreement with the Data from X-Ray Diffraction Experiment. Crystallogr. Rev. 2018, 24, 236−262. (514) Touw, W. G.; Joosten, R. P.; Vriend, G. New Biological Insights from Better Structure Models. J. Mol. Biol. 2016, 428, 1375− 1393. (515) Shao, C.; Yang, H.; Westbrook, J. D.; Young, J. Y.; Zardecki, C.; Burley, S. K. Multivariate Analyses of Quality Metrics for Crystal Structures in the PDB Archive. Structure 2017, 25, 458−468. (516) Smart, O. S.; Horský, V.; Gore, S.; Svobodová Vařeková, R.; Bendová, V.; Kleywegt, G. J.; Velankar, S. Validation of Ligands in Macromolecular Structures Determined by X-Ray Crystallography. Acta Crystallogr. Sect. D Struct. Biol. 2018, 74, 228−236. (517) Brucet, M.; Querol-Audí, J.; Bertlik, K.; Lloberas, J.; Fita, I.; Celada, A. Structural and Biochemical Studies of TREX1 Inhibition by Metals. Identification of a New Active Histidine Conserved in DEDDh Exonucleases. Protein Sci. 2008, 17, 2059−2069. (518) Harding, M. M.; Nowicki, M. W.; Walkinshaw, M. D. Metals in Protein Structures: A Review of Their Principal Features. Crystallogr. Rev. 2010, 16, 247−302. (519) Sagatova, A. A.; Keniya, M. V.; Wilson, R. K.; Monk, B. C.; Tyndall, J. D. A. 
Structural Insights into Binding of the Antifungal Drug Fluconazole to Saccharomyces Cerevisiae Lanosterol 14αDemethylase. Antimicrob. Agents Chemother. 2015, 59, 4982−4989. (520) Leonarski, F.; D’Ascenzo, L.; Auffinger, P. Binding of Metals to Purine N7 Nitrogen Atoms and Implications for Nucleic Acids: A CSD Survey. Inorg. Chim. Acta 2016, 452, 82−89. (521) Gudmundsson, M.; Kim, S.; Wu, M.; Ishida, T.; Momeni, M. H.; Vaaje-Kolstad, G.; Lundberg, D.; Royant, A.; Ståhlberg, J.; Eijsink, V. G. H.; et al. Structural and Electronic Snapshots during the Transition from a Cu(II) to Cu(I) Metal Center of a Lytic Polysaccharide Monooxygenase by X-Ray Photoreduction. J. Biol. Chem. 2014, 289, 18782−18792. (522) Moriarty, N. W.; Adams, P. D. Iron − Sulfur Clusters Have No Right Angles. Acta Crystallogr. Sect. D Biol. Crystallogr. 2019, 75, 16−20. (523) Fisher, S. J.; Helliwell, J. R. An Investigation into Structural Changes Due to Deuteration. Acta Crystallogr., Sect. A: Found. Crystallogr. 2008, 64, 359−367. (524) Kabova, E. A.; Blundell, C. D.; Shankland, K. Pushing the Limits of Molecular Crystal Structure Determination From Powder Diffraction Data in High-Throughput Chemical Environments. J. Pharm. Sci. 2018, 107, 2042−2047. (525) Cole, J. C.; Kabova, E. A.; Shankland, K. Utilizing Organic and Organometallic Structural Data in Powder Diffraction. Powder Diffr. 2014, 29, S19−S30. (526) Shankland, K.; Spillman, M. J.; Kabova, E. A.; Edgeley, D. S.; Shankland, N. The Principles Underlying the Use of Powder Diffraction Data in Solving Pharmaceutical Crystal Structures. Acta Crystallogr., Sect. C: Cryst. Struct. Commun. 2013, 69, 1251−1259. (527) Hughes, C. E.; Reddy, G. N. M.; Masiero, S.; Brown, S. P.; Williams, P. A.; Harris, K. D. M. Determination of a Complex Crystal Structure in the Absence of Single Crystals: Analysis of Powder X-Ray Diffraction Data, Guided by Solid-State NMR and Periodic DFT Calculations, Reveals a New 2′-Deoxyguanosine Structural Motif. Chem. Sci. 
2017, 8, 3971−3979.

(528) Ghouili, A.; Rohlicek, J.; Ayed, T. B.; Hassen, R. B. Crystal Structure Determination from Powder Diffraction Data of the Coumarin Vanillin Chalcone. Powder Diffr. 2014, 29, 361−365. (529) Kabova, E. A.; Cole, J. C.; Korb, O.; Williams, A. C.; Shankland, K. Improved Crystal Structure Solution from Powder Diffraction Data by the Use of Conformational Information. J. Appl. Crystallogr. 2017, 50, 1421−1427. (530) Fernandes, J. A.; Abosede, O.; Galli, S. Powder X-Ray Diffraction Structural Characterization of the Coordination Complex Cis-[Co(κ2 N,N′-1,10-Phenanthroline-5,6-Dione)2Cl2]. Powder Diffr. 2018, 33, 55−61. (531) Ibiapino, A. L.; Seiceira, R. C.; Pitaluga, A., Jr.; Trindade, A. C.; Ferreira, F. F. Structural Characterization of Form I of Anhydrous Rifampicin. CrystEngComm 2014, 16, 8555−8562. (532) Reymond, J. The Chemical Space Project. Acc. Chem. Res. 2015, 48, 722−730. (533) International Union of Crystallography. IUCrData. https:// iucrdata.iucr.org/x/index.html (accessed Jan 23, 2019). (534) Cambridge Crystallographic Data Centre. CSD Communications. https://www.ccdc.cam.ac.uk/Community/depositastructure/ CSDCommunications/ (accessed Jan 23, 2019). (535) Noor, A.; Bauer, T.; Todorova, T. K.; Weber, B.; Gagliardi, L.; Kempe, R. The Ligand-Based Quintuple Bond-Shortening Concept and Some of Its Limitations. Chem. - Eur. J. 2013, 19, 9825−9832. (536) Hess, C. R.; Weyhermüller, T.; Bill, E.; Wieghardt, K. [{Fe(tim)}2]: An Fe-Fe Dimer Containing an Unsupported MetalMetal Bond and Redox-Active N4 Macrocyclic Ligands. Angew. Chem., Int. Ed. 2009, 48, 3703−3706. (537) Reeves, M.; Wood, P.; Parsons, S. Assigning Transition Metal Oxidation States to Entries in the Cambridge Structural Database. Acta Crystallogr., Sect. A: Found. Adv. 2018, 74, No. e154. (538) Dunitz, J. D.; Gavezzotti, A. Molecular Recognition in Organic Crystals: Directed Intermolecular Bonds or Nonlocalized Bonding? Angew. Chem., Int. Ed. 2005, 44, 1766−1787. (539) Spek, A. L. 
PLATON SQUEEZE: A Tool for the Calculation of the Disordered Solvent Contribution to the Calculated Structure Factors. Acta Crystallogr., Sect. C: Struct. Chem. 2015, 71, 9−18. (540) Bruno, I. J.; Shields, G. P.; Taylor, R. Deducing Chemical Structure from Crystallographically Determined Atomic Coordinates. Acta Crystallogr., Sect. B: Struct. Sci. 2011, 67, 333−349. (541) ACD/Labs. ACD/Name. https://www.acdlabs.com/ products/draw_nom/nom/name/ (accessed Jan 23, 2019). (542) Sykes, R. A.; McCabe, P.; Allen, F. H.; Battle, G. M.; Bruno, I. J.; Wood, P. A. New Software for Statistical Analysis of Cambridge Structural Database Data. J. Appl. Crystallogr. 2011, 44, 882−886. (543) Thomas, I. R.; Bruno, I. J.; Cole, J. C.; Macrae, C. F.; Pidcock, E.; Wood, P. A. WebCSD: The Online Portal to the Cambridge Structural Database. J. Appl. Crystallogr. 2010, 43, 362−366. (544) Bruno, I. J.; Groom, C. R. A Crystallographic Perspective on Sharing Data and Knowledge. J. Comput.-Aided Mol. Des. 2014, 28, 1015−1022. (545) Hall, S. R.; Allen, F. H.; Brown, I. D. The Crystallographic Information File (CIF): A New Standard Archive File for Crystallography. Acta Crystallogr., Sect. A: Found. Crystallogr. 1991, 47, 655−685. (546) Sheldrick, G. M. Crystal Structure Refinement with SHELXL. Acta Crystallogr., Sect. C: Struct. Chem. 2015, 71, 3−8. (547) Marsh, R. E.; Sparks, R. A. Space-Group Changes : A Revision to a Revision. Acta Crystallogr., Sect. B: Struct. Sci. 2001, 57, 722. (548) Spek, A. L. Structure Validation in Chemical Crystallography. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2009, 65, 148−155. (549) Spek, A. L. What Makes a Crystal Structure Report Valid? Inorg. Chim. Acta 2018, 470, 232−237. (550) Harrison, W. T. A.; Simpson, J.; Weil, M. Acta Crystallographica Section E: Structure Reports Online: Editorial. Acta Crystallogr., Sect. E: Struct. Rep. Online 2010, 66, E1−E2. (551) Johnson, N. In (crystallographic) data we trust? https://www. 
ccdc.cam.ac.uk/Community/blog/in-crystallographic-data-we-trust/ (accessed May 2, 2019). AX

DOI: 10.1021/acs.chemrev.9b00155 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

(552) Royal Society of Chemistry. ChemSpider: Search and Share Chemistry. http://www.chemspider.com/Default.aspx (accessed Jan 23, 2019). (553) Rigaku Corporation. Single Crystal Diffraction Software CrysAlis Pro. https://www.rigaku.com/downloads/journal/RJ32-2/ Rigaku%20Journal%2032-2_31-34.pdf (accessed Jan 23, 2019). (554) Reilly, A. M.; Cooper, R. I.; Adjiman, C. S.; Bhattacharya, S.; Boese, A. D.; Brandenburg, J. G.; Bygrave, P. J.; Bylsma, R.; Campbell, J. E.; Car, R.; et al. Report on the Sixth Blind Test of Organic Crystal Structure Prediction Methods. Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2016, 72, 439−459. (555) Cruz-Cabeza, A. J. Crystal Structure Prediction: Are We There Yet? Acta Crystallogr., Sect. B: Struct. Sci., Cryst. Eng. Mater. 2016, 72, 437−438. (556) Nyman, J.; Reutzel-Edens, S. M. Crystal Structure Prediction Is Changing from Basic Science to Applied Technology. Faraday Discuss. 2018, 211, 459−476.

AY

DOI: 10.1021/acs.chemrev.9b00155 Chem. Rev. XXXX, XXX, XXX−XXX