Optimization of Crystal Structures of Archetypical Pharmaceutical

Mar 20, 2013 - All Publications/Website .... the energy rankings, using the software package Quantum Espresso. ... It is also shown that the use of el...
1 downloads 0 Views 4MB Size
Article pubs.acs.org/crystal

Optimization of Crystal Structures of Archetypical Pharmaceutical Compounds: A Plane-Wave DFT‑D Study Using Quantum Espresso Albert M. Lund,† Anita M. Orendt,‡ Gabriel I. Pagola,∥ Marta B. Ferraro,∥ and Julio C. Facelli*,‡,§ Department of Chemistry, ‡Center for High Performance Computing, and §Department of Biomedical Informatics, University of Utah, 155 South 1452 East Room 405, Salt Lake City, Utah 84112-0190, United States ∥ Departamento de Física, IFIBA (CONICET) Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, Pab. I (1428), Buenos Aires, Argentina Crystal Growth & Design 2013.13:2181-2189. Downloaded from pubs.acs.org by EASTERN KENTUCKY UNIV on 08/16/18. For personal use only.



ABSTRACT: Previously, it was shown that crystal structure prediction based on genetic algorithms (MGAC program) coupled with force field methods could consistently find experimental structures of crystals. However, inaccuracies in the force field potentials often resulted in poor energetic ranking of the experimental structure, limiting the usefulness of the method. In this work, dispersion-corrected density functional theory is employed to improve the accuracy of the energy rankings, using the software package Quantum Espresso. The best choices of running parameters for this application were determined, followed by completion of crystal optimizations on a test set of archetypical pharmaceutical molecules. It is shown here that the variable cell optimization of experimental structures reproduces the experimental structure with high accuracy (RMS < 0.5 Å) for this test set. It is also shown that the use of electronic structure theory based methods greatly improves the energetic ranking of structures produced by MGAC when used with a force field method, such that the experimental match is found with a high degree of accuracy.



INTRODUCTION Over the last several decades, there has been much effort toward the goal of being able to readily and reliably predict, by computational methods alone, the crystal structure of a molecule based only on its chemical diagram.1−5 The process to do this is shown in the TOC graphic. The process of taking a chemical structure to a crystal structure is very computationally intensive as it generally involves the generation and evaluation of tens or hundreds of thousands of trial crystal structures. However, the ability to accomplish this goal has far-reaching implications in many areas. On a basic science level, this can lead to an understanding of the principles that control crystal growth. More practically, the ability to successfully predict crystal structures based on computation alone will have an impact in many industries, including pharmaceutics, agrochemicals, pigments, dyes, and explosives. The current status of CSP can be evaluated by the performance of the participants in the periodic blind tests that have been organized by the Cambridge Crystallographic Data Centre.6−10 There have been five blind tests since 1999, © 2013 American Chemical Society

the latest held in 2011, and we have participated in the last four with our MGAC (Modified Genetic Algorithm for Crystals and Clusters) package.11−15 MGAC is capable of doing CSP for any space group, any number of molecules per asymmetric unit, and is able to deal with conformational flexibility of the molecule being studied in order to explore the entire crystal energy landscape. The generation of trial crystal structures is completed utilizing genetic algorithms. The current version of MGAC relies on the use of the CHARMM16,17 molecular mechanics program using the generalized atomic force field (GAFF)18 for the energy evaluation of the trial structures, but has been shown that the ranking of the structures is not always reliable due to deficiencies of GAFF.15 The results of the last two blind tests showed the advantage of using dispersion-corrected density functional theory (DFTReceived: February 19, 2013 Revised: March 18, 2013 Published: March 20, 2013 2181

dx.doi.org/10.1021/cg4002797 | Cryst. Growth Des. 2013, 13, 2181−2189

Crystal Growth & Design

Article

Figure 1. Molecules in the Karamertzanis and Price31 data set. Also shown are the CSD reference codes and the space group for all polymorphs considered. The important dihedral angles to consider in the search of the crystal system are marked with arrows through the bond.

D)19−21 both to create a molecule-specific tailored force field that is used to generate trial structures and to reorder a subset of the trial structures in search of the lowest-energy crystal structures.5,22−25 There have been two other approaches utilizing first-principles calculations applied to crystal structure prediction of organic crystals recently presented in the literature.26−28 These results lend promise to using DFT-D methods to completely replace molecular mechanics as the method of choice for all of the evaluation of the energies of the trial crystal structures in CSP. This has been thought to be computationally unfeasible; however, with recent advances in

computer technology, we believe the time has come to explore this option. Quantum Espresso (QE)29 is a set of computer codes designated to perform electronic structure calculations based on density functional theory, plane waves, and pseudopotentials that is capable of calculating the energy of and performing local optimizations on crystal systems using DFT-D. Its primary application has been determining the band structure and other properties of semiconductors and other solid-state materials. Although there is a report in the literature showing the results of using QE for the energy determination of a single crystal 2182

dx.doi.org/10.1021/cg4002797 | Cryst. Growth Des. 2013, 13, 2181−2189

Crystal Growth & Design

Article

Table 1. Parameters Explored to Determine the Optimal Conditions for QE Optimizationsa parameter conv_thr ecutwfc forc_conv_thr etot_conv_thr k-points a

description

range

minimum error threshold for self-consistency kinetic energy cutoff (Ry); ecutrho, the kinetic energy cutoff for charge density and potential functions, was always 10 times the value of ecutwfc force threshold for structural relaxation energy threshold for structural relaxation grid of discrete points used in integration during DFT-D calculation

final

10−4 to 10−12 30−70

10−7 55

10−1 to 10−4 10−2 to 10−6 1 × 1 × 1, 2 × 2 × 2, 3 × 3 × 3, and 4 × 4 × 4

10−2 10−3 2×2×2

The values in the column labeled final are the ones used in the calculations presented.

Table 2. Summary of QE Results for the K&P Data Set,31 Including Information on the Size of the Unit Cell (Number of Atoms, Electrons, and Molecules) and the Time for the Optimizationf molecule/system infoa

space group

metrics

name

group

Natoms

no. of e−

N

Kpt

VC

T (hr)

energy (Ry)

RMSb

ACSALA05 ACSALA13 ACYGLY11 ATUVIU BANGOM01 BANGOM02 BZAMID02 BZAMID06 CBMZPN03 CBMZPN10 CBMZPN11 CBMZPN12 CBMZPN16 CBOHAZ02 CERNIW CYACHZ01 EYOBAV GAHPIO HAMTIZ HBIURT10 HISTAN HUYYOP IBPRAC01 JEKNOC11 KAMREW KAYTUZ LEKRIC LEKROI NMACEP01 NMACEP02 NOREPH01 NOZKES PEAMAN01 PMACEP01

P21/c P21/c P21/c P21212 P21 P212121 P21/c Pba2 R3̅ P21/n P1̅ C2/c Pbca P21/c P21/n P21/c Cc P21/a P21/n P212121 P21 P212121 P21/c P21 P212121 P21/c P212121 P1 P212121 P212121 P21/c P212121 P21 P21

84 84 60 72 52 104 64 128 324 120 240 240 240 48 116 48 140 88 104 52 34 128 132 132 132 96 156 36 164 164 96 40 78 82

272 272 304 208 208 416 184 368 1584 352 704 704 704 144 304 152 400 256 360 184 88 328 324 324 384 360 408 102 392 392 240 104 210 196

4 4 4 4 2 4 4 8 18 4 8 8 8 4 4 4 4 4 4 4 2 4 4 4 4 4 4 1 4 4 4 4 2 2

4 4 2 4 2 2 4 1 12 2 4 4 2 4 2 4 4 4 4 1 2 4 4 4 4 4 4 4 4 4 4 4 4 4

14 25 23 31 32 31 25 37 ? 40 31 34 33 55 22 23 32 40 31 47 33 36 24 42 17 34 24 30 29 37 23 19 21 41

2.6 4.7 1.4 3.1 2.3 10.6 2.0 21.7c too complex 9.5 71.9 38.1d 6.1e 1.8 3.9 1.0 7.6 5.9 11.5 1.4 0.4 11.8 7.7 14.5 6.0 7.4 11.0 0.5 18.6 22.4 4.0 0.3 2.6 5.6

−239.6195421 −239.6193637 −169.5440140 −183.3139758 −450.5943046 −450.5914761 −139.8490878 −139.8311748

0.097 0.170 0.188 0.128 0.196 0.331 0.201 0.323

−256.7059809 −256.7025145 −256.7010208 −256.6989239 −129.8488205 −226.8550397 −131.5329627 −336.3618078 −228.8603043 −331.8894879 −184.0473440 −127.2667014 −218.1767345 −233.3437649 −233.3415498 −354.6493038 −227.4160927 −339.3602102 −368.3247454 −302.3179367 −302.3254026 −169.7442352 −93.7290501 −320.4332603 −302.3196545

0.232 0.182 0.219 0.132 0.409 0.173 0.063 0.436 0.213 0.133 0.459 0.260 0.270 0.206 0.349 0.056 0.188 0.099 0.132 0.167 0.225 0.201 0.134 0.070 0.234

The number of atoms (Natoms), electrons (no. of e−), and molecules (N) reported are per unit cell. Kpt is the number of symmetrized k-points, and VC is the number of optimization steps needed to reach convergence. bThe RMS values are for comparisons without hydrogens completed using the crystal packing similarity function of Mercury. cTimes for this structure are for a run using two 12-core nodes. dTimes for this structure are for a run using 12 8-core nodes. eTimes for this structure are using a single 8-core node. fThe energy reported is on a per molecule basis, and the RMS is based on a 15 molecule match with the experimental structure. a

system,30 to the authors’ knowledge, no comprehensive study of the performance of QE for the optimization of organic crystal structures has been reported in the literature. The data set chosen for this comprehensive study of the QE performance for CSP is the Karamertzanis and Price (K&P) data set,31 which was initially selected as a proxy representation for characteristic molecules of pharmaceutical interest and

subsequently was used by us to test the reliability of MGAC crystal structure predictions.15 This set, shown in Figure 1, contains molecules that represent a variety of pharmaceutically relevant functional groups, as well as five compounds that present experimentally determined polymorphs and three salts. It should be noted that two additional polymorphs (BZAMID06 and CBMZPN16) that were not treated in either 2183

dx.doi.org/10.1021/cg4002797 | Cryst. Growth Des. 2013, 13, 2181−2189

Crystal Growth & Design

Article

Figure 2. Comparison between the experimental and the optimized crystal structures for NOZKES (top) and HBIURT10 (bottom). The comparisons on the left are between the experimental and the QE-optimized structures (shown in blue), with an RMS of 0.134 Å for NOZKES and 0.459 Å for HBIURT10. The comparisons on the right are between the experimental and the QE-reoptimized MGAC-CHARMM match (shown in green), with an RMS of 0.23 Å for NOZKES and 0.46 Å for HBIURT10. Note that the orientations are different due to the choice of molecules used in the comparison made by the comparison program.

Figure 3. Energies for the BANGOM initial MGAC population relative to the energy of the QE-optimized experimental structure. The structures are ranked in order of increasing initial energy (blue points), and the best structure after optimization is indicated by the gray vertical line. The order of the optimized energies (red points) does not correspond to the order of the initial single-point energies (blue points).

the energy evaluation of the GA generated trial crystal structures. The first step of this process is the determination of the parameters necessary to reproduce the experimental structures for each of the K&P molecules without trading performance, followed by full crystal optimizations of the members of this data set starting from the experimentally determined crystal structures. This is followed by full crystal optimizations on MGAC derived initial populations of several members of the K&P data set to explore QE’s performance and results when the calculations are not started from structures with experimentally determined crystal structures, as is typically in the CSP process. Finally, QE is also used to rerank several sets of the lowest-energy MGAC-CHARMM structures from previous studies on these systems to explore the reliability of the QE crystal energy rankings.

the original K&P paper or our previous work were added. In our previous work, we attempted to predict 22 of these structures using MGAC and were successful in obtaining structures for 16, with matches not being found for ATUVIU, BANGOM02, CERNIW, ACSALA05, IBRAC01, and JEKNOC11. However, in the majority of these successful predictions, the rank of the best match was often well outside the expectations of the blind test criterion. We attributed this to inaccuracies in the force field, as well as bias introduced by CHARMM optimizations. In five of the six cases (all except for BANGOM02) where a match to the experimental structure was not found, potential failures of the GAFF to properly handle the intermolecular interactions were identified. In this work, full crystal optimizations of the crystal structures in the K&P data are completed using the DFT-D method within QE to determine the feasibility of using QE for 2184

dx.doi.org/10.1021/cg4002797 | Cryst. Growth Des. 2013, 13, 2181−2189

Crystal Growth & Design



Article

METHODS

All calculations were performed using version 5.0.1 of Quantum Espresso. The vc-relax option, which allows for optimization of the unit cell parameters along with all atomic coordinates, was used. Calculations were performed on either 12-core nodes (two socket, 6 core Xeon X5660 processors, 2.8 GHz) with 24 GB RAM or 8-core nodes (two sockets, quad core Xeon E5462 processors, 2.8 GHz) with 16 GB RAM. Calculations were performed on a single node, which was found to give the best hardware utilization, except in one case, where wall time limits imposed in the batch system necessitated the use of two nodes. All calculations to determine run times for the crystal structure optimizations were performed on the 12-core nodes. The QE parameters explored in this study are discussed in the next section. Intermediate processing of input and output files was completed using custom Python scripts. RMS values were computed using the Solid Form Crystal Packing Similarity method in Mercury CSD, using 15 molecules for comparison and ignoring hydrogen atoms.32 The current version of MGAC, which uses CHARMM16,17 to optimize crystal structures, is referred to as MGAC-CHARMM in the paper and was used to generate the initial populations of 30 trial crystal structures; the final populations used in the energy reranking calculations were taken from our earlier study.15

Figure 5. Reordering based on reoptimization of KAYTUZ MGACCHARMM crystal structures with QE. The QE column has two fewer entries due to an unphysical CHARMM structure (see text) and one structure that did not converge among the 51 lowest-energy structures. The blue indicates the match to the experimental structure, whereas the red indicates the remainder of the top 10 lowest QE energy structures.



RESULTS AND DISCUSSION Determination of Parameters. A series of preliminary calculations were performed on four representative molecules Table 3. Results of Reranking MGAC-CHARMM LowestEnergy Structures Using QE Optimization MGAC-CHARMM resultsa

a

case

rank of match

NOZKES KAYTUZ BZAMID02 HBIURT10

78 22 39 106

Table 1. Along with these parameters, different pseudopotentials were explored. A two-step process of selecting these parameters was used. First, each parameter listed was varied independently. After selecting a subset of values from each range, a grid test was performed over those ranges to assess if any interdependencies existed between parameters. The results were analyzed in terms of the RMS relative to the experimental structures and by the time it took to complete the calculation. Initial calculations determined that ultrasoft pseudopotentials gave better results than norm-conserving ones. After a search of existing pseudopotentials for use in QE, three different ultrasoft pseudopotential combinations were chosen for testing: 1. The Vanderbilt33 PBE34,35 pseudopotential (Van-PBE), which was available for all elements in the molecules in our test set. 2. A mix of the RRKJUS36 PBE pseudopotential (RRKJUSPBE), which exists for all elements needed for the test set except F, with the Van-PBE used on the F.

QE reoptimization results

RMS

no. of structures reoptimized

rank of match

RMS

energy relative to QE opt of exp (kJ/mol)

0.18 0.44 0.67 0.32

111 51 53 171

1 1 3 2

0.23 0.42 0.46 0.46

0.49 0.61 2.63 0.13

Taken from ref 15.

from the K&P data set to determine the optimal conditions for the calculations. The parameters varied along with the range explored, and the final parameters chosen for use are given in

Figure 4. Results of reoptimization of MGAC-CHARMM NOZKES results with QE. The CHARMM ranking is indicated by the x-axis scale. The QE energies are for a single molecule and given relative to the energy of the QE optimization of the experimental structure. The gray vertical line marks the MGAC-CHARMM match to the experimental structure. The blue points are the QE energy of the CHARMM optimized structure, and the red points are the energy after QE optimization. 2185

dx.doi.org/10.1021/cg4002797 | Cryst. Growth Des. 2013, 13, 2181−2189

Crystal Growth & Design

Article

Figure 6. Sections of the three lowest-energy crystal structures from the BZAMID02 reranking based on the energies after optimization with QE. The black structure is the experimental match, ranked third in energy, whereas the blue is the lowest-energy structure and the red is the second lowest-energy structure. Note that the center two rows of all three consist of a plane of benzamide molecules that are packed in the same manner, but there are differences with the orientation of the molecules above and below this center plane between the three structures. Also shown is an overlay of these three structures, showing the differences in the packing. In this overlay, the aligned molecules correspond to the center row molecules of the other views.

3. The Vanderbilt BLYP37,38 pseudopotential (Van-BLYP), which does not include F; therefore, the evaluation of this pseudopotential did not include the two BANGOM polymorphs. It was determined that pseudopotentials utilizing the BLYP cross-correlation function produce the best RMS values overall. However, as QE-compatible pseudopotentials utilizing BLYP are not available for F, an element commonly used in medicinal chemistry for the synthesis of pharmaceutical compounds, the combination of the RRKJUS-PBE for C, H, N, and O along with the Van-PBE for the F was selected, as it gave the second best RMS results among the three tested. The results of the testing show that, after the choice of pseudopotential, the parameters that most directly affect RMS are the number of k-points, and the energetic cutoffs (ecutwfc and ecutrho). As the choice of the self-consistency threshold (conv_thr) had little effect on the RMS, it was set to correspond to an energy of 0.13 J/mol, which should allow polymorphs to be energetically distinguished.39 The convergence thresholds for vc-relax (forc_conv_thr and etot_conv_thr) were also found to have minimal impact on the final RMS, so these terms were set to favor shorter simulation times. Although higher energy cutoffs did result in lower RMS values,

a cutoff of 55 Ry was chosen as a good compromise between the RMS and the time for the calculation to complete. Therefore, all of the QE optimizations were completed as full crystal optimizations (vc-relax) using the RRKJUS-PBE for C, H, N, and O along with the Van-PBE for the F, the London option set to select the DFT-D implementation in QE, along with the final input parameters listed in Table 1. It should also be noted that several test cases were completed doing the optimization in two stages, the first with the cell fixed (relax) calculation, followed by a vc-relax, to separate the optimization of inter- and intramolecular interactions, and for the cases studied, no differences were observed.40 Optimization of Experimental Crystal Structures. Once the optimum choice of parameters was determined for the selected pseudopotential, full local DFT-D crystal optimizations using the experimental structure as the initial structure were performed on each of the systems shown in Figure 1. The only structure for which the vc-relax calculation was not completed was CBMZPN03, an exceptionally large crystal structure with 18 molecules (or 1584 valence electrons) in the unit cell. The results of these calculations are shown in Table 2. In all cases where the simulation completed, QE returned a final 2186

dx.doi.org/10.1021/cg4002797 | Cryst. Growth Des. 2013, 13, 2181−2189

Crystal Growth & Design

Article

paper15 were reoptimized using QE. In all cases, a reranking based on the QE final energy after local optimization placed the match to the experimental structure as one of the lowest-energy structures. Table 3 summarizes the results obtained, including the number of structures reoptimized, the QE-optimized ranking of the MGAC-CHARMM match, and the RMS with respect to the experimental crystal structure for the QE structures. For NOZKES, the CHARMM-MGAC results found a match at energy rank 78 with an RMS of 0.18 Å. The first 111 crystal structures predicted by CHARMM-MGAC were reoptimized by QE, and the match was found to be the lowest energy, at 0.49 kJ/mol relative to the energy of the QE-optimized experimental structure; this structure has an RMS of 0.23 Å relative to the experimental structure (a comparison is shown in Figure 2, on the top right). It should also be noted that 2 of these 111 structures, CHARMM structures ranked 68 and 74, had unphysical structures, an issue found when using CHARMM, and therefore, QE calculations were not performed on them (routines to remove unphysical structures based on molecular volume and bad atom−atom contacts have been added to MGAC since these data sets were obtained for the previous paper). The results of the QE calculations are shown in the graph in Figure 4, where the QE energy of each of the 111 structures is shown for the first optimization step (blue) and for the optimized structure (red), with the energies being reported relative to the energy of the QE-optimized experimental structure. A comparison of the red and blue lines confirms that the single-point QE energy of the nonoptimized MGAC-CHARMM final structures is insufficient for reranking, in agreement with the analysis in the previous section. Another way to view the reordering of the CHARMMMGAC structures upon optimization using QE is shown in Figure 5 for the KAYTUZ results. The structure with the best match to the experimental one, ranked 22 with CHARMM, became the lowest-energy structure after reoptimization using QE (shown in blue in the figure) with an energy of 0.61 kJ/mol relative to that of the QE-optimized structure. The unpredictable nature of the reordering is shown by the connection of the 10 lowest-QE structures, shown in red and blue. The remaining two cases tested show similar energy reranking characteristics. In the case of HBIURT, the structure that matched the known crystal structure HBIURT10 best was ranked 106 by the CHARMM energies, but after QE optimization, it became the second lowest in energy, with both structures within 0.13 kJ/mol of the QE energy of the experimental structure. Both of the two QE lowest energy structures were shown to be matches (using both a 15- and a 30-molecule match set), but the lowest-energy structure had an RMS of 4.81 Å, whereas the second lowest had an RMS of 0.46 Å (match shown in Figure 2, on the bottom left). Closer inspection revealed that these two structures are conformational polymorphs with very similar crystal structures that are related by inversion. For BZAMID02, the match to the experimental structure was found to be third, at 2.63 kJ/mol relative to the QE energy of the experimental crystal structure, with two different herringbone structures coming in at lower energy, −0.09 and 0.18 kJ/mol relative to the QE energy of the experimental structure. The three lowest-energy BZAMID02 structures display a high level of similarity to each other, with the first two showing partial matches of 12 of out 15 when compared to

structure that gave an RMS < 0.5 Å, with 15 out of 15 molecules aligned when compared to the experimental structure. The RMS difference between the QE full optimization and the experimental structure ranged from 0.056 to 0.459 Å, with a median RMS of 0.196 Å. Two typical matches, using the cases of NOZKES and HBIURT10, are shown in Figure 2, on the left. These results imply that the experimental structure is at least close to local minima of the QE energy hypersurface and that, for unknown structures, MGAC-QE most likely will find the experimental structures, provided that the GA generates structures in its proximity. This is a strong indication that the MGAC-QE combination could be successful for global optimization of crystal structures. The wall time of the calculation is primarily affected by the number of valence electrons (which correspond to the Kohn− Sham states in the DFT method), the number of k-points, and the number of vc-relax iterations that are needed to reach convergence. From a theoretical standpoint, the calculation time scales linearly with number of electrons when normalizing against vc-relax iterations, but in practice, some variation exists due to differences in convergence times for each self-consistent step. From a technical standpoint, systems with high numbers of valence electrons (>300 in our approximation) are probably not within the realm of feasible use with CSP on commonly available hardware, due to the large number of optimizations performed in the CSP process. Test of Initial MGAC Populations. To further test the validity of using QE for the energy evaluation step in MGAC (step 3 of the MGAC CSP process outlined in the introduction), full crystal optimizations were done on initial MGAC populations of 30 crystal structures of a given space group for three of the K&P molecules, namely, ATUVIU (P212121), BANGOM (P21), and IBPRAC/JEKNOC (P21/c). Because of the variations in the starting structures and the number of iterations needed to reach convergence, the time for the optimization varied among the 30 crystal structures for each molecule. However, in each case, the average time was not significantly different than the time for the optimization required when starting from the experimental structure. The average run time for the 30 structures was 2.1 h for ATUVIU, 2.5 h for BANGOM, and 7.6 h for IBPRAC/JEKNOC. As expected, the diversity of the initial population for each structure was maintained after optimization, with a spread of optimized energies between 47 and 64 kJ/mol. In every case, the optimized experimental structure (Table 2) was lower in energy that any of the other locally optimized structures obtained from the members of this initial population. Figure 3 shows the BANGOM initial population with the energies before and after local optimization, ranked in order of the initial energy. Note that the energies do not decrease in a uniform fashion during optimization; this is highlighted by the fact that the structure ranked number 7 in the initial population has the lowest energy after local optimization. This indicates that the energy of the structure prior to the local optimization cannot be used as a proxy for the final rank of the optimized structures, and this requires that all structures be optimized before rank comparison. Reranking of MGAC-CHARMM Final Populations. Previous CSP results have established that the use of molecular mechanics produces unreliable energy. Therefore, the accuracy of the QE energy rankings should be explored. A subset of the final structures of the MGAC-CHARMM runs for NOZKES, KAYTUZ, HBIURT, and BZAMID reported in our previous 2187

dx.doi.org/10.1021/cg4002797 | Cryst. Growth Des. 2013, 13, 2181−2189

Crystal Growth & Design

Article

(8) Day, G. M.; et al. Significant progress in predicting the crystal structures of small organic moleculesA report on the fourth blind test. Acta Crystallogr., Sect. B: Struct. Sci. 2009, 65, 107−125. (9) Bardwell, D. A.; et al. Towards crystal structure prediction of complex organic moleculesA report on the fifth blind test. Acta Crystallogr. 2011, B67, 535−551. (10) Lommerse, J. P. M.; et al. A test of crystal structure prediction of small organic molecules. Acta Crystallogr. 2000, B56, 697. (11) Bazterra, V. E.; Ferraro, M. B.; Facelli, J. C. Modified genetic algorithm to model crystal structures. I. Benzene, naphthalene and anthracene. J. Chem. Phys. 2002, 116, 5984−5991. (12) Bazterra, V. E.; Ferraro, M. B.; Facelli, J. C. Modified genetic algorithm to model crystal structures. II. Determination of a polymorphic structure of benzene using enthalpy minimization. J. Chem. Phys. 2002, 116, 5992−5995. (13) Bazterra, V. E.; Ferraro, M. B.; Facelli, J. C. Modified genetic algorithm to model crystal structures. III. Determination of crystal structures allowing simultaneous molecular geometry relaxation. Int. J. Quantum Chem. 2004, 96, 312−320. (14) Bazterra, V. E.; et al. A distributed computing method for crystal structure prediction of flexible molecules: An application to N-(2dimethyl-4-5-dinitrophenyl) acetamide. J. Chem. Theory Comput. 2007, 3, 201−209. (15) Kim, S.; et al. Crystal structure prediction of flexible molecules using parallel geneic algorithms with standard force field. J. Comput. Chem. 2009, 30, 1973−1985. (16) Brooks, B. R.; et al. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 1983, 4, 187−217. (17) MacKerell, A. D., et al. CHARMM: The Energy Function and Its Parameterization with an Overview of the Program. In The Encyclopedia of Computational Chemistry; von Ragué Schleyer, P., Ed.; John Wiley & Sons: Chichester, U.K., 1998, pp 271−277. (18) Wang, J.; et al. Development and testing of a general amber force field. J. Comput. Chem. 2004, 25, 1157. (19) Grimme, S. Accurate description of van der Waals complexes by density functional theory including empirical corrections. J. Comput. Chem. 2004, 25, 1463−1473. (20) Grimme, S. Semiempirical GGA-type density functional constructed with a long-range dispersion correction. J. Comput. Chem. 2006, 27, 1787−1799. (21) Grimme, S.; et al. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 2010, 132, 154104−154123. (22) Neumann, M. A.; Perrin, M. A. Energy ranking of molecular crystals using density functional theory calculations and an empirical van der Waals Correction. J. Phys. Chem. B 2005, 109, 15531−15541. (23) Neumann, M. A. Crystal Structures of Moderately Complex Organic Molecules are Predictable. In Advanced Computational Methods in Structural Chemistry, 24th European Crystallographic Meeting, Micro Symposium 14, Marrakech, Morocco, Aug 22−27, 2007; pp 11H00−11H20. (24) Neumann, M. A.; Leusen, F. J. J.; Kendrick, J. A major advance in crystal structure prediction. Angew. Chem., Int. Ed. 2008, 47, 2427− 2430. (25) Neumann, M. A. Tailor-made force fields for crystal-structure prediction. J. Phys. Chem. B 2008, 112, 9810−9829. (26) King, M. D.; et al. Prediction of the unknown crystal structure of creatine using fully quantum mechanical methods. Cryst. Growth Des. 2011, 11, 5733−5740. (27) Zhu, Q.; et al. Constrained evolutionary algorithm for strucutre prediction of molecular crystals: Methodology and applications. Acta Crystallogr. 2012, B68, 215−226. (28) Zhu, Q.; Oganov, A. R.; Lyakhov, A. O. Evolutionary metadynamics: A novel method to predict crystal structures. CrystEngComm 2012, 14, 3596−3601. (29) Giannozzi, P.; et al. QUANTUM ESPRESSO: A modular and open-source software project for quantum simulations of materials. J. Phys.: Condens. Matter 2009, 21, 395502.

the experimental structure. Figure 6 shows a comparison of these three different crystal structures. All three share the same planar structure in the central two rows of benzamide molecules, but the longer range order around these two rows changes between them. Furthermore, the overlay of the three structures given in Figure 6 shows the relationship between the differences longer range ordering. There are two experimental structures in the CDCC database; the absence of other experimental configurations may indicate that the growth of the crystal favors only this one configuration.



CONCLUSIONS The findings presented in this paper are consistent with other work in which electronic structure optimizations are being used to correctly rank structures produced by CSP methods using force field approaches. We have shown that this method is equally applicable to our GA-based method and that DFT-D optimization can be used to obtain high-quality structures consistently. Furthermore, we have shown that highly similar structures can be distinguished in terms of their energy by this method, as seen in the cases of BZAMID02 and HBIURT10. While the timings reported indicate that it will be timeconsuming and costly, doing completely ab initio CSP by using QE energies to determine the fitness of the GA generated trial structures is feasible with current computer hardware, as least for smaller molecular systems. Once an integration of MGAC and QE is complete, more exploration on limitations due to size of the system and any additional possible restrictions on the search due to the computational cost will need to be completed. We fully expect that this integration of MGAC and QE will provide a useful tool to the scientific community as there currently exists no freely available, quality CSP software.



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS Support and resources from the Center for High Performance Computing at the University of Utah is gratefully acknowledged. M.B.F. and G.I.P. acknowledge the support from the University of Buenos Aires and CONICET. For information on the MGAC software, please contact the corresponding author.



REFERENCES

(1) Day, G. M. Crystal Structure Prediction. In Supramolecular Chemistry: From Molecules to Nanomaterials; Gale, P. A., Steed, J. W., Eds.; John Wiley & Sons: Hoboken, NJ, 2012; pp 2905−2925. (2) Lehmann, C. W. Crystal structure predictionDawn of a new era. Angew. Chem., Int. Ed. 2011, 50, 5616−5617. (3) Price, S. L.; Price, L. S. Computational Polymorph Prediction. In Solid State Characterization of Pharmaceuticals; Storey, R. A., Ymén, I., Eds.; John Wiley & Sons, Ltd.: Chichester, U.K., 2011; pp 427−450. (4) Day, G. M. Current approaches to prediction molecular organic crystal strucutres. Crystallogr. Rev. 2011, 17, 3−52. (5) Kendrick, J.; et al. Progress in crystal structure prediction. Chem.Eur. J. 2011, 17, 10736−10744. (6) Motherwell, W. D. S.; et al. Crystal structure prediction of small organic molecules: A second blind test. Acta Crystallogr., Sect. B 2002, 58, 647−661. (7) Day, G. M.; et al. A third blind test of crystal structure prediction. Acta Crystallogr., Sect. B 2005, 61, 511−527. 2188

dx.doi.org/10.1021/cg4002797 | Cryst. Growth Des. 2013, 13, 2181−2189

Crystal Growth & Design

Article

(30) Habgood, M.; et al. Testing a variety of enectronic-structurebased methods for the relative energies of 5-formyluracil crystals. J. Chem. Theory Comput. 2011, 7, 2685−2688. (31) Karamertzanis, P. G.; Price, S. L. Energy minimization of crystal structures containing flexible molecules. J. Chem. Theory Comput. 2006, 2, 1184−1199. (32) Chisholm, J. A.; Motherwell, S. COMPACK: A program for identifying crystal structure similarity using distances. J. Appl. Crystallogr. 2005, 38, 228−231. (33) Vanderbilt, D. Soft self-consistent pseudopotentials in a generalized eigenvalue formalism. Phys. Rev. B 1990, 41, 7892−7895. (34) Perdew, J. P.; Burke, K.; Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 1996, 77, 3865−3868. (35) Perdew, J. P.; Burke, K.; Wang, Y. Generalized gradient approximation for the exchange-correlation hole of a many-electron system. Phys. Rev. B: Condens. Matter Mater. Phys. 1996, 54, 16533− 16539. (36) Rappe, A. M.; et al. Optimized pseudopotentials. Phys. Rev. B 1990, 41, 1227. (37) Becke, A. D. Density-functional exchange-energy approximation with correct asymptotic behavior. Phys. Rev. A 1988, 38, 3098−100. (38) Lee, C.; Yang, W.; Parr, R. G. Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys. Rev. B: Condens. Matter Mater. Phys. 1988, 37, 785. (39) Yu, L.; Huang, J.; Jones, K. J. Measuring free-energy difference between crystal polymorphs through eutectic melting. J. Phys. Chem. B 2005, 109, 19915−19922. (40) van de Streek, J.; Neumann, M. A. Validation of experimental molecular crystal structures with dispersion-corrected density functional theory calculations. Acta Crystallogr. 2010, B66, 544−558.

2189

dx.doi.org/10.1021/cg4002797 | Cryst. Growth Des. 2013, 13, 2181−2189