
Concept Article pubs.acs.org/OPRD

Beyond the Numbers: Charting Chemical Reaction Space

Paul M. Murray, Simon N. G. Tyler, and Jonathan D. Moseley*

CatScI Ltd., CBTC2, Capital Business Park, Wentloog, Cardiff CF3 2PX, United Kingdom

ABSTRACT: We present here an informed estimate of the millions of parameter settings that might be required to optimise one typical transition-metal-catalysed reaction. We describe briefly how both Design of Experiments (DoE) and Principal Component Analysis (PCA) techniques may be combined to reduce the number of potential reaction settings to a practical number of experiments without losing critical information. A key feature of this approach is the ability to relate discrete or discontinuous parameters to one another. The methodology is presented so that any reaction may be assessed in a similar way. We believe this represents for the first time an informed estimate of the number of potential permutations that are possible for these types of reactions in particular, and therefore the enormity of the task in optimising them. The powerful combination of DoE and PCA applied systematically and in an experimentally directed approach is beneficial for optimising reactions, particularly challenging transition-metal-catalysed reactions. However, this approach is beneficial to all reactions, especially when dealing with discrete parameters, such as solvents for example.



INTRODUCTION

All chemists appreciate that a potentially large number of parameters can affect the performance and outcome of a given chemical reaction. Obvious physical factors which can affect nearly all reactions are concentration, pH, temperature and time.1 Many reactions are bimolecular or higher order, and therefore the stoichiometry between the two starting materials is likely to be critical. Reactions involving gases will be dependent on the gas pressure above a liquid or solid, or the solubility of gas in liquid, in which case pressure effectively becomes a surrogate measure of concentration or stoichiometry. Reactions involving suspended solids may be sensitive to the surface area and other properties of the solid,2 in addition to the type and speed of agitation and vessel geometry,3 which along with other mixing effects more familiar to engineers often affect reactions on scale-up. And an often overlooked but very important factor is the choice and nature of the solvent.4−6 All these parameters will play a role in the reaction, although typically for a noncatalytic reaction, often only a few might be significant (Table 1). The parameters listed in Table 1 can be separated into two types: continuous parameters and discontinuous or discrete parameters. Most of the parameters listed are continuous,7 which means there is an infinite choice of settings, albeit allowing for some obvious physical limitations such as, for example, the freezing point of the solvent.8 Recognising, understanding, and controlling these parameters at their potentially multitudinous levels will give the best opportunity for achieving the desired reaction outcome, and chemists have proven adept at achieving this, both in academic synthesis9 and large-scale manufacture, including in catalysis.10 In transition-metal-catalysed reactions, however, the challenge is greatly increased by the larger number of parameters that need to be assessed, and the fact that most of these additional parameters are also discrete. Thus, in addition to those listed in Table 1 for a noncatalysed process, the additional parameters for a metal-catalysed process would include metal precatalyst, catalyst loading, choice of ligand, and metal-to-ligand ratio. For

Table 1. Parameters that may affect a reaction

Typical Noncatalytic Reaction

continuous parameters          discrete parameters
addition rate (reagent)        agitation method
agitation rate                 order of addition (reagent)
concentration^a                reagent identity^a
pH^a                           solvent identity^a
pressure                       vessel geometry
stoichiometry (reagent)^a
surface area (reagent)
temperature^a
time^a

Typical Catalytic Reaction

continuous parameters          discrete parameters
catalyst:additive ratio        additive identity^a
catalyst formation time        base identity^a
catalyst:reagent ratio^a       catalyst identity^a
metal:ligand ratio^a           catalyst addition method
stoichiometry (additive)^a     ligand identity^a

^a Factors likely to be significant for a Suzuki reaction.

homogeneous catalysts, which are often best generated separately from the main reaction substrates, formation time for the catalyst−ligand complex and the mode of addition to the main vessel may be additional parameters. Catalytic reactions also may be more susceptible to solvent and additive effects, due to coordination with the catalyst−ligand complex; however, this can be due to other reasons, especially with additives.11 If additives are required, the nature and stoichiometry of these are also likely to be important. These additional parameters are also listed in Table 1, along with an indication of whether they are continuous or discrete. Given the large numbers of precatalysts, ligands, additives, and solvents

Received: October 3, 2012


dx.doi.org/10.1021/op300275p | Org. Process Res. Dev. XXXX, XXX, XXX−XXX

Organic Process Research & Development


place, and even if it had not, an experienced chemist employing a well-known reaction like the Suzuki reaction should be able to propose sensible parameter settings on the basis of literature data. If “time” has been eliminated from the consideration, this gives 128 (i.e., 2^7) permutations already, probably more than most chemists would want to explore practically.16 However, the numbers really increase when trying to take the discrete parameters into account.

The Significance of the Discrete Parameters. The important aspect of the four discrete parameters is that they are by nature discontinuous with respect to one another. Although it might be argued that one metal−ligand combination can be related to another using the same metal and a similar ligand, perhaps by comparing their cone or bite angles, for example,17 this will only be valid for closely related metal−ligand combinations. How will they be compared when the bite angle gets either too large or small to function; or when a different metal is used; or a different class of ligand is added which does not have a bite angle at all? The discontinuity is more obviously illustrated when considering solvents. How does one compare such different solvents as hexane and DMSO for example? Comparing their relative polarity, which is often related to solubility, might be one way, but this may not be the most important factor for the reaction in question. Additionally, this simple trend may not be relevant in any case, given for example that chlorinated solvents have good solubilising power despite their low-to-medium polarity (as judged by their dielectric constant).18 On the other hand, the high polarity and solubilising power of alcohols might be compromised in the desired reaction by their protic nature, whilst the “unrelated” hexane and DMSO are both aprotic.
In summary, replacing any one of the discrete parameters with another does not yield a different value on the same axis of a graph, as it would for a continuous parameter; instead it requires a different graph with different axes which may have no meaningful relationship to the first one whatsoever. This means that every single combination of catalyst/ligand/base/solvent is essentially a different reaction for the same two starting materials to produce the same product.

For a homogeneous transition-metal-catalysed reaction, of which the Suzuki reaction is typical, it is the metal−ligand combination that constitutes the active catalyst. Some metal−ligand combinations are commercially available, but they are often better generated in situ.12 Considering just one metal for the reaction (i.e., Pd), this can effectively be supplied in one of two sources, a Pd(II) precatalyst (e.g., Pd(OAc)2) or a prereduced Pd(0) source (e.g., Pd2dba3). Either Pd source can be combined in situ with any of the commercially available ligands which could reasonably be considered for this reaction, of which there are over 500,19 not counting the thousands of academically prepared and tested ligands.20 There are also a large number of bases, both organic and inorganic, which can be used for the reaction. Fortunately, these are usually less critical in affecting the reaction, so the choice could reasonably be cut to just four, two organic, two inorganic, one strong and one weak base of each. Lastly, there are in excess of 100 possible commercially available reaction solvents.21 Any combination of these parameters might just give the desired reactivity and selectivity for the requisite pair of substrates in Scheme 1. So, given this, how many distinct reactions are there for our generic Suzuki reaction?

Just How Many Reactions are There? For illustrative purposes, let us continue from the 128 experiments required for

available, there are therefore a huge number of permutations for any given transition-metal-catalysed reaction.12

This article presents for the first time, we believe, an informed estimate of just how large the number of permutations is for transition-metal-catalysed reactions. We will show a worked example for one common transition-metal-catalysed reaction (the Suzuki−Miyaura reaction) which is representative of the class. We then discuss briefly how Design of Experiments (DoE) can be applied to reduce the number of experiments required to explore the continuous parameters; and how Principal Component Analysis (PCA) can be applied to reduce the number of experiments required to explore the discrete parameters. These two techniques can be used simultaneously to manipulate the vast number of possible reactions into an extremely informative and manageable experimental programme of which the practical laboratory chemist can take advantage.13 Whilst this approach is particularly relevant to transition metal catalysis, it is equally applicable to other spheres of synthetic chemistry.



DISCUSSION

An Example Reaction. Let us take as the starting point the versatile and widely used Suzuki−Miyaura reaction14 between a simple aromatic halide (1) and boronic acid (2) to give in this case the biphenyl adduct (3) (Scheme 1).15

Scheme 1. Typical generic Suzuki reaction^a

^a X = halide or pseudohalide.

The significant parameters for the Suzuki reaction are typically: stoichiometry (reactants), stoichiometry (base), stoichiometry (water), precatalyst loading, metal:ligand ratio, concentration, temperature, time; and the identity of the precatalyst, ligand, base, and solvent (Table 1, footnote a). Ignoring for the moment that the last four are, in fact, discrete parameters, this gives 12 parameters in total, which, even if they were investigated at only two settings each, would give 4096 (i.e., 2^12) unique experiments. However, “time” can be removed from the list because it is relatively easy to take multiple samples from the same reaction, especially using automation technology. In fact, this provides much more data (in the form of reaction profiles) and therefore greater reaction understanding from essentially the same number of reactions, but requires no additional practical work. It does require more analytical capability and processing time, both of which fortunately can be automated. So if multiple sampling is possible, “time” can be treated differently and removed from consideration of the number of reaction parameters that require investigation. Re-examination of the list reveals that the first seven parameters are continuous, whereas the last four (the choice of the precatalyst, ligand, base, and solvent) are discrete parameters that can only be defined by their nature. Considering the continuous parameters only and applying a typical DoE approach would suggest testing each at two settings to get an understanding of their relative importance. It is likely that some scouting work would already have taken


and scientifically rational manner. The inclusion of a small subset of repeat experiments provides an accurate measure of experimental variation and reproducibility in the reactions. Statistical analysis then allows for the interpretation of the significance of each of the parameters investigated. Indeed, the statistical analysis of a rationally designed set of experiments allows for much more data to be obtained from a limited set of experiments than is normally the case with the one variable at a time (OVAT) approach.

Principal Component Analysis. The quantitative investigation of how different chemical components affect a given reaction is not trivial. Principal Component Analysis (PCA) is a multivariate data analysis technique that can use basic chemical properties to represent each reaction component, including reagents and solvents, in order to relate discrete chemical variants of each component to one another. Through methods developed by Carlson,23,27 the understanding and use of basic chemical properties allows the development of principal components that enable discrete parameters to be considered in the same manner as continuous parameters. These basic chemical properties could be measurable physical factors (e.g., bp, density, bond length), or calculated and theoretical ones (e.g., electron density, Hansen solubility parameters, Kamlet−Taft solvent polarity parameters). Carlson has used PCA to describe solvents and other chemical classes such as Lewis acids and amines.28 In order to facilitate investigation of catalysed reactions more efficiently using PCA, high quality chemical descriptors (property data) have been developed by others for monodentate phosphine, bidentate phosphine, and carbene ligand classes.29 The use of PCA allows a representative subset of chemicals to be selected to cover the overall set of chemicals for that reaction component.
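To make the idea concrete, the following Python sketch performs a generic PCA (autoscaling followed by a singular value decomposition) on a table of chemical descriptors. It is an illustration only and not the procedure used by Carlson or in this article: the function name `pca_scores` is hypothetical, and the descriptor table is filled with random placeholder numbers rather than real property data for solvents or ligands.

```python
import numpy as np


def pca_scores(descriptors, n_components=3):
    """Return PCA scores and explained-variance fractions for a descriptor table.

    Rows are chemicals (e.g., solvents), columns are properties (e.g., bp, density).
    """
    X = np.asarray(descriptors, dtype=float)
    # Autoscale: centre each property and divide by its standard deviation,
    # so that properties with large numerical ranges do not dominate.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # The SVD of the scaled table gives the principal components directly.
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = U[:, :n_components] * s[:n_components]
    return scores, explained[:n_components]


# Placeholder data: 100 hypothetical chemicals described by 8 properties.
rng = np.random.default_rng(0)
placeholder_table = rng.normal(size=(100, 8))

scores, explained = pca_scores(placeholder_table)
print("fraction of variation explained by PC1-PC3:", explained.sum())
```

Each row of `scores` gives one chemical's coordinates on the first three principal components, which is what allows a discrete choice such as solvent identity to be positioned on continuous axes. With real, correlated property data the first few components would typically explain far more of the variation than they do for the random placeholder table used here.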
The number of chemical entities that are required in the subset in order for it to be truly representative depends on a number of factors. Typically, three principal components (treated as three continuous parameters) will often explain much of the variation (i.e., >75%). In such cases, the selection of nine discrete chemicals (i.e., settings) is a reasonable number to ensure coverage of the diversity of the chemical space. If more principal components are required to account for >75% of the variation in results, then a larger selection of the discrete parameters will be required to effectively cover the chemical space. This is perhaps more easily visualised in Figure 1, which attempts to represent how a set of discrete parameters (which could be individual chemicals such as ligands, solvents etc., or other discontinuous factors) are related to three principal components in three-dimensional space. In this way they can essentially be mapped to one another. If more principal components are required to build a reasonable model (i.e., >75%), they can be added in, but the result is of course more difficult to visualise. As outlined above, PCA is a data dimension reduction technique. It combines property data for each chemical entity in the set of any reaction component to produce a reduced set of principal components which describe the chemical variation in that reaction component. Accordingly, the use of PCA allows the consideration of a discontinuous variable, such as a range of bases or solvents, in a manner similar to a continuous variable, such as temperature or stoichiometry. PCA necessarily contains no new information, but it does give a measure of how well it describes the variation in results and how much information is explained by the principal components. If significant properties

investigating the seven continuous parameters at just two levels. Testing 500 ligands with just the two Pd catalyst precursors increases the number to 128,000 experiments. Testing each against the limited choice of four bases raises this to 512,000 experiments. Finally, conducting each of these in the 100 commercially available solvents increases the number of potential experiments to 51.2 million. It is important to recognise that this is still a simplification. If four levels were chosen to give more definition to the continuous variables, the number of experiments would increase to a barely comprehensible 6554 million! Despite the enormous numbers cited, this calculation has still required some gross assumptions, and a huge area of chemical reaction space has been automatically excluded. For example, alternative metals to Pd (e.g., Ni, Fe, Cu) have not been considered, and even for Pd, only two precatalysts have been chosen. A wider range of bases could also have been considered encompassing the full suite of alkali metal carbonates and hydroxides, and organic amines. There are also other practical issues that usually need consideration in transition-metal-catalysed reactions which may not be easy to perform and are even harder to predict in their significance. The formation of the metal−ligand complex in particular can be critical, especially for more challenging reactions; and the mode of addition of the organometallic reagent can be important, for example in Suzuki reactions where protodeboronation of electron-rich boronic acids is a potential problem.22 There has been a tendency in some quarters to use high throughput experimentation to combat the numbers in the development of catalysed reactions. However, it will be apparent that even a very large screen would make little impact on the 51.2 million combinations calculated above.
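The arithmetic behind these figures is easy to reproduce. The short Python sketch below simply multiplies the number of settings for each parameter, using the counts quoted in the text (seven continuous parameters, two Pd sources, 500 ligands, four bases, 100 solvents); the variable and function names are illustrative only.

```python
from math import prod

N_CONTINUOUS = 7  # continuous parameters remaining once "time" is excluded

# Number of options considered for each discrete parameter (values from the text).
discrete_options = {"Pd sources": 2, "ligands": 500, "bases": 4, "solvents": 100}


def total_experiments(levels):
    """Full-factorial count: every continuous parameter at `levels` settings,
    combined with every combination of the discrete options."""
    return levels**N_CONTINUOUS * prod(discrete_options.values())


# Build up the two-level total step by step, as in the text.
running = 2**N_CONTINUOUS
print(f"{'continuous only':>16}: {running:>13,}")
for name, count in discrete_options.items():
    running *= count
    print(f"{'x ' + name:>16}: {running:>13,}")

print("two levels :", f"{total_experiments(2):,}")   # 51,200,000
print("four levels:", f"{total_experiments(4):,}")   # 6,553,600,000
```

The same function shows how quickly the total grows with the number of levels: moving from two to four settings per continuous parameter multiplies the count by 2^7 = 128.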
Such a random selection approach is flawed and fails if not all combinations are explored because each reaction is essentially independent of its apparent “neighbours”. Desirable permutations may be missed, even though the reaction space appears to be well covered, because it is not possible to relate the results of one experiment to another. Furthermore, attempting to run all these experiments is impractical if only because using just 10 mg of the key starting material in each experiment would require 512 kg in total, not to mention all the other reagents and solvents. This is of course both ridiculous in theory and impossible in practice, due to the vast quantity of time and materials required for such an endeavour. Unless there is some form of structured approach to reduce the number of experiments, then only a very limited and selective investigation of the reaction space can ever be carried out. How then can the discrete parameters be handled in such a way as to reduce this to a manageable number of experiments? Fortunately, a structured approach using statistical Design of Experiments in combination with Principal Component Analysis does exist which allows for the treatment of discrete parameters in a continuous manner. This can then be used to significantly reduce the huge number of possible experiments to something achievable in practice.

Design of Experiments. The use of factorial experimental design, or Design of Experiments (DoE) as it is more widely known, has been mentioned in passing above, and is a powerful technique which can be used to explore important parameters in a chemical reaction.23,24 It is well-known to readers of this journal,25 and has been widely used in industry and in many other academic disciplines.26 It facilitates the systematic variation of multiple factors simultaneously in a highly efficient


Figure 1. Uncharted chemical space: discrete parameters (e.g., ligands, solvents) mapped against three axes (principal components, PC1, PC2, PC3).

Figure 2. Making sense of the chemical space: rational selection typical of an initial screening design based on an approximate cube (selections in dark blue).

are not explained by the (first three) principal components, a re-analysis of the data against the experimental results may be able to identify the important property(ies) overlooked.

Combining DoE and PCA. Experimental designs (i.e., DoE techniques) can now be devised to investigate a specific parameter and its interaction with other parameters at several levels of understanding. For example, although linear models are insufficiently powerful to provide information on interactions, they are very useful in screening situations with a view to determining which of many variables are important. Interaction models include cross-product terms, so that it is possible to assess interaction effects, something that is always advisable to consider. Quadratic models can describe nonlinear behaviour for variables by taking the curvature of the response surface into account. The construction of a quadratic model necessarily requires (many) more experiments than that of a linear model. Therefore, before designing the set of experiments to be conducted in the laboratory, one must decide how detailed the information needs to be.

What Does This Mean in Practice? We have tried and tested PCA-based maps for ligands and solvents.30 This enables a reduction in the numbers of ligands from 500 down to a representative set of nine, and a reduction for solvents from 100 also down to a representative set of nine.31 The choice of nine representatives at the approximate points of a cube (with one centre point) allows good initial coverage of the (in this case, three-dimensional) chemical space being described for that parameter (Figure 2). If a more detailed (multidimensional) model is required, then a wider choice of discrete parameters could then be chosen, focussing in on the area(s) of interest.
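One simple way to picture the choice of nine representatives is sketched below in Python. The article does not state the selection algorithm, so this is an assumption for illustration: given each candidate's scores on three principal components, the hypothetical helper `pick_cube_representatives` takes the eight corners of the bounding box in PC space plus its centre and, for each of these nine target points, picks the nearest candidate not already chosen. The scores are random placeholders.

```python
import itertools

import numpy as np


def pick_cube_representatives(scores):
    """Pick nine candidates lying near the 8 corners and the centre of the
    bounding box of the first three principal-component scores."""
    S = np.asarray(scores, dtype=float)[:, :3]
    if len(S) < 9:
        raise ValueError("need at least nine candidates")
    lo, hi = S.min(axis=0), S.max(axis=0)
    # Eight corner targets: each axis at its minimum or maximum score.
    targets = [np.where(np.array(corner) == 1, hi, lo)
               for corner in itertools.product((0, 1), repeat=3)]
    targets.append((lo + hi) / 2)  # centre point
    chosen = []
    for target in targets:
        # Candidates sorted by distance to this target; skip ones already used.
        order = np.argsort(np.linalg.norm(S - target, axis=1))
        chosen.append(next(int(i) for i in order if int(i) not in chosen))
    return chosen


# Placeholder scores for 100 hypothetical solvents on PC1-PC3.
rng = np.random.default_rng(1)
placeholder_scores = rng.normal(size=(100, 3))
representatives = pick_cube_representatives(placeholder_scores)
print(representatives)  # row indices of the nine selected candidates
```

In practice the selection would also weigh factors this sketch ignores, such as availability, cost, and chemical compatibility with the reaction.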
Applying this PCA technique to our example reaction has a startling effect, and leads to an initial decrease in the number of potential experiments from 51.2 million down to 82,944 (i.e., 2^7 × 2 Pd sources × 9 ligands × 4 bases × 9 solvents), but this still requires 829 g of starting material if all these experiments were carried out on a 10 mg scale. However, by applying one of several possible factorial experimental designs, the planned study can be reduced to the more practical number of just 35 experiments!32 If the numbers of factors investigated can be reduced, then the numbers of experiments could sensibly be reduced still further. It is, of course, unlikely that a design of 35 experiments would generate complete understanding for a reaction; after all,

you cannot get the data from 51 million experiments in just 35. However, it would rapidly allow the identification of the most important parameters for the reaction, crucially indicating the significance of the discrete parameters, and directing further work to the most promising areas. The most important continuous factors and the important regions of ligand and solvent space having been identified, further ligands or solvents can be investigated in more detail in the relevant areas (Figure 3). It is likely that no more than two

Figure 3. Results of an initial screening design (example only). Key: green = good; amber = moderate; red = poor; light blue = untested; dark blue = proposed for further study.

iterations will be required to find the best region. This enables the identification of structurally diverse materials (and alternative reaction conditions) which will give similar reaction outcomes. An optimisation design can then be carried out on the best catalyst, ligand, and solvent, along with the other important parameters. This final design should now be able to reveal all the interaction information about the reaction. The result of this approach should be the rapid identification by rational selection of highly optimised alternative reaction conditions, which may be higher-yielding, more economic, or IP-free, for example.
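The effect of the PCA-based selection on the scale of the problem, quoted earlier in this section, can be checked with a few lines of Python. The sketch uses only figures given in the text (nine representative ligands and solvents in place of 500 and 100, and 10 mg of key starting material per experiment); the names are illustrative.

```python
CONTINUOUS_RUNS = 2**7   # seven continuous parameters at two levels
MG_PER_EXPERIMENT = 10   # key starting material per experiment, from the text


def candidate_experiments(pd_sources, ligands, bases, solvents):
    """Number of full-factorial experiments for the given discrete choices."""
    return CONTINUOUS_RUNS * pd_sources * ligands * bases * solvents


def material_kg(n_experiments):
    """Total key starting material in kilograms (1 kg = 1e6 mg)."""
    return n_experiments * MG_PER_EXPERIMENT / 1e6


full = candidate_experiments(2, 500, 4, 100)  # all ligands and solvents
reduced = candidate_experiments(2, 9, 4, 9)   # PCA-selected representatives

print(f"full:    {full:,} experiments, {material_kg(full):.0f} kg")
print(f"reduced: {reduced:,} experiments, {material_kg(reduced) * 1000:.0f} g")
print(f"reduction factor: {full / reduced:.0f}x")
```

Even the reduced set is far beyond what can be run in practice, which is why a factorial design is then applied to select a few dozen experiments from it.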


A Worked Example. To aid a better understanding of this process, an instructive example from actual practice may be helpful. Consider the Buchwald−Hartwig sulfamidation of a heteroaromatic chloride 4 with a sulphonamide 5 to give product 6, as shown generically in Scheme 2. Although the

necessarily the case with other techniques for experimental selection, particularly OVAT.



CONCLUSIONS

We have presented here an assessment of the types and number of reaction parameters that can affect all reactions, and notably for transition-metal-catalysed reactions. From this we have estimated that there are 51.2 million reaction permutations that might be applicable to the investigation of a typical transition-metal-catalysed reaction. However, given the relative complexity of many if not all such reactions, we believe this is a representative estimate. We believe this represents for the first time an informed estimate of the number of permutations possible and therefore the enormity of the task in optimising chemical reactions. Furthermore, we argue that a more systematic and experimentally directed approach to optimising such challenging reactions is essential to truly navigate the vastness of chemical reaction space. The powerful combination of DoE and PCA is applicable to all reactions, especially in dealing with discrete (discontinuous) parameters such as solvent, itself a critical and often overlooked variable. Of course, many reactions have many successful combinations which will not be difficult to discover and optimise (as shown by our generic procedure for the Suzuki reaction).33 Instead, this analysis is directed towards that subset of reactions within each group which are more challenging; it provides an estimate of the size of the potential problem and, more importantly, offers a solution to dealing with the large numbers involved. It is not acceptable simply to rely on serendipity to chart chemical space!

Scheme 2. Generic Buchwald−Hartwig sulfamidation reaction

initial reaction conditions gave good conversion in our hands, they used an expensive ligand. The aim of this study was to find another ligand without compromising on beneficial aspects of the reaction. In the initial round of screening, 35 reactions were conducted which investigated a selection of nine ligands and nine solvents, with two Pd sources [Pd(OAc)2 and Pd2dba3] and two metal:ligand ratios (1:1 and 1:2) also investigated. All other parameters were kept constant at this stage. From this screen, two new catalyst systems (i.e., metal:ligand combinations) were identified which gave good conversion. One of these new ligands was half the cost of the ligand initially employed. A preferred solvent was also identified. A second iteration of only 12 reactions in the region of optimum ligand space (cf. Figure 3) identified four more ligands giving good or complete conversion. A further iteration of 12 reactions in this developing “hot spot” identified an additional four ligands that also worked well. Commercial analysis of these alternative ligands identified one that was both free of restrictive IP and cost-effective in practice. An optimisation design of 19 experiments (i.e., a standard design of 16 experiments with three centre point repeats) was then employed to finalise the ranges on the stoichiometries used (e.g., metal and ligand loading, excess of the second starting material), whilst the other parameters were fixed (e.g., choice of metal, ligand, solvent, base, etc.). Overall, these four iterations identified a cheap and IP-free ligand from the many which were available and which gave a comparable reaction profile, metal loading, and substrate stoichiometry to those of the initial reaction conditions. Furthermore, these new conditions were identified and fully optimised in less than 80 test-tube-scale reactions. 
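The experiment counts in this worked example can be tallied as follows. The Python sketch also builds one possible 19-run optimisation design in coded units (−1, 0, +1). The article specifies only "a standard design of 16 experiments with three centre point repeats"; the use of a two-level full factorial in four factors here is an assumption made for illustration, and a half-fraction in five factors would give the same run count.

```python
import itertools


def factorial_with_centre_points(n_factors=4, n_centre=3):
    """Two-level full factorial in coded units (-1/+1) plus centre-point
    repeats (all factors at 0) to estimate experimental variation."""
    corners = [list(run) for run in itertools.product((-1, 1), repeat=n_factors)]
    centres = [[0] * n_factors for _ in range(n_centre)]
    return corners + centres


design = factorial_with_centre_points()

# Experiments per iteration, as reported in the worked example.
iterations = {
    "initial screen": 35,
    "second iteration": 12,
    "third iteration": 12,
    "optimisation design": len(design),
}
total = sum(iterations.values())

for name, n in iterations.items():
    print(f"{name:>20}: {n}")
print(f"{'total':>20}: {total}")
```

The total of 78 is consistent with the statement that the conditions were identified and optimised in fewer than 80 reactions.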
In summary, the use of DoE and PCA techniques allows for the judicious reduction of the 51.2 million experiments described in the example above to an initial 35 experiments, depending on the level of interpretation of the data required. Whilst one cannot expect to get something for nothing, utilising statistical DoE in combination with PCA allows the numbers of experiments to be reduced significantly and in a rational manner in order to define a practical and effective experimental procedure. From a planning point of view, typically 2−4 designs should be expected to get a fully optimised process, depending on the level of optimisation required. After the initial larger design, subsequent designs would typically be smaller (12−24 experiments each), targeted around the reaction space of interest (Figure 3). In this context, it is valuable to remember that each design can be straightforwardly related back to the whole design space, so a coherent set of data is gathered and increasingly focused in the optimum area. This is not



AUTHOR INFORMATION

Corresponding Author

*Telephone: +44 29 2083 7444. E-mail: jonathan.moseley@catsci.com.

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

We thank David Hose (AstraZeneca, Macclesfield) for some helpful discussions on this subject.



REFERENCES

(1) Temperature and time are effectively related, since in practice chemists often use them as different aspects of the same parameter (i.e., the energy supplied to the reaction mixture); hence, the success of microwave chemistry which shortens time by increasing temperature.
(2) (a) Custers, J. P. A.; Hersmis, M. C.; Meuldijk, J.; Vekemans, J. A. J. M.; Hulshof, L. A. Org. Process Res. Dev. 2002, 6, 645−651. (b) Moseley, J. D.; Bansal, P.; Bowden, S. A.; Couch, A. E. M.; Hubacek, I.; Weingärtner, G. Org. Process Res. Dev. 2006, 10, 153−158.
(3) Anderson, N. G. Practical Considerations for Scale-Up. In Practical Process Research and Development: A Guide For Organic Chemists, 2nd ed.; Academic Press: New York, 2012.
(4) Reichardt, C.; Welton, T. Solvents and Solvent Effects in Organic Chemistry, 4th ed.; Wiley-VCH: Weinheim, 2011.
(5) Anderson, N. G. Solvent Selection. In Practical Process Research and Development: A Guide For Organic Chemists, 2nd ed.; Academic Press: New York, 2012.
(6) Lathbury, D. Org. Process Res. Dev. 2007, 11, 104 and references therein.


(7) Although at small scale the agitation method and vessel geometry are rarely important (although see the following for a rare example at small scale: Lennox, A. J. J.; Lloyd-Jones, G. C. J. Am. Chem. Soc. 2012, 134, 7431−7441), such issues often become significant on scale-up as discussed in ref 3.
(8) It is worth noting that the solvent boiling point may not be limiting when using autoclave or sealed vessel microwave reactors. Indeed, at least one experimental design study predicted that better results could be achieved above the solvent boiling point, for which the authors then used microwave technology; see: Gopalsamy, A.; Shi, M.; Nilakantan, R. Org. Process Res. Dev. 2007, 11, 450−454.
(9) (a) Corey, E. J.; Cheng, X.-M. The Logic of Chemical Synthesis; John Wiley and Sons: New York, 1995. (b) Nicolaou, K. C.; Sorensen, E. J. Classics in Total Synthesis; VCH: Weinheim, 1996.
(10) (a) Blaser, H. U., Schmidt, E., Eds. Asymmetric Catalysis on Industrial Scale; Wiley-VCH: Weinheim, 2004. (b) Blaser, H. U., Federsel, H.-J., Eds. Asymmetric Catalysis on Industrial Scale, 2nd ed.; Wiley-VCH: Weinheim, 2010. (c) Slagt, V. F.; de Vries, A. H. M.; de Vries, J. G.; Kellogg, R. M. Org. Process Res. Dev. 2010, 14, 30−47. (d) Magano, J.; Dunetz, J. R. Chem. Rev. 2011, 111, 2177−2250.
(11) These could be acids, bases, cosolvents, inhibitors, Lewis acids/bases, ligands, phase transfer catalysts, promoters, radical initiators, reoxidants, sponges, zeolites/clays, etc. Recent examples of additives modifying catalytic reactions include: (a) Lewis acid cooperation in Ni catalysis: Nakao, Y.; Yamada, Y.; Kashihara, N.; Hiyama, T. J. Am. Chem. Soc. 2010, 132, 13666−13668. (b) Enhancement of enantioselectivity by alcohols in an asymmetric hydrogenation reaction: Ito, J.-I.; Teshima, T.; Nishiyama, H. Chem. Commun. 2012, 48, 1105−1107. (c) Use of Zn in a Hiyama reaction: Minami, Y.; Shiraishi, Y.; Yamada, K.; Hiyama, T. J. Am. Chem. Soc. 2012, 134, 6124−6127.
(12) For further practical issues, see Anderson, N. G. Optimizing Catalytic Reactions. In Practical Process Research and Development: A Guide For Organic Chemists, 2nd ed.; Academic Press: New York, 2012.
(13) Rothenberg has also estimated large numbers on the basis of a library of “virtual catalysts” derived from computational techniques. An initial selection of catalysts is screened, and the results are treated in a QSAR fashion for following iterations. This approach does use PCA for the ligands and solvents, but necessarily uses fewer descriptors to ease the computational demands. Catalysts are prepared in a combinatorial approach and are not necessarily isolated (or even individual) species. In our approach, we have sought to use commercially available ligands and solvents to calculate our numbers, and there is a greater emphasis on combining optimisation of the continuous (physical) parameters with the discrete (chemical) ones to optimise the overall reaction. Rothenberg’s approach has more focus on the catalyst/ligand combination from a theoretical perspective, whereas ours is more pragmatic for all reaction aspects. However, both approaches have much in common, and emphasise that rational selection must take place to cope with the large number of reactions possible. See: (a) Maldonado, A. G.; Rothenberg, G. Chem. Soc. Rev. 2010, 39, 1891−1902. (b) Burello, E.; Rothenberg, G. Int. J. Mol. Sci. 2006, 7, 375−404. (c) Burello, E.; Rothenberg, G. Adv. Synth. Catal. 2005, 347, 1969−1977. (d) Burello, E.; Farrusseng, D.; Rothenberg, G. Adv. Synth. Catal. 2004, 346, 1844−1853. (e) Burello, E.; Rothenberg, G. Adv. Synth. Catal. 2003, 345, 1334−1340.
(14) Suzuki, A. Angew. Chem., Int. Ed. 2011, 50, 6722−6737.
(15) The Suzuki reaction can of course be used to couple other halides/pseudohalides and boronic acids/esters to yield other structural motifs (see ref 14).
(16) Although not always! See, for example: Denmark, S. E.; Butler, C. R. J. Am. Chem. Soc. 2008, 130, 3690−3704.
(17) (a) Tolman, C. A. Chem. Rev. 1977, 77, 313−348. (b) Bunten, K. A.; Chen, L.; Fernandez, A. L.; Poe, A. J. Coord. Chem. Rev. 2002, 233−234, 41−51. (c) Freixa, Z.; van Leeuwen, P. W. N. M. Dalton Trans. 2003, 1890−1901.
(18) (a) Table A-1, pp 408−413 in Reichardt (ref 4). (b) Table 4.3, pp 86−87 in Anderson (ref 5). (c) Table 1, p. 35 in Hayes, B. L. Microwave Synthesis: Chemistry at the Speed of Light; CEM Publishing: Matthews, NC, 2002.
(19) For example, Strem offers well over 500 trivalent phosphorus compounds that could be considered as potential ligands. Other suppliers offer fewer, but still in the range of hundreds.
(20) For example, a SciFinder search (August 2012) of trivalent phosphorus compounds (i.e., potential ligands) with R1/R2 = C, H, N, O and R3 = C, H, N, O, Cl, F found over 457,000 unique CAS numbers. Of these, only a few hundred are specific coordination complexes with platinum group metals, which indicates that many potential catalyst species are still unknown.
(21) For example, Reichardt (ref 18a) lists a “compilation of [one] hundred important organic solvents and their physical constants” in his monograph, not counting a further 27 chiral solvents. About half of these are available on bulk manufacturing scale, and the rest are readily available at research scales if not at intermediate (pilot plant) scales. Anderson (ref 18b) lists 34 specific “solvents useful for scale-up”.
(22) (a) Lennox, A. J. J.; Lloyd-Jones, G. C. Isr. J. Chem. 2010, 50, 664−674. (b) Miyaura, N. Top. Curr. Chem. 2002, 219, 11−59.
(23) Carlson, R.; Carlson, J. E. Design and Optimization in Organic Synthesis, 2nd ed.; Data Handling in Science and Technology; Elsevier: Amsterdam, 2005; Vol. 24.
(24) (a) Owen, M. R.; Luscombe, C.; Lai, L.-W.; Godbert, S.; Crookes, D. L.; Emiabata-Smith, D. Org. Process Res. Dev. 2001, 5, 308−323. (b) Lendrem, D.; Owen, M.; Godbert, S. Org. Process Res. Dev. 2001, 5, 324−327. (c) Roberge, D. M. Org. Process Res. Dev. 2004, 8, 1049−1053. (d) Gozálvez, J. M.; García-Díaz, J. C. J. Chem. Educ. 2006, 83, 647−650.
(25) In its relatively short life, there have been 42 articles in this journal (Org. Process Res. Dev.) specifically mentioning “experimental design” or “design of experiment(s)” in the abstract, not counting the many more in which experimental design techniques have been used as a standard tool without special note.
(26) A SciFinder search (August 2012) identified 960 references for the concept “Design of Experiments”. A small sample of citations is given to exemplify the range of disciplines and industries that use DoE (the subject area is noted where not obvious from the journal title). (a) Subra, P.; Jestin, P. Ind. Eng. Chem. Res. 2000, 39, 4178−4184 (solvents). (b) Martínez, B.; Rincón, F.; Ibáñez, M. V. J. Agric. Food Chem. 2000, 48, 2097−2100. (c) Gooding, O. W.; Vo, L.; Bhattacharyya, S.; Labadie, J. W. J. Comb. Chem. 2002, 4, 576−583. (d) Gooding, O. W. Curr. Opin. Chem. Biol. 2004, 8, 297−304. (e) Muthukumar, M.; Sargunamani, D.; Selvakumar, N.; Rao, J. V. Dyes Pigm. 2004, 63, 127−134. (f) Sjövall, S.; Hansen, L.; Granquist, B. Org. Process Res. Dev. 2004, 8, 802−807 (formulation). (g) Guo, Y.; Srinivasan, S.; Gaiki, S. Chromatographia 2007, 66, 223−229. (h) Prakasham, R. S.; Rao, C. S.; Rao, R. S.; Lakshmi, G. S.; Sarma, P. N. J. Appl. Microbiol. 2007, 102, 1382−1391. (i) Oberg, A. L.; Vitek, O. J. Proteome Res. 2009, 8, 2144−2156 (mass spectrometry). (j) Gruendling, T.; Guilhaus, M.; Barner-Kowollik, C. Macromol. Rapid Commun. 2009, 30, 589−597. (k) Liu, B.; Zhang, Y. Environ. Sci. Technol. 2011, 45, 3504−3510 (geochemistry). (l) Casciato, M. J.; Kim, S.; Lu, J. C.; Hess, D. W.; Grover, M. A. Ind. Eng. Chem. Res. 2012, 51, 4363−4370 (nanomaterials).
(27) Carlson, R.; Carlson, J. E. Org. Process Res. Dev. 2005, 9, 680−689.
(28) Solvents: (a) Carlson, R.; Lundstedt, T.; Albano, C. Acta Chem. Scand., Ser. B 1985, 39, 79−91. Lewis acids: (b) Carlson, R.; Lundstedt, T.; Nordahl, Å.; Prochazka, M. Acta Chem. Scand., Ser. B 1986, 40, 522−533. Amines: (c) Carlson, R.; Prochazka, M. P.; Lundstedt, T. Acta Chem. Scand., Ser. B 1988, 42, 157−165.
(29) Monodentate: (a) Jover, J.; Fey, N.; Harvey, J. N.; Lloyd-Jones, G. C.; Orpen, A. G.; Owen-Smith, G. J. J.; Murray, P.; Hose, D. R. J.; Osborne, R.; Purdie, M. Organometallics 2010, 29, 6245−6258. Bidentate: (b) Jover, J.; Fey, N.; Harvey, J. N.; Lloyd-Jones, G. C.; Orpen, A. G.; Owen-Smith, G. J. J.; Murray, P.; Hose, D. R. J.; Osborne, R.; Purdie, M. Organometallics 2012, 31, 5302−5306. Carbene: (c) Fey, N.; Haddow, M. F.; Harvey, J. N.; McMullin, C. L.; Orpen, A. G. Dalton Trans. 2009, 8183−8196.


(30) CatScI has PCA models for a variety of monodentate and bidentate ligands, solvents, Lewis acids, and amine bases. Solvent, Lewis acid, and amine maps have been published by others, e.g., Carlson (ref 28). By assembling and modelling principal component values, models could be constructed for other classes of reagent or discrete parameters.
(31) The number of solvents included may seem large, but it is worth pointing out that, on commercial projects, the investigation of solvents has led to very successful results. These benefits would not have been discovered without covering a wide range of solvents and the application of these techniques to make a rational and focussed choice.
(32) This is based on a fractional factorial design of 15 factors at two levels: 7 continuous parameters, 3 parameters each for the ligands and the solvents, and one each for the base and the metal precatalyst. A Resolution IV fractional factorial design can provide some information on all 15 factors in 35 experiments (32 reactions with 3 repeats for reproducibility). The full Resolution IV design provides all information on the main parameters, but interactions will be confounded. For greater information, a Resolution V design would require 259 experiments (256 + 3) and will provide information on all parameters and all two-factor interactions free from confounding. In our experience, a Resolution IV design provides more than enough information to identify the important aspects of the reaction. Through further experimentation, any confounded interactions can then be investigated and understood. See ref 23.
(33) Moseley, J. D.; Murray, P. M.; Turp, E. R.; Tyler, S. N. G.; Burn, R. T. Tetrahedron 2012, 68, 6010−6017.
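
The experiment counts quoted in note 32 can be checked with simple arithmetic. The following is a minimal Python sketch, assuming the standard two-level fractional factorial notation 2^(k−p), in which k factors are studied in 2^(k−p) runs. The function name `design_summary` and the dictionary labels are hypothetical and are not taken from the article. The sketch only counts runs; it does not construct a design matrix or verify its resolution, for which note 32 refers to ref 23.

```python
# Factor counts as listed in note 32 (the labels are illustrative only).
FACTORS = {
    "continuous parameters": 7,
    "ligand parameters": 3,
    "solvent parameters": 3,
    "base": 1,
    "metal precatalyst": 1,
}
REPEATS = 3  # repeat reactions added for reproducibility (note 32)


def design_summary(base_runs, n_factors, repeats=REPEATS):
    """Summarise run counts for a two-level fractional factorial 2^(k-p).

    Hypothetical helper: it does not build the design or check its
    resolution, it only does the bookkeeping on the number of runs.
    """
    if base_runs <= 0 or base_runs & (base_runs - 1):
        raise ValueError("base_runs must be a power of two")
    q = base_runs.bit_length() - 1  # base_runs == 2**q
    if q > n_factors:
        raise ValueError("more runs than the full factorial")
    p = n_factors - q  # number of design generators
    return {
        "notation": f"2^({n_factors}-{p})",
        "generators": p,
        "total_runs": base_runs + repeats,
        "fraction_of_full": base_runs / 2 ** n_factors,
    }


k = sum(FACTORS.values())       # total number of two-level factors
full_factorial = 2 ** k         # every combination of high/low settings
res_iv = design_summary(32, k)  # run count quoted for Resolution IV
res_v = design_summary(256, k)  # run count quoted for Resolution V

print(f"{k} factors, full factorial: {full_factorial} runs")
print(f"Resolution IV: {res_iv['notation']}, {res_iv['total_runs']} experiments")
print(f"Resolution V:  {res_v['notation']}, {res_v['total_runs']} experiments")
```

Under these assumptions the 32-run design covers 1/1024 of the full two-level factorial and the 256-run design covers 1/128, which is the scale of reduction that note 32 relies on.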
