
New Concepts at the Interface: Novel Viewpoints and Interpretations, Theory and Computations

A general method for the identification of crystal faces using Raman spectroscopy combined with machine learning and application to the epitaxial growth of acetaminophen Tharanga K. Wijethunga, Jelena Stojakovic, Michael A. Bellucci, Xingyu Chen, Allan S. Myerson, and Bernhardt L Trout Langmuir, Just Accepted Manuscript • DOI: 10.1021/acs.langmuir.8b01791 • Publication Date (Web): 27 Jul 2018 Downloaded from http://pubs.acs.org on July 31, 2018


is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 36 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Langmuir

A general method for the identification of crystal faces using Raman spectroscopy combined with machine learning and application to the epitaxial growth of acetaminophen

Tharanga K. Wijethunga,# Jelena Stojaković,# Michael A. Bellucci,# Xingyu Chen, Allan S. Myerson and Bernhardt L. Trout*

Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States.

ABSTRACT

Crystal morphology is one of the key crystallographic characteristics that govern the macroscopic properties of crystalline materials. The identification of crystal faces, or face indexing, is an important technique for obtaining information about a crystal’s morphology. However, face indexing is mainly limited to single crystal X-ray diffraction (SCXRD), which is often not applicable to products of routine crystallizations since it requires high quality single crystals in a narrow size range. To overcome the limitations of the SCXRD method, we have developed a robust and convenient Raman face indexing method based on work by Moriyama et al. This method exploits small but detectable differences in Raman spectra of crystal faces caused by different orientations
of the crystallographic axes relative to the direction and polarization of the excitation laser beam. The method requires a Raman spectral library for each compound, which must be built and validated by SCXRD face indexing. Once the spectral library is available for a compound, the identity of unknown crystal faces (from any crystal that is larger than the laser beam) can be inferred by collecting Raman spectra and comparing them to the spectra within the library. We have optimized this approach further by developing a machine-learning algorithm that identifies crystal faces by performing a statistical comparison of the spectra in the Raman library and the Raman spectra of the unknown crystal faces. Here, we report the development of the Raman face indexing method and apply it to three different epitaxial systems: acetaminophen (APAP) grown as an overlayer crystal on D-mannitol (MAN), D-galactose (GAL), and xylitol (XYL) substrates. For each of these epitaxial systems, the crystals were grown under various experimental conditions and had a wide range of sizes and quality. Using the Raman face indexing method, we were able to perform high-throughput indexing of a large number of crystals from different crystallization conditions, which could not be achieved using SCXRD or other analytical techniques.

INTRODUCTION

The identity and surface area ratio of crystal faces, i.e. crystal morphology, is one of the key parameters that needs to be controlled in crystallization since it governs the properties of the bulk material, such as solubility, dissolution profile, flow and compaction.1–4 Crystal faces are composed of ordered two-dimensional arrays of molecules or atoms, in which specific functional groups are exposed to the surrounding environment. These exposed functional groups directly affect the chemical and physical properties of the crystalline material. For example, materials of the same chemical composition can have different wettability if a hydrophilic or hydrophobic
crystal face is the dominant one.5 Most industrial crystallizations attempt to control crystal morphology by optimizing supersaturation, temperature, additives, solvents, etc.6–8 However, the understanding and control of crystal morphology is still limited, partially due to difficulties with identifying crystal faces for most crystallizations. The analytical method used to unequivocally determine the crystal morphology is face indexing by single crystal X-ray diffraction (SCXRD). However, there are serious limitations to this method. To begin with, the technique requires single crystals of very high quality and limited size distribution (ideally 0.3–0.5 mm). Crystallization of such crystals is often hard to achieve, especially under industrial crystallization conditions, because the size distribution suitable for SCXRD often does not fit the targeted product specifications. The SCXRD method also relies on visual analysis and indexing of crystal faces, which is impractical or even impossible when it comes to complex crystalline systems such as the broken or agglomerated crystals commonly observed during production under realistic conditions. Apart from SCXRD, there are few alternatives for crystal face indexing. Computer programs developed for morphology prediction based on single crystal structures can be used to index the crystal faces if the morphology is calculated accurately.9–11 Powder X-ray diffraction (PXRD),12,13 optical goniometry14,15 and scanning electron microscopy (SEM)13,16 have been used for face indexing of single crystal systems, but none of these methods can be extended to more complex multi-crystalline systems, and all are complicated and have technical limitations. For example, in the PXRD method, Rietveld refinement,17,18 Le Bail fit,19,20 or a comparison of angles and faces with a known isotypic compound21 is required. On the other hand, optical goniometry requires one to obtain interfacial angles from crystals14,15 and the SEM method requires extracting angular measurements from photographs.13,16,22 Analysis of crystal morphology is also possible using Raman spectroscopy.23–27 Moriyama et al. have used Raman microscopy to
determine the crystal habit of the organic crystal phenytoin.25 However, to the best of our knowledge, in all reported examples, Raman spectroscopy was only used to characterize well-grown single crystals, and the possibility of using Raman spectroscopy as an alternative method for face indexing in complex multi-crystalline systems has not been investigated thus far. Raman scattering can be used in studies of crystals because the Raman polarizability tensor is dependent on the crystal symmetry, the orientation of the sample relative to the laser beam direction, and the polarization of the incident and collected light.28 As a result, depending on which crystal face is used to collect the spectra, there is a change in the intensity of some peaks in the Raman spectra.29 In principle, different crystal faces can be distinguished with Raman spectroscopy by detecting changes in peak intensities. Compared to previously discussed methods like SCXRD, PXRD and SEM face indexing, Raman spectroscopy is extremely fast and simple to use and allows one to analyze a number of samples in minutes. More importantly, there is practically no limitation on the size of crystals that can be analyzed. Crystals of quite small sizes, i.e. submicron and perhaps as small as 200 nm, can also be analyzed, since the method is only limited by the resolution of the Raman laser beam. Furthermore, crystals of various qualities (i.e. broken, agglomerated or conjoint) can be analyzed with Raman spectroscopy. Clear spectral signals obtained with Raman spectroscopy can also be applied to differentiate the chemical and physical properties, and high-speed/high-resolution chemical imaging can be performed with Raman microscopy. Given these facts, the Raman spectroscopic method can be used both in simple systems such as single crystals and in complicated crystalline systems such as epitaxial crystals.

Epitaxial growth is one of the techniques used to control the crystal morphology, orientation, and other properties (e.g. nucleation, crystal growth, and polymorphism) of crystalline
materials.30,31 The technique relies on crystalline heterosurfaces, commonly known as substrates, to direct and control nucleation and growth of the newly forming crystal, known as the overlayer.32 However, the mechanism of heterogeneous nucleation via epitaxial growth is not fully understood and this restricts the direct use and rational selection of crystalline substrates. Currently, mechanisms reported in the literature suggest that the nucleation process and molecular ordering at the overlayer/substrate interface are controlled by either the lattice match between the two contacting faces33–35 or the intermolecular interactions between these faces.36,37 In order to understand the possible mechanism and to rationally select suitable substrates for epitaxial growth, it is necessary to know which faces of the substrate and overlayer are in contact. Currently, SCXRD is the only available method to identify the faces of epitaxial crystals. Apart from the abovementioned inherent issues with SCXRD, there are additional complications when face indexing epitaxial systems with SCXRD. To name a few, two unique crystal cells (substrate and overlayer) have to be centered, two unique data sets have to be collected and processed, and the visual analysis of crystal faces is complicated because these faces are partially obscured where the crystals are joined by at least one face. In this report, we present the development of a fast, simple and effective method based on Raman spectroscopy to identify the interacting crystal faces of each compound in epitaxial crystal systems. This general method can be applied to any crystalline system, ranging from simple and isolated single crystals to challenging systems such as broken, agglomerated or epitaxial crystals of practically any size and quality. The method requires a library of Raman spectra for each of the possible crystal faces and for each compound of interest. For epitaxial systems, these include both the substrate and the overlayer crystals. We built libraries using SCXRD face indexing and Raman spectroscopy of isolated single crystals. Once a library is built for a given polymorph, Raman face
indexing can be applied to identify any unknown face of the crystal systems involving these compounds by first collecting Raman spectra of the crystal face of interest and then finding the matching spectra from the library. In addition to the manual method for face identification, i.e. comparing the Raman spectra of unknown crystal faces to known spectra within the Raman library, we have developed an algorithm that automates this process. The manual method of comparison is sufficient for face identification but can be tedious to apply and has much lower throughput than an automated method. Thus, using an algorithm to compare unknown spectra to the spectra in the Raman library through a statistical analysis is ideal for obtaining an objective comparison. In doing so, one can limit any false identification of crystal faces and demonstrate that with a simple statistical analysis, face indexing with Raman spectroscopy can be made objective and accurate. At its core, the method we propose for face indexing with a Raman spectral library is an example of a classification problem, which can easily be addressed with machine learning algorithms. In fact, a number of machine learning algorithms, such as decision trees38, Naïve Bayes39, artificial neural networks40, and support vector machines41 have been applied for spectral analysis. However, the most commonly applied algorithms for spectral analysis are the Soft Independent Modeling of Class Analogy42–44 (SIMCA) and Partial Least Squares methods.45–47 While these methods have been applied for a range of spectral analysis problems42–47, they often require larger datasets and a significant level of expertise to be effective. Furthermore, it is unclear how they would perform for face indexing since the Raman spectra for each crystal face differ only in the intensities of some of the peaks, and not in the locations of these peaks. In particular, since the SIMCA method allows for unknown data to be classified as belonging to multiple classes, the algorithm may identify an unknown crystal face as multiple crystal faces depending on whether it can successfully
resolve subtle differences in the peak intensities in the Raman spectral library. For these reasons, we have decided to develop our own spectral analysis algorithm that is both simple and effective. The epitaxial systems used in our study had acetaminophen (APAP) as the overlayer and were grown using β-D-mannitol (MAN), D-galactose (GAL) and xylitol (XYL) as the substrates (see Figure 1). These systems were selected because they form crystals under various conditions and are pharmaceutically important compounds. In a previous study,37 we showed that APAP preferentially grows epitaxially on some of the faces of these substrates; however, the identification of those faces for crystals obtained under actual conditions was not possible due to the limitations of the SCXRD method. Thus, these systems were suitable candidates for the exploration of the Raman face-indexing method described in the present study. To build the library, we collected Raman spectra on the six unique faces of APAP and four unique faces of each substrate (MAN, GAL and XYL) on single crystals of each compound and then used the same crystals for SCXRD face indexing. The library was validated on an additional set of crystals. Finally, we applied Raman face indexing to epitaxial crystals obtained under different crystallization conditions, including those that do not produce crystals suitable for SCXRD.

Figure 1. Chemical structures of a) Acetaminophen (APAP) and the substrates b) β-D-mannitol (MAN) c) D-galactose (GAL) and d) xylitol (XYL)

MATERIALS AND METHODS

Materials

APAP, MAN, and XYL were purchased from Sigma-Aldrich (St. Louis, MO). GAL was purchased from Acros Organics (Geel, Belgium). Ethanol (absolute, 200 proof) was purchased from VWR International (Edison, NJ). One-milliliter shell vials with clear caps were obtained from Cole Parmer Instrument Co. (Vernon Hills, IL).

Growing single crystals of APAP and substrates

Thin pad-shaped single crystals of GAL were obtained by slowly cooling a 70% ethanol solution of GAL. Long needle-shaped MAN crystals were obtained by slowly evaporating an aqueous solution over a week.32 XYL crystals were obtained from a saturated solution of XYL in 95% ethanol by slow evaporation at room temperature.48 Three morphologies of APAP form I single crystals were obtained with three different supersaturations in water.49 Three solutions of APAP in water (5% w/w, 15% w/w and 30% w/w) were prepared. All three solutions were heated to about 60 °C on a heating plate to ensure the complete dissolution of all APAP. Once the APAP was completely dissolved, the heating was stopped and the vials were kept on the heating plate to cool down to room temperature. These vials were kept undisturbed for a few days until crystals of the desired morphologies had formed. The Supporting Information (SI S1) shows the microscopic images of APAP crystals obtained under each supersaturation. Once single crystals of APAP and the substrates were obtained, good quality and good-sized single crystals were picked by observing under a Nikon Eclipse ME600 optical microscope equipped with a polarizer and indexed with SCXRD, and then the same crystals were used to build the library of Raman spectra.

Growing epitaxial crystals for face indexing

To grow epitaxial crystals of APAP on selected substrates, some of the good quality single crystals of the substrates were selected and placed in 20 mL scintillation vials. Then, a saturated APAP solution in ethanol (3 mL) at 30 °C was carefully pipetted into these vials, and single crystals of APAP were grown onto the substrate single crystals by allowing the solutions to cool slowly to room temperature. All crystallizations were performed without stirring. Once epitaxial APAP crystals were observed to have formed in the vials, the solution was filtered, and the crystal pairs were recovered. These crystals were then analyzed under an optical microscope to identify the presence of APAP crystals bound with substrate crystals. Among the crystal pairs obtained, good quality (both crystals in the pair had well-defined faces and were attached to each other by a single face only) and good-sized (both crystals in a pair were > 0.5 mm in size to occupy the entire X-ray beam) pairs were analyzed with SCXRD face indexing and subsequently used to validate the Raman spectral library. The crystal pairs that were not of sufficient quality to be analyzed with SCXRD were saved as target crystals for Raman analysis.

SCXRD face indexing

For single crystals of APAP and substrates, at least three individual crystals were analyzed with SCXRD. For the epitaxial crystal systems, at least one pair of crystals with each substrate was analyzed with SCXRD. The unit cell dimensions were determined for single crystals and each counterpart of an epitaxial crystal pair, and the Miller indices of the crystal faces were identified. For each crystal or crystal pair, low-temperature diffraction data were collected on a Bruker AXS X8 Kappa diffractometer coupled to a Bruker APEX2 CCD detector using Mo Kα radiation (λ = 0.71073 Å) from an IμS microsource. Omega-scans were performed by focusing the X-ray beam on the single crystals, and for the epitaxial crystals the X-ray beam was focused on one
component at a time and data for two different unit cells were collected. The orientation matrices and unit cell parameters for each component were determined with the program cell_now (Bruker AXS, Inc.), and the crystal faces were determined using the face-indexing plug-in of APEX2 (Bruker AXS, Inc.).

Obtaining epitaxial crystal pairs with induction time experiments

Induction experiments of APAP in the presence of selected substrates were performed as we previously reported37 to obtain epitaxial crystals that are produced under real experimental conditions. For this, the induction time experiments were conducted with a targeted supersaturation of 1.7 at 15 °C in the presence of 2 mg (±0.1 mg) of substrate crystals. The substrate crystals were used as they were received and sieved through two sieves of 125 μm and 250 μm to control the crystal size distribution. Usually, the epitaxial crystals obtained under these conditions are not suitable for analysis with SCXRD, as there are commonly observed problems such as multiple APAP crystals forming on the same substrate crystal or APAP crystals growing much larger than the substrate crystals. Furthermore, the quality of the substrate crystals used in the induction experiments is not carefully controlled. Raman indexing seems to be a robust method to determine the epitaxial faces of these crystals obtained under induction time experimental conditions. For these experiments, once the formation of large enough APAP crystals on the substrates was observed, these crystals were carefully picked up, dried, and saved for analysis with Raman indexing.

Collection of Raman spectra and manual identification of unknown faces

A Kaiser Raman microscope equipped with a 785 nm excitation laser and a 20× microscope objective was used with an exposure time of 1 s for each measurement. The microscope was used in the general purpose setting where the excitation laser was not polarized to a certain direction.
The Raman spectroscope was manually focused to locate the surface of the crystal, and spectra were collected. In order to build the library of Raman spectra, the single crystals of each compound were glued onto a glass slide to keep the desired face of the crystal orthogonal to the direction of the incident beam. At least three different Raman spectra were collected on one face of a crystal by focusing the incident beam on different locations on the same face. Once the data collection for one of the faces was done, the crystal was carefully rotated and glued to the glass slide so that another face was exposed to the incident beam. Likewise, Raman spectra of all the possible faces of a single crystal were collected, and at least three single crystals were analyzed this way for each compound. All these collected spectra were then analyzed to identify the reliable differences and to find the unique intensity pattern of Raman spectra for each face family of a given compound. The analyzed spectra were then compiled as a library. To validate the compiled Raman library, epitaxial crystal pairs that were indexed with SCXRD were used. For this, the incident beam was focused onto the attached face of each compound and at least three Raman spectra were collected. These collected spectra were first compared with the compiled Raman spectral library to identify the face. Secondly, the deduced face was confirmed by comparing with SCXRD results. Finally, for the application of the developed Raman indexing method, two different crystal systems were used: epitaxial crystal pairs that were grown for the purpose of SCXRD face indexing but were not suitable for SCXRD, and the epitaxial crystal pairs obtained under induction time experiments. For these crystals, at least three Raman spectra were collected by focusing the incident beam onto the attached faces of each crystal pair. In most cases with the epitaxial crystals, the two crystals had to be separated to collect Raman spectra of the APAP, as the interacting face of APAP was fully attached to the respective substrate face. Once the Raman spectra were
collected for unknown faces, the faces were identified manually by comparing with the Raman library. For this, first, the three collected Raman spectra were compared to each other to make sure that they represented the same face. Next, one of these spectra was carefully analyzed in the region of 400–1200 cm⁻¹ and compared to the Raman library of the respective compound to deduce the best matching intensity pattern and the respective face family. Please see the Supporting Information (SI S2) for an explanation of the terms used for the different types of crystals in this study.

Development of the algorithm

In developing the algorithm, the first and most obvious objective is that the algorithm must be able to distinguish between the Raman spectra associated with each of the different crystal faces for a given compound under consideration. To this end, we first removed the baseline in each of the Raman spectra, 𝑓(𝜔), and normalized them such that $\sum_i f(\omega_i) = 1$. In essence, we convert each Raman spectrum into a probability distribution, and we denote these distributions as 𝑝(𝜔). A natural way of comparing probability distributions is provided by information theory and amounts to computing the relative entropy, also known as the Kullback-Leibler divergence50 (KLD), from the probability distribution 𝑞 to the distribution 𝑝:

$$KL(p\,\|\,q) = \int p(\omega)\,\ln\!\left(\frac{p(\omega)}{q(\omega)}\right) d\omega. \tag{1}$$
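As a minimal sketch of the normalization and of Eq. (1), the integral can be replaced by a sum over the sampled wavenumbers. Several details below are assumptions made for illustration and are not taken from the text: the baseline is removed by simply subtracting the minimum intensity, a small offset `eps` keeps every bin positive, both spectra share one wavenumber grid, and the function names `to_distribution` and `kl_divergence` as well as the synthetic peaks are hypothetical.

```python
import numpy as np


def to_distribution(intensity, eps=1e-12):
    """Turn a Raman spectrum into a discrete probability distribution p(w_i)."""
    y = np.asarray(intensity, dtype=float)
    # Stand-in baseline removal (assumption): subtract the minimum intensity.
    y = y - y.min()
    # A tiny offset keeps every bin positive so the logarithm stays finite.
    y = y + eps
    return y / y.sum()


def kl_divergence(p, q):
    """Discrete form of Eq. (1): KL(p || q) = sum_i p_i * ln(p_i / q_i)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))


# Two synthetic spectra on a shared grid: same peak positions, different heights.
w = np.linspace(400.0, 1200.0, 801)
face_a = 1.0 * np.exp(-((w - 797.0) / 6.0) ** 2) + 0.5 * np.exp(-((w - 859.0) / 6.0) ** 2)
face_b = 0.5 * np.exp(-((w - 797.0) / 6.0) ** 2) + 1.0 * np.exp(-((w - 859.0) / 6.0) ** 2)

p = to_distribution(face_a)
q = to_distribution(face_b)
print(kl_divergence(p, p))  # zero, because the two distributions are identical
print(kl_divergence(p, q))  # positive, because the relative peak heights differ
```

The two synthetic spectra differ only in relative peak heights, which is the situation described for different faces of one compound.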

The KLD has the useful properties that it is always non-negative and is zero if and only if 𝑝(𝜔) = 𝑞(𝜔) for all values of 𝜔. In fact, it is often referred to as the distance between two probability distributions, even though, being non-symmetric, it is not strictly a distance metric. Consequently, by using the KLD we can effectively compute the distance between an unknown Raman spectrum and each of the spectra in the Raman library associated with the crystal faces, and in turn, the KLD
should be a minimum for the spectrum in the Raman library that best matches the unknown spectrum. Using the KLD in this manner is an effective way of classifying an unknown spectrum, but an algorithm that uses this information alone may still be prone to false identification of crystal faces because of noise in the signal arising from instrumentation, external interference, or inaccuracies in the recording process. To account for this intrinsic variation in the spectra for each of the crystal faces, we collect a number of Raman spectra for each indexed crystal face. Given a set of 𝑁 Raman spectra for a crystal face in the library, we construct a model of the ideal spectrum for this face by averaging the 𝑁 spectra. Subsequently, we compare each of the 𝑁 spectra to this ideal spectrum using the KLD.
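The construction of the ideal spectrum can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the spectra are taken to be baseline-corrected and strictly positive, the direction KL(sample || ideal) is assumed because the text does not specify it, and `build_face_model` and the synthetic data are hypothetical.

```python
import numpy as np


def normalize(intensity):
    """Scale a baseline-corrected spectrum so that its bins sum to 1."""
    y = np.asarray(intensity, dtype=float)
    return y / y.sum()


def kl_divergence(p, q):
    """Discrete KL(p || q) = sum_i p_i * ln(p_i / q_i)."""
    return float(np.sum(p * np.log(p / q)))


def build_face_model(spectra):
    """Return the ideal (mean) distribution and KL_1..KL_N for one crystal face.

    `spectra` has shape (N, n_points) and holds baseline-corrected, strictly
    positive intensities recorded on the same wavenumber grid.
    """
    dists = np.array([normalize(s) for s in spectra])
    ideal = dists.mean(axis=0)  # the mean of normalized spectra still sums to 1
    # Direction KL(sample || ideal) is an assumption; the text does not state it.
    kls = np.array([kl_divergence(d, ideal) for d in dists])
    return ideal, kls


# Hypothetical library entry: N = 5 noisy recordings of one face.
rng = np.random.default_rng(0)
w = np.linspace(400.0, 1200.0, 401)
clean = np.exp(-((w - 797.0) / 8.0) ** 2) + 0.6 * np.exp(-((w - 859.0) / 8.0) ** 2) + 0.05
spectra = np.array([clean + rng.normal(0.0, 0.002, w.size) for _ in range(5)])

ideal, kls = build_face_model(spectra)
print(kls.mean(), kls.std(ddof=1))  # spread of the library KLD values for this face
```

The resulting set of KLD values is what characterizes the intrinsic variation of one face in the library.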

Since each of the 𝑁 spectra are independent samples, the set $\{KL_1, KL_2, \ldots, KL_N\}$ should be Gaussian distributed with confidence interval

$$\langle KL \rangle \pm z \left( \frac{\sigma}{\sqrt{N}} \right), \tag{2}$$

where 〈∙〉 represents an average, 𝜎 is the standard deviation, and 𝑧 is a scale factor related to the confidence level. Conceptually, the width of this confidence interval represents the intrinsic variation one would expect at a specified confidence level when one compares a sample Raman spectrum associated with the given crystal face to its ideal spectrum using the KLD. Moreover, this confidence interval can be used in statistical hypothesis testing to determine if an unknown Raman spectrum is significantly different from the Raman spectra in the library for each of the faces. In this case, failing to reject the null hypothesis for a given crystal face in the Raman library would correspond to identifying the unknown face. Therefore, our algorithm works as follows: given an unknown Raman spectrum (or possibly a set of 𝑀 unknown Raman spectra sampled from the same unknown crystal face), we compute the KLD between the spectrum and each of the ideal model spectra associated with each crystal face in the Raman library. We then perform a hypothesis
test, and if we fail to reject the null hypothesis for a given crystal face, then the unknown crystal face is identified as that face.

RESULTS AND DISCUSSION

Building and verification of the library of Raman spectra

To build the library of Raman spectra, at least three single crystals of each compound were indexed using SCXRD (see Figure 2 and SI S3). For all SCXRD indexed crystals, unit cell dimensions were in good agreement with the literature reported structures.48,51–53 The Raman spectra were collected for all major crystal faces, which were identified by analyzing all the observed faces from SCXRD (SI S4). We first collected the Raman spectra from at least three different locations on the same crystal face (Figure 3a) and then on the symmetry related crystal face (Figure 3b). For example, crystal faces (001) and (00-1) of APAP are symmetry related and equivalent, and the Raman spectra of these two faces have to be identical. To verify that the indexing is correct, we then collected the Raman spectra on a few other crystals (Figure 3b). Repeating this process for all unique faces allowed us to build a library with high confidence in the accuracy of face indexing, which is often challenging and prone to errors.
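Once such a library exists, the identification step described under "Development of the algorithm" could look roughly like the sketch below. The text does not spell out the hypothesis test, so the acceptance rule used here (the KLD of the unknown spectrum to a face's ideal spectrum must not exceed the upper bound of that face's confidence interval) is an assumption, as are `z = 1.96` (a 95% level), the sample standard deviation, the function names, and the toy numbers.

```python
import numpy as np


def kl_divergence(p, q):
    """Discrete KL(p || q) = sum_i p_i * ln(p_i / q_i)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))


def confidence_interval(kls, z=1.96):
    """Interval <KL> +/- z * sigma / sqrt(N) for one face's library KLD values."""
    kls = np.asarray(kls, dtype=float)
    half_width = z * kls.std(ddof=1) / np.sqrt(kls.size)
    return kls.mean() - half_width, kls.mean() + half_width


def identify_face(unknown, library, z=1.96):
    """Return (face, KLD) pairs that are not rejected, best match first.

    `library` maps a face label to (ideal distribution, library KLD values).
    Acceptance rule (assumption): KLD to the ideal spectrum <= upper CI bound.
    """
    accepted = []
    for face, (ideal, kls) in library.items():
        _, upper = confidence_interval(kls, z)
        d = kl_divergence(unknown, ideal)
        if d <= upper:
            accepted.append((face, d))
    return sorted(accepted, key=lambda item: item[1])


# Toy three-bin "spectra" with made-up library statistics, for illustration only.
library = {
    "(001)": (np.array([0.5, 0.3, 0.2]), np.array([0.001, 0.002, 0.003])),
    "(101)": (np.array([0.2, 0.3, 0.5]), np.array([0.001, 0.002, 0.003])),
}
unknown = np.array([0.5, 0.3, 0.2])
print(identify_face(unknown, library))
```

When 𝑀 unknown spectra from the same face are available, one option would be to average their KLD values before applying the same rule; an empty result would mean that no library face was accepted.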

Figure 2. Miller indices of crystal faces determined by SCXRD for (a) APAP crystal and (b) GAL crystal.


Figure 3. Raman spectral analysis for APAP face (001): a) three different locations on the same face of the same crystal and b) comparison of the face in (a) (blue) with the equivalent (00-1) face of the same crystal and the (001) face of a different APAP crystal.

Since the intensity of peaks in the Raman spectra depends on the orientation of the crystal, we observed changes in the Raman spectra as we exposed each crystal face to the laser beam. We found that the region between 400 and 1200 cm⁻¹ features the most prominent changes in the peak intensity or the peak pattern. The intensities were analyzed comparatively by overlaying the spectra and comparing the peak heights relative to each other. For example, six different face families were differentiated for APAP (Figure 4): in the region between 400 and 1200 cm⁻¹, the six indexed faces show six different intensity patterns. The (001) and (101) faces both have only two clearly visible peaks in the region of 600-650 cm⁻¹. However, the peak at 859 cm⁻¹ is more intense than the peak at 797 cm⁻¹ for the (001) face, while this order is reversed for the (101) face. Likewise, a thorough analysis of the 400-1200 cm⁻¹ region provided unique identifiers that can be used to identify a given face family of APAP. A similar analysis was conducted for MAN, GAL, and XYL (Figure 5 and SI S5). Table 1 summarizes the crystal faces identified using Raman spectroscopy for each compound, and Table 2 summarizes the unique identifiers for the differentiated face families of each compound.
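The manual comparison of relative peak heights can be illustrated with a short sketch. It is not the procedure used in this work: the function names, the ±5 cm⁻¹ search window, the 5% tolerance for calling two peaks equal, and the synthetic Gaussian spectra are all assumptions chosen for illustration. Only the peak positions (797 and 859 cm⁻¹) and their reported ordering for the (001) and (101) faces come from the text.

```python
import numpy as np


def peak_height(wavenumbers, intensities, center, half_width=5.0):
    """Maximum intensity within +/- half_width of a nominal peak position."""
    mask = np.abs(wavenumbers - center) <= half_width
    if not mask.any():
        raise ValueError(f"no data points near {center} cm^-1")
    return float(intensities[mask].max())


def compare_peaks(wavenumbers, intensities, peak_a, peak_b, tol=0.05):
    """Return the position of the more intense peak, or 'equal' within tol."""
    a = peak_height(wavenumbers, intensities, peak_a)
    b = peak_height(wavenumbers, intensities, peak_b)
    if abs(a - b) <= tol * max(a, b):
        return "equal"
    return peak_a if a > b else peak_b


def gaussian(x, center, height, width=4.0):
    """Synthetic peak used only to build the toy spectra below."""
    return height * np.exp(-0.5 * ((x - center) / width) ** 2)


x = np.arange(400.0, 1201.0, 1.0)  # 400-1200 cm^-1 in 1 cm^-1 steps
face_001_like = gaussian(x, 797, 1.0) + gaussian(x, 859, 2.0)
face_101_like = gaussian(x, 797, 2.0) + gaussian(x, 859, 1.0)

print(compare_peaks(x, face_001_like, 797, 859))  # 859 is more intense
print(compare_peaks(x, face_101_like, 797, 859))  # 797 is more intense
```

Real spectra would need baseline correction before such a comparison, which this sketch omits.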

Figure 4. Comparison of Raman spectra in the region of 400-1200 cm⁻¹ for each distinguished face of APAP.


Table 1. List of crystal face families identified using Raman spectroscopy for each type of crystal.

| Compound | Number of face families identified | Raman spectra collected on faces |
| --- | --- | --- |
| APAP | 6 | {001}, {011}, {101}, {-101}, {110}, {-111} |
| MAN | 4 | {010}, {110}, {100}, {00-1} |
| GAL | 4 | {0-10}, {001}, {101}, {011} |
| XYL | 4 | {0-10}, {01-1}, {-100}, {001} |

Table 2. Unique distinguishable identifiers for each face family of each compound: selected peaks in the range of 400-1200 cm⁻¹ and their distinguishable differences. When comparing two peaks, the peak height relative to the other peak is reported.

APAP

| Face family | 465 cm⁻¹ and 504 cm⁻¹ | 604 cm⁻¹ | 628 cm⁻¹ and 652 cm⁻¹ | 797 cm⁻¹ and 859 cm⁻¹ |
| --- | --- | --- | --- | --- |
| {001} | 465 is more intense | Clearly visible | 628 absent | 859 is more intense |
| {011} | Equal in intensity | Slightly visible | 652 is more intense | 859 is more intense |
| {101} | 465 is more intense | Clearly visible | 628 absent | 797 is more intense |
| {-101} | 504 is more intense | Clearly visible | 628 is more intense | 859 is more intense |
| {110} | 504 is more intense | Clearly visible | 652 is more intense | 859 is more intense |
| {-111} | 504 is more intense | Clearly visible | Equal in intensity | 859 is more intense |

MAN

| Face family | 493 cm⁻¹ and 519 cm⁻¹ | 649 cm⁻¹ and 788 cm⁻¹ | 876 cm⁻¹ and 1133 cm⁻¹ | 1080 cm⁻¹ |
| --- | --- | --- | --- | --- |
| {010} | Both peaks are hardly visible and 493 is more intense | 788 is more intense | 876 is more intense | Split into two (1088 and 1079) |
| {110} | 493 is slightly more intense | 649 is equal or more intense | 876 is more intense | Single peak |
| {100} | 519 is more intense | 649 is more intense | 876 is more intense | Has a shoulder to the left |
| {00-1} | 493 is more intense | 649 is more intense | 1133 is equal or more intense | Single peak |

GAL

| Face family | 660 cm⁻¹ and 704 cm⁻¹ | 830 cm⁻¹ and 890 cm⁻¹ | 956 cm⁻¹ and 970 cm⁻¹ | 1030-1090 cm⁻¹ region |
| --- | --- | --- | --- | --- |
| {0-10} | Equal in intensity | 830 is more intense | 956 is more intense | Comprises two major peaks |
| {001} | 704 is more intense | 890 is slightly more intense | 970 is more intense | Comprises a complex peak pattern |
| {101} | 660 is more intense | 890 is slightly more intense | 970 is more intense | Comprises a complex peak pattern |
| {011} | 660 is more intense | 830 is more intense | 956 is more intense | Comprises two major peaks |

XYL

| Face family | 429 cm⁻¹ | 855 cm⁻¹ and 886 cm⁻¹ | 1061 cm⁻¹ and 1073 cm⁻¹ | 1122 cm⁻¹ |
| --- | --- | --- | --- | --- |
| {0-10} | Shoulder to the right | 855 is more intense | 1061 is more intense | Clearly visible |
| {01-1} | Single symmetric peak | 886 is more intense | 1061 is more intense | Clearly visible |
| {-100} | Shoulder to the right | 855 is more intense | Equal in intensity | Appears as a shoulder to 1109 |
| {001} | Split into two | 855 is more intense | Equal in intensity | 1122 absent |


Figure 5. Comparison of Raman spectra in the region of 400-1200 cm⁻¹ for each distinguished face of a) MAN, b) GAL, and c) XYL.

Validation of the Raman spectra library

To demonstrate that the library of Raman spectra can be applied to challenging epitaxial crystal systems, we first used Raman face indexing on high-quality epitaxial pairs that had been indexed using SCXRD. For example, in the case of the APAP-XYL system, Raman face indexing reveals that APAP is attached to the {0-10} face of XYL by the {001} face of APAP (Figure 6). This is in agreement with the SCXRD data collected for this work as well as with previous reports on APAP epitaxy.16,21 The same analysis was performed with MAN and GAL as substrates, where APAP was attached to the {010} face of MAN and the {001} face of GAL by a {001} face of APAP (SI S6). In all cases, Raman face indexing matched the SCXRD results.

In addition to manual Raman indexing, we used the developed algorithm to analyze these faces. The algorithm calculated the ideal spectrum for each crystal face in the Raman library as an average of ten independently sampled spectra. The KLD was computed in the region of 400-1200 cm⁻¹, and the statistical hypothesis testing was performed at the 99% confidence level. The comparison of manual and automated Raman face indexing, together with the residual sum of squares error (RSSE) between the ideal spectra and the test spectra under consideration, can be found in Table 3. There is a 100% match between automated and manual Raman face indexing, as well as with SCXRD face indexing.
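The automated procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: spectra are normalized to discrete probability distributions before the KLD is computed, the acceptance rule is taken to be an upper bound ⟨KLD⟩ + 𝑧𝜎 estimated from the library spectra of each face, 𝑧 = 2.576 is used for the 99% level, and the function names and synthetic two-peak spectra are hypothetical.

```python
import numpy as np

Z_99 = 2.576  # two-sided standard-normal quantile for a 99% confidence level


def to_distribution(intensities, eps=1e-12):
    """Clip negative values and normalize a spectrum so that it sums to 1."""
    p = np.clip(np.asarray(intensities, dtype=float), 0.0, None) + eps
    return p / p.sum()


def kld(p, q):
    """Kullback-Leibler divergence D(p || q) between two discrete distributions."""
    return float(np.sum(p * np.log(p / q)))


def build_face_model(library_spectra):
    """Ideal spectrum (mean of the library samples) and their KLD statistics."""
    dists = np.array([to_distribution(s) for s in library_spectra])
    ideal = dists.mean(axis=0)
    klds = np.array([kld(d, ideal) for d in dists])
    return {"ideal": ideal, "mean": klds.mean(), "std": klds.std(ddof=1)}


def identify_face(unknown_spectra, models, z=Z_99):
    """Average the M unknown spectra and return every face that is not rejected."""
    p = to_distribution(np.mean(np.asarray(unknown_spectra, dtype=float), axis=0))
    matches = {}
    for face, model in models.items():
        d = kld(p, model["ideal"])
        if d <= model["mean"] + z * model["std"]:  # assumed acceptance rule
            matches[face] = d
    return matches


# Synthetic demonstration: two faces that differ in the 797/859 cm^-1 ordering.
rng = np.random.default_rng(0)
x = np.arange(400.0, 1201.0)


def synthetic(h797, h859, n):
    """n noisy toy spectra with a flat baseline and two Gaussian peaks."""
    clean = (1.0
             + h797 * np.exp(-0.5 * ((x - 797) / 8) ** 2)
             + h859 * np.exp(-0.5 * ((x - 859) / 8) ** 2))
    return [clean + rng.normal(0.0, 0.02, x.size) for _ in range(n)]


models = {
    "{001}": build_face_model(synthetic(1.0, 3.0, 10)),  # ten spectra per face
    "{101}": build_face_model(synthetic(3.0, 1.0, 10)),
}
print(identify_face(synthetic(1.0, 3.0, 3), models))
```

Returning every face that passes the test, rather than a single best match, lets an ambiguous spectrum surface as more than one candidate face.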


Figure 6. Comparison of the Raman indexing results with the SCXRD results for the APAP-XYL epitaxial crystal pair: a) XYL face identified by SCXRD, b) APAP face identified by SCXRD, c) comparison of the Raman spectra obtained for the APAP-attached XYL face with the {0-10} face from the Raman library, and d) comparison of the Raman spectra obtained for the APAP face with the {001} face from the Raman library.

Table 3. Comparison of identified faces used in the validation step.

| Epitaxial system | Compound | SCXRD indexed face | Raman identified face (manual) | Raman identified face (automated) | RSSE |
| --- | --- | --- | --- | --- | --- |
| APAP-MAN | MAN | (010) | {010} | {010} | 0.000029 |
| APAP-MAN | APAP | (001) | {001} | {001} | 0.000001 |
| APAP-GAL | GAL | (00-1) | {001} | {001} | 0.000028 |
| APAP-GAL | APAP | (00-1) | {001} | {001} | 0.000035 |
| APAP-XYL | XYL | (0-10) | {0-10} | {0-10} | 0.000026 |
| APAP-XYL | APAP | (00-1) | {001} | {001} | 0.000029 |


Application of Raman face indexing method to challenging epitaxial crystals

To probe the scope of the Raman face indexing method, we next turned to more challenging systems: epitaxial crystals that we were not able to index using SCXRD. We analyzed crystals obtained under two conditions: crystals that were grown with the intention of SCXRD analysis but proved unsuitable for it, and crystals that were obtained from the induction time measurement experiments described in the methods section. In the majority of cases, crystals were not suitable for SCXRD analysis because of their size (too large or too small), insufficient quality (twinning or polycrystallinity), or because more than one APAP crystal had grown on top of the substrate. All of these crystals were good candidates for Raman spectroscopy because no other feasible analytical technique could provide information on which crystal faces were in contact. Figure 7 shows pictures of some of these epitaxial crystals.

Figure 7. Some epitaxial crystal pairs analyzed with Raman spectroscopy: a) APAP-MAN, b) APAP-GAL, and c) APAP-XYL.

To identify the face of the substrate, we focused the laser beam on the faces of the substrates that had APAP crystals deposited on them (Figure 8, SI S7) and collected the Raman spectra. By comparing these spectra to our library, we identified the specific crystal face of each substrate (Table 4), namely the {010} face of MAN, the {001} face of GAL, and the {0-10} face of XYL. The same result was obtained manually and with the automated method. The results are also in agreement with the faces previously reported to be involved in the epitaxial growth of APAP.8 The crystal face of APAP was not identified because that would require removing the APAP from the substrate. In theory, it is possible to deduce the crystal face of APAP facing the substrate by indexing adjacent faces, but that was beyond the scope of this work.

Figure 8. Raman analysis of an APAP-MAN crystal pair: a) microscopic image of the crystal pair and b) comparison of the Raman spectra collected on the MAN face with the Raman library.

Table 4. Crystal face identification by Raman spectroscopy.

| Compound | Raman identified face (manual) | Raman identified face (automated) | RSSE |
| --- | --- | --- | --- |
| MAN | {010} | {010} | 0.000021 |
| GAL | {001} | {001} | 0.000028 |
| XYL | {0-10} | {0-10} | 0.000012 |

We next focused on an even more challenging application: high-throughput indexing of crystals obtained under actual induction time experimental conditions. Once again, these crystals are not suitable for face indexing using SCXRD due to poor quality, wrong size, and polycrystallinity (see Figure 9). For this analysis, we collected Raman spectra on 25 crystal pairs of each APAP-substrate combination. For each of these crystal pairs, we collected three separate Raman spectra for the same unknown face of both the substrate and the overlayer (APAP). Initially, we used both the manual and automated methods to analyze the Raman spectra of at least five different crystal pairs for each of the three selected substrates. These results are summarized in Table 5 (see SI S8 for images of some of the analyzed crystal pairs and SI S9 for the complete analysis). When using the automated method, the three spectra for each face were averaged before being compared to the ideal spectra in the Raman library using the KLD. The faces identified by the manual and automated methods agree in 95% of cases. The occasional disagreement likely occurs because the algorithm is more sensitive to changes in peak intensities, including those arising from noise and from spectral variation caused by the quality of the crystals. There were also instances where the algorithm identified two different solutions for one face, as shown in Table 5 for the APAP crystal. In these cases, it is possible that the unknown crystal face is not represented in the Raman library at all, and that its spectrum is roughly equidistant, in terms of the KLD, from the spectra of the identified library faces, leading to a false positive for each of these faces. This behavior can be seen from the similar RSSE values for the identified spectra, which indicate that the unknown spectrum is comparably similar to each of the identified spectra from the Raman library.
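The averaging step and the RSSE comparison can be sketched as follows. This is an illustration only: the function names are hypothetical, the three-point "spectra" are toy arrays rather than measured data, and normalizing each spectrum to unit sum before computing the RSSE is an assumption. Ranking the library faces by RSSE makes it easy to see when two candidates are comparably close, as in the two-solution entries of Table 5.

```python
import numpy as np


def normalise(spectrum):
    """Scale a spectrum so that its intensities sum to 1 (assumed convention)."""
    s = np.asarray(spectrum, dtype=float)
    return s / s.sum()


def rank_by_rsse(replicates, ideal_spectra):
    """Average replicate spectra, then rank library faces by RSSE (smallest first)."""
    mean_spectrum = normalise(np.mean(np.asarray(replicates, dtype=float), axis=0))
    scores = {
        face: float(np.sum((mean_spectrum - normalise(ideal)) ** 2))
        for face, ideal in ideal_spectra.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Toy library of ideal spectra for two hypothetical faces.
library = {
    "face A": [0.1, 0.2, 0.7],
    "face B": [0.3, 0.3, 0.4],
}
# Three replicate spectra collected on the same unknown face.
replicates = [
    [0.10, 0.20, 0.70],
    [0.12, 0.18, 0.70],
    [0.08, 0.22, 0.70],
]

for face, rsse in rank_by_rsse(replicates, library):
    print(f"{face}: RSSE = {rsse:.6f}")
```

When the two smallest RSSE values are of similar magnitude, the assignment would be worth checking manually or against an extended library.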
The epitaxial crystals produced under induction time experimental conditions have more varied crystal morphologies than the single crystals from which the Raman library was constructed (since the latter were grown under more stringent conditions), and therefore the Raman library does not contain all possible faces. Despite these discrepancies, the vast majority of crystals were indexed accurately and with a high degree of confidence, as indicated by the small RSSE values. Considering the accuracy of the automated method and the difficulty of analyzing a large number of samples manually, the automated method was used to classify the respective faces of the substrate and APAP in the remaining crystal pairs. The results of this analysis are given in SI S9.

In previous reports on epitaxial crystal growth,32,37 the lack of suitable analytical methods for identifying crystal morphologies grown under actual crystallization conditions meant that the crystallization process had to be adjusted to obtain single crystals of suitable quality for face indexing by SCXRD, and these crystals were assumed to be representative of crystals obtained under actual crystallization conditions. Although this is a reasonable assumption, there is no certainty that the changes in crystallization conditions do not induce major changes in crystal growth and the resulting morphology. The results of our high-throughput face indexing with Raman spectroscopy demonstrated that this assumption is invalid. When we compared the faces identified in the well-grown single crystal epitaxy systems with those in the epitaxial crystal pairs obtained under induction time experimental conditions, the substrate and APAP faces attached to each other were different. For example, in the APAP-MAN pair, in the crystals grown for the purpose of SCXRD, APAP had always deposited on the {010} face of MAN via the {001} face of APAP. However, in the crystal pairs obtained under induction time experimental conditions, APAP was deposited on the {110}, {010}, {100}, or {00-1} faces of MAN and attached via one of the {011}, {001}, or {-111} faces. When growing crystals suitable for face indexing with SCXRD, the quality and size of the substrate crystals were well controlled, and APAP was grown only on selected, morphologically similar substrate crystals. As a result, the faces of the substrate to which APAP was attached were always the same.
However, under induction time experimental conditions, vials contain hundreds of substrate crystals (see SI S10), and only the size distribution of these substrate crystals was controlled, not other features such as crystal quality or morphology. This leads to the presence of multiple other possible faces on these substrate crystals and to the absence of some of the faces observed in SCXRD-quality crystals. Consequently, attachment of the APAP crystals to the same substrate face in every instance would not be expected. The Raman face indexing method that we have described here provides an effective and robust way to identify individual crystal faces in these types of challenging crystal systems.

Figure 9. Epitaxy crystal pairs obtained under induction time experimental conditions: a) APAP-MAN, b) APAP-GAL, and c) APAP-XYL.

Table 5. Raman identification of attached faces for crystals obtained under induction time experimental conditions. The crystal faces highlighted in red show cases in which the manual identification and automated identification did not agree. For the APAP face, there are cases where two crystal faces are identified by the automated method, indicating that the algorithm could not distinguish between these two faces when classifying the unknown crystal face; the RSSE of the second face is given in parentheses.

| Epitaxy system | Crystal pair | Substrate face (manual) | Substrate face (automated) | Substrate RSSE | APAP face (manual) | APAP face (automated) | APAP RSSE |
| --- | --- | --- | --- | --- | --- | --- | --- |
| APAP-MAN | Pair 1 | {110} | {100} | 0.000024 | {011} | {011} | 0.000023 |
| APAP-MAN | Pair 2 | {010} | {010} | 0.000061 | {011} | {011}, {-111} | 0.000079 (0.000053) |
| APAP-MAN | Pair 3 | {010} | {010} | 0.000045 | {011} | {011} | 0.000039 |
| APAP-MAN | Pair 4 | {110} | {110} | 0.000027 | {001} | {001} | 0.000097 |
| APAP-MAN | Pair 5 | {110} | {110} | 0.000036 | {-111} | {-111} | 0.000033 |
| APAP-GAL | Pair 1 | {001} | {001} | 0.000041 | {-101} | {-101} | 0.000007 |
| APAP-GAL | Pair 2 | {001} | {001} | 0.000021 | {110} | {-111} | 0.000021 |
| APAP-GAL | Pair 3 | {001} | {001} | 0.000072 | {-111} | {-111} | 0.000010 |
| APAP-GAL | Pair 4 | {001} | {001} | 0.000058 | {011} | {011}, {110} | 0.000184 (0.000083) |
| APAP-GAL | Pair 5 | {001} | {001} | 0.000017 | {001} | {001} | 0.000093 |
| APAP-XYL | Pair 1 | {-100} | {-100} | 0.000034 | {011} | {011} | 0.000026 |
| APAP-XYL | Pair 2 | {001} | {001} | 0.000048 | {-101} | {-101} | 0.000014 |
| APAP-XYL | Pair 3 | {001} | {-100} | 0.000022 | {-101} | {-101}, {-111} | 0.000099 (0.000061) |
| APAP-XYL | Pair 4 | {-100} | {-100} | 0.000078 | {011} | {011} | 0.000013 |
| APAP-XYL | Pair 5 | {-100} | {-100} | 0.000095 | {011} | {011} | 0.000023 |

CONCLUSION

We have developed a crystal face identification method based on Raman spectroscopy and demonstrated that the method is applicable to a wide range of crystallizations, from isolated single crystals to challenging multi-crystalline systems, such as conjoined crystals and crystals of unsuitable size and quality for SCXRD. Moreover, the method can be applied to hundreds or even thousands of crystals within a system, something that is not feasible using SCXRD. We demonstrated the applicability of the method on three different epitaxial systems in which APAP was grown on MAN, GAL, and XYL substrates. With this method, we have shown that in epitaxial crystal systems, if crystallization conditions such as substrate crystal morphology and size are well controlled, the overlayer preferentially grows on a selected face of the substrate. In the reported systems, APAP prefers to grow on the {010} face of MAN, the {001} face of GAL, and the {0-10} face of XYL. In each of these instances, APAP was attached to the respective substrate by a {001} face. In addition, we observed that when the crystallization conditions were not controlled, the attached faces varied for both APAP and the substrate. This demonstrates the importance of face indexing on crystals obtained under actual production or experimental conditions in order to evaluate the morphologies relevant to controlling nucleation, growth, and property variations, which is not feasible with SCXRD. Given its applicability and simplicity, we expect this robust and efficient method based on Raman spectroscopy to find many applications where SCXRD face indexing and other methods may not be applicable.

ASSOCIATED CONTENT

The Supporting Information is available free of charge on the ACS Publications website (SCXRD face indexing data, Raman spectral analysis and microscopic images of analyzed crystals).

AUTHOR INFORMATION

Corresponding Author
*E-mail: [email protected]

Author Contributions
# These authors contributed equally to this work.


ACKNOWLEDGMENT

The authors would like to thank Dr. Peter Mueller and Dr. Jonathan Becker (Department of Chemistry, Massachusetts Institute of Technology) for their assistance with the SCXRD work and the Novartis-MIT Center for Continuous Manufacturing for funding.

REFERENCES

(1) Beyer, T.; Day, G. M.; Price, S. L. The Prediction, Morphology, and Mechanical Properties of the Polymorphs of Paracetamol. J. Am. Chem. Soc. 2001, 123, 5086–5094.
(2) Li, J.; Doherty, M. F. Steady State Morphologies of Paracetamol Crystal from Different Solvents. Cryst. Growth Des. 2017, 17, 659–670.
(3) Clydesdale, G.; Roberts, K. J.; Telfer, G. B.; Grant, D. J. W. Modeling the Crystal Morphology of α-Lactose Monohydrate. J. Pharm. Sci. 1997, 86 (1), 135–141.
(4) Zhang, M.; Liang, Z.; Wu, F.; Chen, J.-F.; Xue, C.; Zhao, H. Crystal Engineering of Ibuprofen Compounds: From Molecule to Crystal Structure to Morphology Prediction by Computational Simulation and Experimental Study. J. Cryst. Growth 2017, 467, 47–53.
(5) Pingali, K. C.; Shinbrot, T.; Cuitino, A.; Muzzio, F. J.; Garfunkel, E.; Lifshitz, Y.; Mann, A. B. AFM Study of Hydrophilicity on Acetaminophen Crystals. Int. J. Pharm. 2012, 438 (1–2), 184–190.
(6) Urbelis, J. H.; Swift, J. A. Solvent Effects on the Growth Morphology and Phase Purity of CL-20. Cryst. Growth Des. 2014, 14 (4), 1642–1649.
(7) Dandekar, P.; Kuvadia, Z. B.; Doherty, M. F. Engineering Crystal Morphology. Annu. Rev. Mater. Res. 2013, 43 (1), 359–386.
(8) Bealing, C. R.; Baumgardner, W. J.; Choi, J. J.; Hanrath, T.; Hennig, R. G. Predicting Nanocrystal Shape through Consideration of Surface-Ligand Interactions. ACS Nano 2012, 6 (3), 2118–2127.
(9) Kiang, Y.-H.; Yang, C.-Y.; Staples, R. J.; Jona, J. Crystal Structure, Crystal Morphology, and Surface Properties of an Investigational Drug. Int. J. Pharm. 2009, 368 (1–2), 76–82.
(10) Morris, K. R.; Schlam, R. F.; Cao, W.; Short, M. S. Determination of Average Crystallite Shape by X-Ray Diffraction and Computational Methods. J. Pharm. Sci. 2000, 89 (11), 1432–1442.
(11) Simov, S.; Simova, E.; Davidkov, B.; Mechenov, G. A Geometric Method Incorporated with a Computer Program for Indexing Crystal Faces of Microcrystallites. J. Appl. Crystallogr. 1983, 16 (5), 559–562.
(12) Prigodich, R. V.; Zager, M. Indexing Crystal Faces. Powder Diffr. 1995, 10 (02), 127–128.
(13) Kiang, Y. H.; Shi, H. G.; Mathre, D. J.; Xu, W.; Zhang, D.; Panmai, S. Crystal Structure and Surface Properties of an Investigational Drug - A Case Study. Int. J. Pharm. 2004, 280 (1–2), 17–26.
(14) Sullivan, R. A.; Davey, R. J. Concerning the Crystal Morphologies of the α and β Polymorphs of P-Aminobenzoic Acid. CrystEngComm 2015, 17 (5), 1015–1023.
(15) Deepthy, A.; Bhat, H. L. Growth and Characterization of Ferroelectric Glycine Phosphite Single Crystals. J. Cryst. Growth 2001, 226 (2–3), 287–293.
(16) Strom, C. S. Indexing Crystal Faces on SEM Photographs. J. Appl. Crystallogr. 1976, 9 (4), 291–295.
(17) Rietveld, H. M. A Profile Refinement Method for Nuclear and Magnetic Structures. J. Appl. Crystallogr. 1969, 2 (2), 65–71.
(18) Pagola, S.; Stephens, P. W.; Bohle, D. S.; Kosar, A. D.; Madsen, S. K. The Structure of Malaria Pigment β-Haematin. Nature 2000, 404 (6775), 307–310.
(19) Bergmann, J.; Le Bail, A.; Shirley Iii, R.; Zlokazov, V. Renewed Interest in Powder Diffraction Data Indexing. Z. Krist. 2004, 219, 783–790.
(20) Le Bail, A.; Duroy, H.; Fourquet, J. L. Ab-Initio Structure Determination of LiSbWO6 by X-Ray Powder Diffraction. Mat. Res. Bull. 1988, 23, 447–452.
(21) Aubrey-Medendorp, C.; Parkin, S.; Li, T. The Confusion of Indexing Aspirin Crystals. J. Pharm. Sci. 2008, 97 (4), 1361–1367.
(22) Knoesen, D.; Kritzinger, S. Microtopographical Analysis of Surface Structures in a Scanning Electron Microscope. J. Microsc. 1983, 132 (1), 87–96.
(23) Johnston, C. T.; Helsen, J.; Schoonheydt, R. A.; Bish, D. L.; Agnew, S. F. Single-Crystal Raman Spectroscopic Study of Dickite. Am. Mineral. 1998, 83, 75–84.
(24) Beattie, I. R.; Gilson, T. R. Single Crystal Laser Raman Spectroscopy. Proc. R. Soc. A Math. Phys. Eng. Sci. 1968, 307 (1491), 407–429.
(25) Moriyama, K.; Furuno, N.; Yamakawa, N. Crystal Face Identification by Raman Microscopy for Assessment of Crystal Habit of a Drug. Int. J. Pharm. 2015, 480 (1), 101–106.
(26) Moriyama, K. Advanced Applications of Raman Imaging for Deeper Understanding and Better Quality Control of Formulations. Curr. Pharm. Des. 2016, 22, 4912–4916.
(27) Moriyama, K.; Onishi, H.; Ota, H. Visualization of Primary Particles in a Tablet Based on Raman Crystal Orientation Mapping. Pharm. Anal. Acta 2015, 6.
(28) Cowley, R. A. The Theory of Raman Scattering from Crystals. Proc. Phys. Soc. 1964, 84 (2), 281–296.
(29) Stoner-Ma, D.; Skinner, J. M.; Schneider, D. K.; Cowan, M.; Sweet, R. M.; Orville, A. M. Single-Crystal Raman Spectroscopy and X-Ray Crystallography at Beamline X26-C of the NSLS. J. Synchrotron Radiat. 2011, 18 (1), 37–40.
(30) Mannsfeld, S. C. B.; Fritz, T. Understanding Organic–inorganic Heteroepitaxial Growth of Molecules on Crystalline Substrates: Experiment and Theory. Phys. Rev. B 2005, 71 (23), 235405.
(31) Chadwick, K.; Myerson, A.; Trout, B. Polymorphic Control by Heterogeneous Nucleation - A New Method for Selecting Crystalline Substrates. CrystEngComm 2011, 13 (22), 6625.
(32) Chadwick, K.; Chen, J.; Myerson, A. S.; Trout, B. L. Toward the Rational Design of Crystalline Surfaces for Heteroepitaxy: Role of Molecular Functionality. Cryst. Growth Des. 2012, 12 (3), 1159–1166.
(33) Hooks, D. E.; Fritz, T.; Ward, M. D. Epitaxy and Molecular Organization on Solid Substrates. Adv. Mater. 2001, 13 (4), 227–241.
(34) Ward, M. D. Bulk Crystals to Surfaces: Combining X-Ray Diffraction and Atomic Force Microscopy to Probe the Structure and Formation of Crystal Interfaces. Chem. Rev. 2001, 101 (6), 1697–1725.
(35) Bonafede, S. J.; Ward, M. D. Selective Nucleation and Growth of an Organic Polymorph by Ledge-Directed Epitaxy on a Molecular Crystal Substrate. J. Am. Chem. Soc. 1995, 117 (30), 7853–7861.
(36) Olmsted, B. K.; Ward, M. D. The Role of Chemical Interactions and Epitaxy during Nucleation of Organic Crystals on Crystalline Substrates. CrystEngComm 2011, 13, 1070–1073.
(37) Wijethunga, T. K.; Baftizadeh, F.; Stojaković, J.; Myerson, A. S.; Trout, B. L. Experimental and Mechanistic Study of the Heterogeneous Nucleation and Epitaxy of Acetaminophen with Biocompatible Crystalline Substrates. Cryst. Growth Des. 2017, 17 (7), 3783–3795.
(38) Markey, M. K.; Tourassi, G. D.; Floyd, C. E. Decision Tree Classification of Proteins Identified by Mass Spectrometry of Blood Serum Samples from People with and without Lung Cancer. Proteomics 2003, 3 (9), 1678–1679.
(39) Liu, H.; Li, J.; Wong, L. A Comparative Study on Feature Selection and Classification Methods Using Gene Expression Profiles and Proteomic Patterns. Genome Informatics 2002, 13, 51–60.
(40) Yang, H.; Griffiths, P. R.; Tate, J. D. Comparison of Partial Least Squares Regression and Multi-Layer Neural Networks for Quantification of Nonlinear Systems and Application to Gas Phase Fourier Transform Infrared Spectra. Anal. Chim. Acta 2003, 489 (2), 125–136.
(41) Zou, T.; Dou, Y.; Mi, H.; Zou, J.; Ren, Y. Support Vector Regression for Determination of Component of Compound Oxytetracycline Powder on Near-Infrared Spectroscopy. Anal. Biochem. 2006, 355 (1), 1–7.
(42) Nam, S. H.; Han, S. H.; Lee, Y. Soft Independent Modeling of Class Analogy (SIMCA) Modeling of Laser-Induced Plasma Emission Spectra of Edible Salts for Accurate Classification. Appl. Spectrosc. 2017, 71 (9), 2199–2210.
(43) Fremout, W.; Kuckova, S.; Crhova, M.; Sanyova, J.; Saverwyns, S.; Hynek, R.; Kodicek, M.; Vandenabeele, P.; Moens, L. Classification of Protein Binders in Artist’s Paints by Matrix-Assisted Laser Desorption/Ionisation Time-of-Flight Mass Spectrometry: An Evaluation of Principal Component Analysis (PCA) and Soft Independent Modelling of Class Analogy (SIMCA). Rapid Commun. Mass Spectrom. 2011, 25 (11), 1631–1640.
(44) Balabin, R. M.; Safieva, R. Z. Gasoline Classification by Source and Type Based on near Infrared (NIR) Spectroscopy Data. Fuel 2008, 87 (7), 1096–1101.
(45) Jiao, L.; Bing, S.; Zhang, X.; Wang, Y.; Li, H. Application of Fluorescence Spectroscopy Combined with Interval Partial Least Squares to the Determination of Enantiomeric Composition of Tryptophan. Chemom. Intell. Lab. Syst. 2016, 156, 181–187.
(46) Luinge, H. J.; van der Maas, J. H.; Visser, T. Partial Least Squares Regression as a Multivariate Tool for the Interpretation of Infrared Spectra. Chemom. Intell. Lab. Syst. 1995, 28 (1), 129–138.
(47) Wilcox, K. E.; Blanch, E. W.; Doig, A. J. Determination of Protein Secondary Structure from Infrared Spectra Using Partial Least-Squares Regression. Biochemistry 2016, 55 (27), 3794–3802.
(48) Madsen, A. Ø.; Mason, S.; Larsen, S. A Neutron Diffraction Study of Xylitol: Derivation of Mean Square Internal Vibrations for H Atoms from a Rigid-Body Description. Acta Crystallogr. Sect. B Struct. Sci. 2003, 59 (5), 653–663.
(49) Ristic, R. I.; Finnie, S.; Sheen, D. B.; Sherwood, J. N. Macro- and Micromorphology of Monoclinic Paracetamol Grown from Pure Aqueous Solution. J. Phys. Chem. B 2001, 105 (38), 9057–9066.
(50) Kullback, S. Information Theory and Statistics; Dover Publications, 1997.
(51) Haisa, M.; Kashino, S.; Kawai, R.; Maeda, H. The Monoclinic Form of p-Hydroxyacetanilide. Acta Crystallogr. Sect. B Struct. Crystallogr. Cryst. Chem. 1976, 32 (4), 1283–1285.
(52) Kaminsky, W.; Glazer, A. M. Crystal Optics of D-Mannitol, C6H14O6: Crystal Growth, Structure, Basic Physical Properties, Birefringence, Optical Activity, Faraday Effect, Electro-Optic Effects and Model Calculations. Zeitschrift für Krist. - Cryst. Mater. 1997, 212 (4), 283–296.
(53) Kouwijzer, M. L. C. E.; van Eijck, B. P.; Kooijman, H.; Kroon, J. An Extension of the GROMOS Force Field for Carbohydrates, Resulting in Improvement of the Crystal Structure Determination of α-D-Galactose. Acta Crystallogr. Sect. B Struct. Sci. 1995, 51 (2), 209–220.


SYNOPSIS

A Raman spectroscopy-based method for the identification of crystal faces is described. The method is combined with an automated machine learning algorithm for high-throughput analysis of crystals obtained under actual experimental conditions, which are often not suitable for face indexing with single crystal X-ray diffraction.
