Anal. Chem. 2008, 80, 6658–6665
Advanced Calibration Strategy for in Situ Quantitative Monitoring of Phase Transition Processes in Suspensions Using FT-Raman Spectroscopy

Zeng-Ping Chen,*,† Gilles Fevotte,‡,§ Alexandre Caillet,§ David Littlejohn,† and Julian Morris|

Centre for Process Analytics and Control Technology, Department of Pure and Applied Chemistry, University of Strathclyde, Glasgow, G1 1XL, Scotland, U.K., Ecole Nationale Supérieure des Mines de Saint Etienne, Centre SPIN, 158 Cours Fauriel-42023 Saint-Étienne, France, LAGEP-Université Lyon I/CNRS-ESCPE Lyon, Bât 308, 43 Boulevard du 11 Novembre 1918, 69622 Villeurbanne Cedex, France, and Centre for Process Analytics and Control Technology, School of Chemical Engineering and Advanced Materials, Newcastle University, NE1 7RU, England, U.K.

There is an increasing interest in using Raman spectroscopy to identify polymorphic forms and monitor phase changes in pharmaceutical products for quality control. Compared with other analytical techniques for the identification of polymorphs such as X-ray powder diffractometry and infrared spectroscopy, FT-Raman spectroscopy has the advantages of enabling fast, in situ, and nondestructive measurements of complex systems such as suspension samples. However, for suspension samples, Raman intensities depend on the analyte concentrations as well as the particle size, overall solid content, and homogeneity of the solid phase in the mixtures, which makes quantitative Raman analysis rather difficult. In this contribution, an advanced model has been derived to explicitly account for the confounding effects of a sample’s physical properties on Raman intensities. On the basis of this model, a unique calibration strategy called multiplicative effects correction (MEC) was proposed to separate the Raman contributions due to changes in analyte concentration from those caused by the multiplicative confounding effects of the sample’s physical properties.
MEC has been applied to predict the anhydrate concentrations from in situ FT-Raman measurements made during the crystallization and phase transition processes of citric acid in water. The experimental results show that MEC can effectively correct for the confounding effects of the particle size and overall solid content of the solid phase on Raman intensities and, therefore, provide much more accurate in situ quantitative predictions of anhydrate concentration during crystallization and phase transition processes than traditional PLS calibration methods.

* Corresponding author. E-mail: [email protected]; zpchen2002@hotmail.com. † University of Strathclyde. ‡ Ecole Nationale Supérieure des Mines de Saint Etienne. § LAGEP-Université Lyon I/CNRS-ESCPE Lyon. | Newcastle University.

Analytical Chemistry, Vol. 80, No. 17, September 1, 2008

INTRODUCTION

Pharmaceutical Crystallization. Crystallization is widely used in pharmaceuticals and chemicals production as a purification
operation that results in a solid product. During a crystallization process a chemical compound may manifest as more than one polymorphic form.1 Different polymorphic structures may result in markedly different physical and chemical properties such as solubility, color, and even bioavailability, which play an important role in the performance of the final product.2 Consequently, understanding, modeling, and controlling crystallization processes are essential to produce products with the desired physical properties. Moreover, it should be stressed that even when studies of the phase equilibria governing the crystallization systems have been performed, the unexpected appearance or disappearance of a polymorphic form can jeopardize production. Undesirable generation of new polymorphs can lead to serious consequences if the transition takes place during the crystallization process or during further downstream operations, including storage of the final solid dosage form.3-6 This is the reason for the dramatic increase in the need for fast and simple at-line/online/in-line process analytical technologies for the identification of polymorphs in the pharmaceutical industry.7,8

(1) Bernstein, J. Polymorphism in Molecular Crystals; Clarendon Press: Oxford, U.K., 2002.
(2) Brittain, H. G. Polymorphism in Pharmaceutical Solids; Marcel Dekker: New York, 1999.
(3) Threlfall, T. L. Analyst 1995, 120, 2435–60.
(4) Bernstein, J.; Henck, J. O. Cryst. Eng. 1998, 1, 119–128.
(5) Bauer, J.; Spanton, S.; Henry, R.; Quick, J.; Dziki, W.; Porter, W.; Morris, J. Pharm. Res. 2001, 18, 859–866.
(6) Zhang, G. G. Z.; Gu, C.; Zell, M. T.; Burkhardt, R. T.; Munson, E. J.; Grant, D. J. W. J. Pharm. Sci. 2002, 91, 1089–1100.
(7) Workman, J., Jr.; Koch, M.; Veltkamp, D. J. Anal. Chem. 2003, 75, 2859–2876.
(8) Folestad, S.; Johansson, J. Eur. Pharm. Rev. 2003, 4, 36–42.
(9) Auer, M. E.; Griesser, U. J.; Sawatzki, J. J. Mol. Struct. 2003, 661-662, 307–317.
(10) Tian, F.; Zeitler, J. A.; Strachan, C. J.; Saville, D. J.; Gordon, K. C.; Rades, T. J. Pharm. Biomed. Anal. 2006, 40, 271–280.
10.1021/ac800987m. 2008 American Chemical Society. Published on Web 07/30/2008.

Quantitative Raman Spectroscopy. FT-Raman spectroscopy has recently emerged as one of the fastest, most reliable, and most suitable techniques to identify polymorphic forms and can readily be used routinely to monitor phase changes in pharmaceutical products and in quality control assays.9 There are several advantages of using FT-Raman spectroscopy3,10 for the identification of polymorphs over other analytical techniques such as X-ray
powder diffractometry and infrared spectroscopy. For example, almost no sample preparation is required for Raman spectroscopy, which facilitates the in-line/online quantitative analysis of polymorphic mixtures. Moreover, since water has a very weak Raman spectrum, aqueous slurry samples can be directly analyzed. These advantages have stimulated growing interest in the implementation of Raman spectroscopy for the analysis of pharmaceutical compounds.11 Even so, a surprisingly small number of quantitative Raman studies have been reported.12-27 Moreover, compared with other optical spectroscopies, most of the quantitative applications of Raman spectroscopy concern analysis of homogeneous solutions.12-17 In contrast, many issues remain unresolved regarding the in-line/in situ quantitative measurement of the quality of the dispersed solid phase by Raman spectroscopy. There are, therefore, few reported applications of this type in the open literature.18-27 In Raman scattering spectroscopy, the intensity of analyte peaks depends not only on the analyte concentration but also on the intensity of the excitation source, the instrument’s optical configuration, and the sample alignment. Therefore, obtaining quantitative information requires the use of internal or external standards.15,16,20,28 Band ratios between the overall Raman intensities and that of an individual spectral peak arising from internal or external standards are calculated and used in quantitative analysis. For solutions, the reference peak is usually a solvent peak which is unaffected by the analyte concentration. In the absence of a reference peak in the sample matrix, an inert internal standard is often added. Sometimes, external standards are used, but this approach can be difficult to apply accurately in many in situ process analysis applications.
For samples involving solids such as suspensions, quantitative Raman analysis becomes even more difficult, because the Raman intensity from such samples depends on the particle size, density, and homogeneity of the mixtures,20,23 which hinders the use of an internal or external standard. For example, during a crystallization process, the solid content is not constant but rather increases from zero to a certain value, and the particle size of the crystals also evolves during the whole process. These features of a crystallization process make the in-line quantitative use of Raman spectroscopy complicated, as these variations must be taken into account and successfully modeled during the development of the calibration procedure. For complicated solvent-mediated polymorphic form transformation processes, where the Raman intensities depend not only on the concentration of individual polymorphic forms but also on the overall solid content and particle size of the solid phases in the mixtures, univariate calibration models based on band ratios are largely unsuccessful. To facilitate quantitative analysis for more complex suspension samples, multivariate calibration methods such as PCA and PLS have recently received considerable attention in quantitative FT-Raman spectroscopy. The application of these multivariate calibration methods has some advantages over univariate band ratio calibration models in the quantitative analysis of FT-Raman measurements.12,14 However, when analyzing suspension samples using FT-Raman spectroscopy, the variations in the samples’ physical properties such as particle size, overall solid content, and homogeneity of mixtures have confounding effects on the total Raman intensities. Such confounding effects are difficult to capture through the application of standard multivariate calibration methods and, therefore, will affect the prediction accuracy of multivariate calibration models.

(11) Mulvaney, S. P.; Keating, C. D. Anal. Chem. 2000, 72, 145R–157R.
(12) Pelletier, M. J. Appl. Spectrosc. 2003, 57, 20A–42A.
(13) Sato-Berrú, R. Y.; Medina-Valtierra, J.; Medina-Gutiérrez, C.; Frausto-Reyes, C. Spectrochim. Acta, Part A 2004, 60, 2225–2229.
(14) Szostak, R.; Mazurek, S. J. Mol. Struct. 2004, 704, 235–245.
(15) Scheweinsberg, D. P.; West, Y. D. Spectrochim. Acta, Part A 1997, 53, 25–34.
(16) Aarnoutse, P. J.; Westerhuis, J. A. Anal. Chem. 2005, 77, 1228–1236.
(17) De Paepe, A. T. G.; Dyke, J. M.; Hendra, P. J.; Langkilde, F. W. Spectrochim. Acta, Part A 1997, 53, 2267–2273.
(18) Caillet, A.; Rivoire, A.; Galvan, J. M.; Puel, F.; Févotte, G. Cryst. Growth Des. 2007, 7 (10), 2080–2087.
(19) Févotte, G. Chem. Eng. Res. Des. 2007, 85 (A7), 906–920.
(20) Caillet, A.; Puel, F.; Fevotte, G. Int. J. Pharm. 2006, 307, 201–208.
(21) Hu, Y. R.; Liang, J. K.; Myerson, A. S.; Taylor, L. S. Ind. Eng. Chem. Res. 2005, 44, 1233–1240.
(22) Ono, T.; Kramer, H. J. M.; ter Horst, J. H.; Jansens, P. J. Cryst. Growth Des. 2004, 4, 1161–1167.
(23) O’Sullivan, B.; Barrett, P.; Hsiao, G.; Carr, A.; Glennon, B. Org. Process Res. Dev. 2003, 7, 977–982.
(24) Schöll, J.; Bonalumi, D.; Vicum, L.; Mazzotti, M.; Muller, M. Cryst. Growth Des. 2006, 6, 881–891.
(25) Starbuck, C.; Spartalis, A.; Wai, L.; Wang, J.; Fernandez, P.; Lindemann, C. M.; Zhou, G. X.; Ge, Z. Cryst. Growth Des. 2002, 2, 515–522.
(26) Wang, F.; Wachter, J. A.; Antosz, F. J.; Berglund, K. A. Org. Proc. Res. Dev. 2000, 4, 391–395.
(27) Agarwal, P.; Berglund, K. A. Cryst. Growth Des. 2003, 3, 941–946.
(28) Zheng, X.; Fu, W.; Albin, S.; Wise, K. L.; Javey, A.; Cooper, J. B. Appl. Spectrosc. 2001, 555, 382–388.
The objectives of this study are (1) to derive a model to explicitly account for the confounding effects of particle size and density of solid phases on Raman intensities, (2) to develop a calibration strategy to separate the Raman contributions due to the changes of concentration of individual polymorphic forms from the confounding effects of samples’ physical properties, and (3) to achieve accurate quantitative determination of the concentration of individual polymorphic forms during solvent-mediated phase transition processes.

THEORY

Raman Intensities of Suspension Systems. The intensity of Raman bands depends on a complex expression involving the polarizability tensor of a molecule.29 For analytical purposes, the following less rigorous linear model, analogous to the Beer-Lambert law, can be used.

$$I(\nu) = n\,R(\nu)\,I_o \qquad (1)$$
where I(ν) is the measured Raman intensity at Raman shift ν, Io is the intensity of the excitation radiation, n is the number of molecules of the analyte illuminated by the source and viewed by the spectrometer, and R(ν) is a composite term that represents the overall spectrometer response and the self-absorption and molecular scattering properties of the analyte. For K suspension samples consisting of two polymorphic forms (say, α and β forms) of a chemical compound, their overall Raman intensities can be expressed as the linear combination of the contributions of the two polymorphic forms in the solid state, the dissolved solute, the solvent, and other possible interference(s) such as fluorescence.

$$I_k(\nu) = \left[ n_{\alpha,k} R_{\alpha}(\nu) + n_{\beta,k} R_{\beta}(\nu) + n_{\mathrm{solute},k} R_{\mathrm{solute}}(\nu) + n_{\mathrm{solvent},k} R_{\mathrm{solvent}}(\nu) + R_{\mathrm{interf}}(\nu) \right] I_{o,k}, \quad k = 1, 2, \ldots, K \qquad (2)$$

Suppose dk is the overall solid concentration of the solid phase in the kth suspension sample. V denotes the volume illuminated by the source and viewed by the spectrometer. csolute,k, csolvent,k, wα,k, and wβ,k (wα,k + wβ,k = 1) signify the molar concentrations of the solute and solvent and the mass fractions of the α and β form solids in the kth suspension sample, respectively, and M is the molecular weight of the chemical compound. Equation 2 then becomes

$$I_k(\nu) = \left[ \frac{w_{\alpha,k} d_k V}{M} R_{\alpha}(\nu) + \frac{w_{\beta,k} d_k V}{M} R_{\beta}(\nu) + c_{\mathrm{solute},k} V R_{\mathrm{solute}}(\nu) + c_{\mathrm{solvent},k} V R_{\mathrm{solvent}}(\nu) + R_{\mathrm{interf}}(\nu) \right] I_{o,k} \qquad (3)$$

Equation 3 holds only when the particle size of the solid phase in these suspension samples is constant or has no significant effects on Raman intensities. However, the particle size of the solid phase generally varies during chemical and pharmaceutical processes. It was also found that if the average diameter of particles is much greater than the wavelength of the radiation scattered by the particles, the relative Raman intensities of all the Raman peaks for the sample compound vary linearly with the particle size.30 Therefore, another multiplicative parameter pk should be introduced in eq 3 to account for the effects of particle size. For simplicity, it is assumed that the particle size of the two polymorphic forms in the same sample is approximately the same.

$$I_k(\nu) = \left[ p_k \frac{w_{\alpha,k} d_k V}{M} R_{\alpha}(\nu) + p_k \frac{w_{\beta,k} d_k V}{M} R_{\beta}(\nu) + c_{\mathrm{solute},k} V R_{\mathrm{solute}}(\nu) + c_{\mathrm{solvent},k} V R_{\mathrm{solvent}}(\nu) + R_{\mathrm{interf}}(\nu) \right] I_{o,k} \qquad (4)$$

Since the effects of parameters pk, dk, and Io,k on the Raman contributions of the two polymorphic forms have the same multiplicative nature, eq 4 can be simplified as follows.

$$I_k(\nu) = q_k w_{\alpha,k} R_{\alpha}(\nu) + q_k w_{\beta,k} R_{\beta}(\nu) + \left[ c_{\mathrm{solute},k} V R_{\mathrm{solute}}(\nu) + c_{\mathrm{solvent},k} V R_{\mathrm{solvent}}(\nu) + R_{\mathrm{interf}}(\nu) \right] I_{o,k} \qquad (5)$$

where

$$q_k = \frac{p_k d_k V I_{o,k}}{M}$$

As seen from eq 5, the multiplicative parameter qk may be different for each suspension sample. In order to extract quantitative information about the α or β polymorphic forms in the suspension samples from the Raman measurements, parameter qk for each suspension sample needs to be known and its confounding effect removed.

(29) Anderson, A. The Raman Effect, Vol. 2: Applications; Marcel Dekker, Inc: New York, 1973.
(30) Pellow-Jarman, M. V.; Hendra, P. J.; Lehnert, R. J. Vib. Spectrosc. 1996, 12, 257–261.

Estimation and Correction of Multiplicative Parameter qk. Since wα,k + wβ,k = 1, eq 5 can be rewritten as

$$I_k(\nu) = q_k w_{\alpha,k} \left[ R_{\alpha}(\nu) - R_{\beta}(\nu) \right] + q_k R_{\beta}(\nu) + \left[ c_{\mathrm{solute},k} V R_{\mathrm{solute}}(\nu) + c_{\mathrm{solvent},k} V R_{\mathrm{solvent}}(\nu) + R_{\mathrm{interf}}(\nu) \right] I_{o,k} \qquad (6)$$

Assuming the Raman responses of the α and β form solids, solute, solvent, and possible interferences (Rα(ν), Rβ(ν), Rsolute(ν), Rsolvent(ν), and Rinterf(ν)) are linearly independent of each other, it can be seen that the relationship between Raman spectra (Ik, k = 1, 2, ..., K) and the mass fractions of the α form solids (wα,k, k = 1, 2, ..., K) is nonlinear. A straightforward multivariate linear calibration model can be built only between Ik and wα,kqk (or qk). In order to apply multivariate bilinear calibration methods such as PLS to this type of Raman spectral data, the multiplicative parameter qk for each suspension sample must be estimated, which can be achieved by the optical path-length estimation and correction method (OPLEC) of Chen et al.31 After the estimation of qk (k = 1, 2, ..., K), the following two calibration models can be built by multivariate linear calibration methods such as PLS
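$$\operatorname{diag}(\mathbf{w}_{\alpha})\,\mathbf{q} = [\mathbf{1}, \mathbf{Z}]\,\mathbf{b}_1, \qquad \mathbf{q} = [\mathbf{1}, \mathbf{Z}]\,\mathbf{b}_2 \qquad (7)$$

$$\mathbf{Z} = [\mathbf{I}_1; \mathbf{I}_2; \ldots; \mathbf{I}_K], \quad \mathbf{w}_{\alpha} = [w_{\alpha,1}; w_{\alpha,2}; \ldots; w_{\alpha,K}], \quad \mathbf{q} = [q_1; q_2; \ldots; q_K]$$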
where diag(wα) denotes the diagonal matrix in which the corresponding diagonal elements are the elements of wα; 1 is a column vector with its elements equal to unity. The two estimated regression vectors b1 and b2 could then be used to predict the mass fraction of the α form in any test sample from its Raman spectrum Itest.
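$$q_{\mathrm{test}} w_{\alpha,\mathrm{test}} = [1, \mathbf{I}_{\mathrm{test}}]\,\mathbf{b}_1, \qquad q_{\mathrm{test}} = [1, \mathbf{I}_{\mathrm{test}}]\,\mathbf{b}_2, \qquad w_{\alpha,\mathrm{test}} = \frac{[1, \mathbf{I}_{\mathrm{test}}]\,\mathbf{b}_1}{[1, \mathbf{I}_{\mathrm{test}}]\,\mathbf{b}_2} \qquad (8)$$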
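As a minimal sketch of how eqs 7 and 8 fit together, the following Python example fits the two regression vectors and forms their ratio on a synthetic two-channel data set generated from eq 5. It assumes that the multiplicative parameters qk of the training samples are already known (in the paper they are estimated by OPLEC, which is not reproduced here), that the background term is the same for every sample, and it substitutes ordinary least squares for PLS, which is adequate only for this tiny noise-free toy problem. The response vectors, background, and function names are hypothetical and chosen only for illustration.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def least_squares(X, y):
    """Ordinary least squares via the normal equations (stand-in for PLS)."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve_linear(XtX, Xty)


def mec_fit(spectra, w_alpha, q):
    """Eq 7: regress diag(w_alpha) q and q on [1, Z] to obtain b1 and b2."""
    X = [[1.0] + list(s) for s in spectra]
    b1 = least_squares(X, [w * qk for w, qk in zip(w_alpha, q)])
    b2 = least_squares(X, q)
    return b1, b2


def mec_predict(spectrum, b1, b2):
    """Eq 8: w_alpha,test = ([1, I_test] b1) / ([1, I_test] b2)."""
    x = [1.0] + list(spectrum)
    numerator = sum(xi * bi for xi, bi in zip(x, b1))
    denominator = sum(xi * bi for xi, bi in zip(x, b2))
    return numerator / denominator


# Hypothetical two-channel responses and constant background (assumptions).
R_ALPHA = (1.0, 0.2)
R_BETA = (0.3, 0.9)
BACKGROUND = (0.5, 0.4)


def simulate(w_alpha, q):
    """Eq 5 with a fixed background term."""
    return tuple(
        q * w_alpha * ra + q * (1.0 - w_alpha) * rb + bg
        for ra, rb, bg in zip(R_ALPHA, R_BETA, BACKGROUND)
    )


train_w = [0.0, 0.25, 0.5, 0.75, 1.0, 0.5]
train_q = [1.0, 2.0, 3.0, 4.0, 5.0, 1.5]   # assumed known here
train_spectra = [simulate(w, q) for w, q in zip(train_w, train_q)]

b1, b2 = mec_fit(train_spectra, train_w, train_q)
w_test = mec_predict(simulate(0.6, 2.7), b1, b2)
print(round(w_test, 6))  # recovers the simulated mass fraction, 0.6
```

Because the test sample uses a value of q that does not appear in the training set, the example also shows that the ratio in eq 8 cancels the multiplicative factor rather than memorizing it.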
The mass fraction of the β form in the test sample can be obtained in a similar way or, more simply, calculated as 1 - wα,test. The above method for the estimation and correction of multiplicative parameter qk is based on the OPLEC method of Chen et al.31 As the name implies, OPLEC was originally designed to correct for the variations of optical path-length caused by physical light scattering. Since the nature of the confounding effects of particle size and overall solid content of the solid phase discussed in this paper has nothing to do with optical path-length, the above procedure for modeling the confounding effects is renamed as the multiplicative effects correction (MEC) method hereafter.

(31) Chen, Z. P.; Morris, J.; Martin, E. Anal. Chem. 2006, 78, 7674–7681.

EXPERIMENTAL SETUP AND PROCEDURE

Figure 1. Schematic representation of the laboratory-scale crystallization equipment used to perform solvent-mediated phase transition experiments.

Apparatus. Figure 1 is a schematic representation of the laboratory-scale crystallization equipment used to perform solvent-mediated phase transition experiments. The glass reactor was equipped with a jacket and a condenser. A high efficiency profiled propeller was used to maintain homogeneous mixing of the particles in suspensions during the experiments. The whole operating device was instrumented and microcomputer-controlled to allow the tracking of temperature trajectories and/or of constant
temperature set-points. Raman measurements were recorded in situ using the ReactRA Raman spectrometer (Mettler-Toledo) equipped with an integrated stabilized 785 nm laser diode light source (nominal output, 300 mW), an open electrode charge coupled device (CCD) detector (1024 pixels × 256 pixels) cooled by a Peltier element, and a 16 mm diameter immersion Hastelloy probe sealed with a sapphire window. As shown in Figure 1, the probe was connected to the spectrometer using an optical fiber (200 µm diameter). The crystallization and phase transition process of citric acid in water was investigated as a model system. The solubility of citric acid in water is very high, about 1.5 kg/kg of water at 20 °C. The anhydrate is more stable than the monohydrate at temperatures above 34 °C, while the monohydrate becomes the stable form below this temperature. Both the monohydrate and anhydrous solids were purchased from Acros Organics and used in distilled water without further purification. The particle size distribution of the particles was not known. Calibration Measurements. To facilitate in situ quantitative monitoring of solvent-mediated phase transition processes of citric acid using FT-Raman spectroscopy, a calibration model is required to relate the concentrations of the individual polymorphic forms to the Raman measurements. For calibration purposes, 25 citric acid suspension samples with varying solid contents (5, 10, 15, 20, and 25 wt %) and different weight ratios of anhydrate: monohydrate (0:1, 0.25:0.75, 0.5:0.5, 0.75:0.25, and 1:0) were prepared in saturated solutions of monohydrate in a 250 mL crystallizer maintained at 15 °C. The Raman spectra were acquired at a spectral resolution of 7 cm-1; it was assumed that no significant phase transition occurred during the first acquisitions of spectra. Crystallization and Phase Transition Processes. During solid processing, the presence of liquid phase promotes the occurrence of phase transition phenomena. 
Solvent-mediated phase transitions (SMPT) are known to take place during particulate processes and, in particular, during suspension crystallization.32 Four different crystallization and phase transition processes of citric acid were studied in this paper. A 2.5 L temperature-controlled glass crystallizer was used for all four experiments. These experiments focused on the investigation of the nucleation and growth kinetics of monohydrate and anhydrous citric acid. In order to identify kinetic phenomena and improve the “observability” of the systems, it was essential to separate the various mechanisms involved in crystallization and phase transition processes. Therefore, seeding was performed in these experiments to avoid dealing with primary nucleation, on the one hand, and to ensure satisfactory experimental reproducibility, on the other hand. Seed crystals were prepared from sieved monohydrate or anhydrous particles, depending on the process in question. The size class of seed particles was selected to be in the range of 250-315 µm.

Process 1 (Anhydrate to Monohydrate Phase Transition Process at 15 °C). An anhydrous suspension was prepared in the crystallizer through the isothermal seeded crystallization of a supersaturated citric acid solution. The amount of anhydrous citric acid crystals was between 6 and 6.8 wt %. The suspension was then left under stirring at 15 °C for 24 h in order to make sure that the anhydrous solubility was reached. Prior to the introduction of monohydrate seed to initiate the anhydrate to monohydrate phase transition process, about 3 wt % of monohydrate particles was dissolved in the crystallizer. The obtained suspension was therefore supersaturated with respect to the monohydrate form. The solvent-mediated phase transition process was then initiated through the introduction of monohydrate seed particles. Collection of FT-Raman spectra started after the introduction of the seed particles.

Process 2 (Anhydrate to Monohydrate Phase Transition Process at 32 °C).
The same operating mode as in “Process 1” was applied, except that the temperature was maintained at 32 °C.

Process 3 (Anhydrate Crystallization Process between 42 and 32 °C). Initially, 2 L of supersaturated citric acid solution was prepared in the crystallizer at 42 °C. The temperature of the crystallizer decreased from 42 to 32 °C. The crystallization process was initiated through the introduction of anhydrous seed particles in the supersaturated solution. The seed crystals grew while secondary nucleation was taking place. FT-Raman spectra were collected during the crystallization process.

Process 4 (a Two-Step Process). This process consists of two steps. The first step was the same anhydrate crystallization process as “Process 3”. It was then followed by the isothermal seeded anhydrate to monohydrate phase transition similar to “Process 2”. FT-Raman spectroscopy was used to collect the spectral data during the whole process.

(32) Cardew, P. T.; Davey, R. J. Proc. R. Soc. London, Ser. A 1985, 398, 415–428.

RESULTS AND DISCUSSION

Figure 2. (a) The Raman spectra for suspension samples with the same 25% total solid content but varying anhydrate to monohydrate ratios (dashed line, 0:1; solid line, 0.5:0.5; dotted line, 1:0); (b) the Raman spectra for samples with an anhydrate to monohydrate ratio of 0.5:0.5 but varying total solid contents (red solid line, 5%; blue dashed line, 10%; green dotted line, 15%; black dashed-dotted line, 20%; cyan dashed-dotted-dotted line, 25%).

Figure 2a highlights the sensitivity of the Raman spectra between 1090 and 1286 cm-1 to the solid phase in the suspension samples. Distinctive peaks of both anhydrate (at around 1142 and 1208 cm-1) and monohydrate (at about 1104, 1165 and 1254 cm-1) can be readily observed. These Raman peaks are relatively sensitive to the changes in weight ratios between anhydrate and monohydrate, which is a preferable feature for calibration model
building. Therefore, the Raman intensities between 1090 and 1286 cm-1 form the basis of the following data analysis. Besides the weight ratio between anhydrate and monohydrate, the overall solid content in the suspension samples also has significant influence on the Raman intensities. Figure 2b shows the Raman spectra of suspension samples with the same weight ratio between the anhydrate and monohydrate (0.5:0.5) but with different total solid contents. The increase of the overall solid content in the suspensions induces a corresponding increase of the height of the characteristic peaks of both the anhydrate and monohydrate. For example, the characteristic peak of monohydrate at about 1165 cm-1 is almost completely masked by the strong background Raman intensities from the solution phase and other possible interferences (e.g., fluorescence) when the overall solid content in the suspension sample is as low as 5%. Once the overall solid content increases to 10%, a small peak appears above the background, and this peak becomes more prominent when the overall solid content further increases to 25%. The influence of the solid content on Raman intensities will lead to confounding effects and hence large errors in the estimation of the concentrations of anhydrate (or monohydrate) particles in suspension samples if the effects are not corrected. Such confounding effects are difficult to remove by methods based on the band ratio between the peak heights/areas of anhydrate and monohydrate due to the presence of significant and varying background Raman contributions from the solution phase and other possible interferences (e.g., fluorescence) as seen in Figure 2b. As stated in the Theory section, the relationship between Raman intensities and the mass fractions of anhydrate (or monohydrate) in the suspension samples is nonlinear due to the existence of confounding effects of the overall solid content and/or particle size.
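The difficulty with band ratios can be illustrated with a small numerical sketch based on the multiplicative model of eq 5. It assumes, purely for illustration, hypothetical unit peak responses for the two forms and an additive background that is the same at both bands; none of these numbers come from the measurements. Without a background, the anhydrate/monohydrate band ratio is independent of the multiplicative factor q, whereas with a background it drifts with q even though the mass fraction is unchanged.

```python
# Hypothetical peak responses of the pure forms at their own characteristic bands.
R_ANHYDRATE_PEAK = 1.0
R_MONOHYDRATE_PEAK = 0.8


def band_ratio(w_anhydrate, q, background):
    """Ratio of anhydrate to monohydrate peak heights under the eq 5 model.

    q is the multiplicative factor (solid content, particle size, laser power);
    background is an additive contribution assumed equal at both bands.
    """
    anhydrate_peak = q * w_anhydrate * R_ANHYDRATE_PEAK + background
    monohydrate_peak = q * (1.0 - w_anhydrate) * R_MONOHYDRATE_PEAK + background
    return anhydrate_peak / monohydrate_peak


# Same 0.5:0.5 mixture, low versus high multiplicative factor.
ratio_clean_low = band_ratio(0.5, q=1.0, background=0.0)   # 1.25
ratio_clean_high = band_ratio(0.5, q=5.0, background=0.0)  # 1.25
ratio_bg_low = band_ratio(0.5, q=1.0, background=2.0)      # about 1.04
ratio_bg_high = band_ratio(0.5, q=5.0, background=2.0)     # 1.125

print(ratio_clean_low, ratio_clean_high, ratio_bg_low, ratio_bg_high)
```

In this toy setting the ratio changes by roughly 8% between the two values of q once a background is present, which is the kind of bias a band-ratio calibration cannot separate from a change in composition.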
It can be expected, therefore, that multivariate linear calibration methods such as PLS are unsuitable for the analysis of Raman spectral data of suspension systems. To verify this expectation, the Raman spectra of the 25 calibration samples were divided into a training set of 20 samples and a test set of 4 samples (one sample was identified as an outlier and hence discarded). The four test samples are 0:1 at 5 wt %, 0.25:0.75 at 20 wt %, 0.75:0.25 at 10 wt %, and 1:0 at 25 wt %. To mitigate the temperature effects on the accuracy of predictions, the Raman spectra of two samples without solid phase were measured at 42 °C and also included in the training set. Calibration models between the Raman intensities and the mass fractions of anhydrate were built by PLS with mean-centering. When building calibration models using the PLS method, one of the tricky tasks is deciding the optimal number of components used in the PLS model. Generally, it is determined by scrutinizing the root-mean-square error (RMSE) values of PLS models with different numbers of components. The RMSE values (in weight fraction) of PLS models for both training and test samples are shown in Figure 3. The RMSE value for the test samples reaches its minimum (0.077) when two underlying components are used in the calibration model. Including one more underlying component in the PLS calibration model leads to a further decrease in the RMSE value for training samples but an abrupt increase in the RMSE value for test samples. Such a phenomenon strongly suggests that the optimal PLS model with two underlying components has not really captured the underlying relationship between the Raman intensities and anhydrate concentrations in the suspension samples, and its predictions would be unreliable. Parts a and b of Figure 4 show the predicted anhydrate concentrations for the training and test samples by PLS models with two and four underlying components, respectively. For the PLS model with two underlying components, the accuracy of the predictions for training samples is rather poor (RMSE = 0.081), which means there are still some systematic variations left unmodeled by the two-component PLS calibration model. When the number of components used in the PLS calibration model increases to four, the accuracy of predictions for the training samples is greatly improved (RMSE = 0.042); however, the price of such improvement is the significant deterioration of the accuracy of predictions for the test samples (RMSE = 0.12). All these facts point to one conclusion, i.e., the nonlinear confounding effects of the solid content on the Raman intensities cannot be effectively handled by multivariate linear calibration methods such as PLS. Unlike PLS, which can only implicitly and partly model the multiplicative confounding effects through linear approximation, the MEC method proposed in this paper is specifically designed for the analysis of Raman spectral data contaminated with multiplicative confounding effects. The same training and test sets modeled by PLS were used to build MEC calibration models. MEC involves the estimation of the multiplicative parameter qk for each training sample by OPLEC.31 The application of OPLEC requires the determination of the actual number of spectral variation sources including chemical components and possible interference(s) in the Raman spectra of the training samples. The Raman intensities of the suspension system studied in this paper have contributions from two polymorphic forms (i.e., anhydrate and monohydrate) in solid phase, the dissolved solute, the solvent and possible fluorescence.

Figure 3. RMSE values for training (△) and test (○) samples for PLS (black dashed lines) and MEC (red solid lines) models with different numbers of underlying components.
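The RMSE criterion used above to select the number of PLS components can be sketched as follows. The function names are hypothetical, and the dictionary holds only the two test-set RMSE values for PLS that are quoted in the text (two and four components); the full curves are shown in Figure 3 and are not reproduced here.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between predicted and reference values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def best_component_count(test_rmse_by_components):
    """Return the number of components with the lowest test-set RMSE."""
    return min(test_rmse_by_components, key=test_rmse_by_components.get)


# Test-set RMSE values (weight fraction) quoted in the text for PLS.
pls_test_rmse = {2: 0.077, 4: 0.12}
best_pls = best_component_count(pls_test_rmse)
print(best_pls)  # 2

# Hypothetical predictions, used only to exercise rmse().
example_rmse = rmse([0.0, 1.0], [0.1, 0.9])
```

Selecting on the test-set RMSE rather than the training-set RMSE matters here because the training error keeps falling as components are added, while the test error does not.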
Principal component analysis of the Raman spectra of the training set also reveals that there are five significant eigenvalues, so the number of spectral variation sources can be set to 5 for the implementation of OPLEC. After the calculation of the multiplicative parameter qk for each training sample, MEC models with different numbers of underlying components can be built.
Figure 4. The anhydrate concentrations for training (○) and test (▲) samples predicted by PLS models with two (a) and four (b) underlying components, respectively. Diagonal line: theoretically correct predictions.
For the purpose of comparison, the predictive performance of the MEC calibration models for both the training and test sets is displayed alongside that of the PLS calibration models in the RMSE plots in Figure 3. It can be seen that the RMSE values of the MEC models for the training samples are consistently lower than those of the PLS models when the number of underlying components included in the calibration models is less than six. This means that the MEC models can capture more systematic variation in the Raman spectral data than the PLS models at the same level of model complexity. The RMSE profile of MEC for the test samples also reflects the advantages of MEC over PLS in modeling confounding effects. The MEC calibration model with four underlying components attains the minimal RMSE value (0.031) for the test samples, which is less than half of the minimal RMSE value of the PLS models. A further increase in the number of underlying components included in the MEC calibration model offers no obvious improvement or deterioration in terms of the RMSE value for the test samples. So the optimal MEC model with four underlying components captures the real relationship between the Raman intensities and anhydrate concentrations in the suspension samples. It is, therefore, reasonable to expect that such a calibration model will provide robust predictions.
Figure 5. The anhydrate concentrations for training (○) and test (▲) samples predicted by the MEC model with four underlying components. Diagonal line: theoretically correct predictions.
Figure 5 provides the graphic presentation of the predictive performance
of the optimal MEC model for both the training and test samples. Compared with the predictions of the optimal PLS model shown in Figure 4a, the predictions of the optimal MEC model are much better in terms of both precision and accuracy. Since the ultimate goal of building calibration models between Raman intensities and anhydrate concentrations is to provide predictions from in situ FT-Raman measurements of phase transition processes, a more convincing comparison between the optimal MEC and PLS models should be based on their predictive performance for the four crystallization and phase transition processes. The anhydrate concentration profiles predicted by the optimal PLS and MEC models for the four processes are presented in Figure 6. For the phase transition process at 15 °C (“Process 1”), the profile predicted by MEC shows that the anhydrate concentration remained at the initial value of 100% for about 60 min and then quickly decreased to the end point value of about 4.0%. Though it is expected that the anhydrate concentration at the end point of the transition process should be zero, the prediction of about 4.0% is reasonable considering that the phase transition process was monitored by FT-Raman for only about 180 min, and the anhydrate might not have fully transformed into monohydrate. The PLS model gives a similar prediction to that of the MEC model for the initial anhydrate concentration in this process. However, the end point anhydrate concentration predicted by the PLS model is about -20%, which is clearly erroneous, since a concentration cannot be negative. This illustrates that multiplicative confounding effects that have not been fully accounted for by the PLS model have a detrimental influence on the predictive performance of the model.
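The effect described above can be reproduced on simulated data. The following is a minimal sketch, not the authors' implementation: the two Gaussian "pure spectra", the noise-free signal model x = q (c s_anh + (1 - c) s_mono), the ranges of q and c, and all function names are hypothetical. The multiplicative parameter q is taken as known for the training samples (in the paper it is estimated by OPLEC), and the ratio of two linear models is used here only as one plausible way to exploit such a parameter.

```python
import numpy as np

rng = np.random.default_rng(0)
channels = np.arange(50.0)
# Hypothetical pure-component spectra: two weakly overlapping Gaussian bands.
s_anh = np.exp(-((channels - 20.0) ** 2) / 18.0)
s_mono = np.exp(-((channels - 30.0) ** 2) / 18.0)

def simulate(n):
    """Noise-free spectra scaled by a per-sample multiplicative factor q."""
    c = rng.uniform(0.0, 1.0, n)   # anhydrate fraction of the solid phase
    q = rng.uniform(0.5, 1.5, n)   # assumed multiplicative parameter
    X = q[:, None] * (np.outer(c, s_anh) + np.outer(1.0 - c, s_mono))
    return X, c, q

def fit_linear(X, y):
    """Least-squares linear model with intercept (minimum-norm solution)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=1e-10)
    return coef

def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

X_train, c_train, q_train = simulate(30)
X_test, c_test, _ = simulate(20)

# Direct linear calibration of c, ignoring the multiplicative effect.
c_direct = predict_linear(fit_linear(X_train, c_train), X_test)

# Two linear models, one for q and one for q*c; their ratio estimates c.
model_q = fit_linear(X_train, q_train)
model_qc = fit_linear(X_train, q_train * c_train)
c_ratio = predict_linear(model_qc, X_test) / predict_linear(model_q, X_test)

rmse_direct = rmse(c_test, c_direct)
rmse_ratio = rmse(c_test, c_ratio)
print(f"direct linear RMSE: {rmse_direct:.3f}, ratio RMSE: {rmse_ratio:.2e}")
```

In this idealized setting both q and q·c are exactly linear in the spectrum, whereas c itself is not, so the direct linear model is left with a systematic error that grows toward the extremes of q and can push predictions outside the 0 to 1 range.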
Compared with the concentration profile for the phase transition process at 15 °C, the anhydrate concentration profile predicted by MEC for the phase transition process at 32 °C (“Process 2”) shows a similar evolving trend but with a higher transition rate. There is also some difference between the predictions of the MEC model for the initial anhydrate concentrations of these two processes. The difference might be caused by the temperature effects on the Raman intensities that have not been fully modeled by simply including two background spectra obtained at a temperature higher than 15 °C (i.e., Raman spectra of two blank samples without the solid phase) into the training set. Since there
Figure 6. The anhydrate concentration profiles predicted by the optimal PLS (a) and MEC (b) models for the phase transition processes at 15 °C (blue dashed-dotted-dotted line) and 32 °C (black dashed line), the anhydrate crystallization process (red solid line), and the two-step process (green dotted line), respectively.
is no solid phase in the blank samples, the temperature effects on the Raman contributions of the solid phase cannot be properly corrected. Furthermore, the solubilities of both anhydrate and monohydrate citric acid crystals at 32 °C are significantly higher than their corresponding solubilities at 15 °C, which may also affect the predictive accuracy of calibration models. However, there are currently no appropriate methods available to deal with the temperature problem for suspension systems. It should be noted that despite the possible interference from the temperature effects, the end point anhydrate concentration predicted by the MEC model for the phase transition process at 32 °C is very close to the corresponding value for the phase transition process at 15 °C. The initial and end point anhydrate concentrations provided by PLS for the phase transition process at 32 °C are about 89% and 9%, respectively, significantly different from the corresponding values (about 100% and -20%, respectively) for the phase transition process at 15 °C. The failure of PLS to provide similar predictions for these two similar processes reveals that the PLS model is highly sensitive to changes in external process variables, resulting in unreliable predictions.
Theoretically, the anhydrate concentration of the crystallization process (“Process 3”) should remain at zero when no anhydrate crystals are present in the reactor. When the nucleation process starts, the anhydrate concentration should increase from zero to 100% instantly and stay at this level thereafter, because no phase transition occurs during the crystallization process, even though the overall solid content evolves. The two-step process (“Process 4”) includes a crystallization process and a phase transition process, so the first part of the theoretical trajectory of the anhydrate concentration should be the same as for “Process 3”. When the phase transition process starts, the anhydrate concentration decreases from 100% to a value close to zero. The anhydrate concentration profiles predicted by the MEC model for both the crystallization process (“Process 3”) and the two-step process (“Process 4”) are quite impressive and successfully follow the theoretical trajectories. The predictions of the MEC model for the initial anhydrate concentrations of the two processes are about 0.3% and 2.5%, respectively. Considering the existence of unavoidable model discrepancies, which, in particular, can be expected from the fact that very small amounts of solids are present in the reactor at the nucleation point, these are reasonably good predictions of the expected values of zero. The end point anhydrate concentration predicted by the MEC model for the two-step process (“Process 4”) approaches the corresponding values for the phase transition processes at both 15 and 32 °C. Such consistent predictive performance demonstrates the excellent robustness of the MEC calibration model. In contrast, the predictions of the PLS model for the initial anhydrate concentrations for both the crystallization (“Process 3”) and two-step processes (about 5% and 6%, respectively) deviate more significantly from the expected values of zero. 
Moreover, PLS gives a significant negative prediction (about -9%) for the end point anhydrate concentration of the two-step process. It is also worth pointing out that the anhydrate concentration profiles predicted by PLS for both the crystallization process (“Process 3”) and the two-step process (“Process 4”) show a slow rather than an expected instant transition from the initial value to the maximum value. This is further convincing evidence that the PLS model cannot effectively model the multiplicative confounding effects, and that the variation of the solid content during the crystallization processes greatly affects the reliability of PLS predictions.
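The figures quoted in the discussion above can be put side by side with a few lines of arithmetic. The sketch below uses only the approximate values stated in the text for the cases where both models are reported numerically. Taking zero as the reference for the Process 1 end point is the idealized expectation, since the text notes that the transformation may have been incomplete. The dictionary layout and function name are illustrative choices.

```python
# Approximate predicted anhydrate concentrations (%) quoted in the text.
# The expected value is zero in each of these three cases.
reported = {
    "Process 1 end point": {"MEC": 4.0, "PLS": -20.0},
    "Process 3 initial": {"MEC": 0.3, "PLS": 5.0},
    "Process 4 initial": {"MEC": 2.5, "PLS": 6.0},
}
expected = 0.0

def mean_abs_deviation(model):
    """Mean absolute deviation from the expected value, in percentage points."""
    deviations = [abs(values[model] - expected) for values in reported.values()]
    return sum(deviations) / len(deviations)

mad_mec = mean_abs_deviation("MEC")
mad_pls = mean_abs_deviation("PLS")
print(f"MEC: {mad_mec:.2f} points, PLS: {mad_pls:.2f} points")
```

On these three cases the mean absolute deviation is roughly 2.3 percentage points for MEC against roughly 10.3 for PLS, which summarizes the qualitative comparison made in the text.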
CONCLUSIONS The Raman intensity of suspension samples depends not only on the analyte concentration but also on the physical properties of samples such as particle size, overall solid concentration, and homogeneity of the solid phase. It was observed that traditional multivariate linear calibration methods such as PLS could not effectively separate the Raman contributions due to the changes in analyte concentration from those caused by the variations of samples’ physical properties. An advanced model has been derived to decompose the total Raman intensities into the linear combination of the contributions from solid phases, dissolved solute, solvent, and other possible interferences such as fluorescence. In this model, a multiplicative parameter is introduced for each sample to explicitly account for the confounding effects of the particle size and overall solid content of the solid phase on the Raman intensities. The multiplicative parameters for a set of training samples (the concentrations of the target analyte are known for these samples) can be estimated from their Raman spectra. Hence, the detrimental effects of the particle size and overall solid content of the solid phase on the Raman spectra of either the training set or future test samples can be removed through a unique calibration procedure. This strategy has been applied to the extraction of quantitative information on polymorphic forms from the in situ FT-Raman measurements made during the crystallization and phase transition processes of citric acid in water. Experimental results show this method can effectively correct for the confounding effects of particle size and overall solid content of the solid phase and therefore provide much more accurate in situ quantitative predictions for polymorphic forms than a PLS calibration method. It was also found that the temperature difference between the calibration samples on which the calibration model is built and the processes to which the calibration model is applied can affect the predictive performance of a calibration model. There are no appropriate methods available to deal with this problem for suspension systems. Our future work will focus on extension of the model proposed in this paper to account for the temperature effects on Raman intensities.
Received for review May 14, 2008. Accepted June 30, 2008.
SUPPORTING INFORMATION AVAILABLE MATLAB code for the estimation of multiplicative parameters qk by OPLEC. This material is available free of charge via the Internet at http://pubs.acs.org.
AC800987M
Analytical Chemistry, Vol. 80, No. 17, September 1, 2008