ARTICLE pubs.acs.org/ac

Optimization of Near-Infrared Spectroscopic Process Monitoring at Low Signal-to-Noise Ratio

Hendrik Schneider and Gabriele Reich*

Department of Pharmaceutical Technology and Biopharmaceutics, University of Heidelberg, INF 366, 69120 Heidelberg, Germany

ABSTRACT: An approach for the optimization of near-infrared (NIR) spectroscopic process monitoring at low signal-to-noise ratio is presented. It comprises the combined adjustment of different measurement variables and data pretreatments considering the prediction error, economic aspects of the application, and process constraints. The integration time, light intensity, and number of averaged spectra were varied; their mutual influence on the prediction error of partial least squares (PLS) models (i.e., root-mean-square error of cross-validation (RMSECV)) was evaluated in the laboratory. At low signal levels, the spectral uncertainty had a strong impact on the prediction error. It leveled off with increasing values of all three parameters and was finally dominated by other sources of uncertainty. The experimental findings could be characterized and explained by a mathematical equation, which was deduced from theoretical principles. The knowledge about the interaction of the measurement variables allowed their combined adjustment, resulting in a reduced impact of spectral uncertainty on the prediction error (i.e., root-mean-square error of prediction (RMSEP)) without additional costs or process modifications. Moreover, a convenient procedure to compensate for the stray light caused by strongly absorbing windows was developed. The whole approach was successfully applied to a challenging process, namely, the NIR inline monitoring of the liquid content of two model substances in a rotating suspension dryer.

Received: November 18, 2010. Accepted: January 17, 2011. Published: February 15, 2011.

© 2011 American Chemical Society

dx.doi.org/10.1021/ac103032w | Anal. Chem. 2011, 83, 2172–2178

Near-infrared (NIR) spectroscopy based on fiber optics is commonly used for fast and noninvasive process monitoring.1-5 The success of a NIR measurement depends on a number of variables, e.g., measurement parameters and data processing. Adjusting several measurement variables to achieve a low prediction error is often not feasible without disadvantages for the process or higher costs. For example, a low integration time may be required when a high number of samples has to be measured, to avoid a high total measurement time and related costs, or when the material is moving fast.6,7

One such challenging process application is the NIR inline monitoring of a thermal drying process in a FSD suspension dryer (FIMA Maschinenbau GmbH, Obersontheim, Germany). The dryer combines different steps of solid-liquid separation and drying in one filter basket.8 During filter centrifugation, a cake of solid material is formed (Figure 1). The product is then dried to the final liquid content by a drying gas. During this process step it can be advantageous to preserve the cake. For this purpose, the rotation of the drum has to be maintained continuously, resulting in high velocities of the drum during the analytical measurements, which are associated with low integration times. Furthermore, the NIR light is strongly attenuated by the 50 mm thick windows. Hence, the measurements are affected by a high level of noise, i.e., spectral uncertainty.

Figure 1. Suspension dryer with spectroscopic arrangement for process measurements.

Depending on the measurement variables, the prediction error of a measurement is influenced by different sources of uncertainty. The error propagation in multivariate calibration has been studied by various authors, who accounted for uncertainty in the reference values and spectra.9-13 For the experimental assessment of different magnitudes of spectral uncertainty, the focus was mainly on the addition of synthetic noise to real or artificial spectral data.14-16 This requires knowledge about the potentially interfering sources of variance and their structure in order to approximate specific measurements. Few papers include results of real data with different noise levels at a low signal-to-noise ratio (SNR). Greensill and Walsh17 performed calibrations of sucrose solutions at five levels of the coefficient of variation (CV) by varying the signal level and the number of scans. A precision plateau was reached for CVs below 0.022%. At a higher CV, calibration performance was significantly poorer. Hu et al.18 compared five different numbers of averaged scans and concluded that within the application 64 scans were “good enough for the spectral measurement” with no further improvement for more scans.

The aim of our study was to develop an approach for the optimization of inline NIR measurements at low SNR and to apply this approach to the process monitoring in the suspension dryer. In addition to the NIR prediction error and the measuring frequency, nonanalytical aspects of the application were to be considered, such as related costs and constraints imposed by the process application (e.g., maximum rotation velocity during the measurements). For this purpose, the interactions of different measurement variables concerning their effect on the prediction error of NIR measurements were evaluated in the laboratory using real data with various noise levels. In addition, the type of data pretreatment was varied. For data interpretation, it was presumed that the measurement parameters influence the prediction error (root-mean-square error of cross-validation (RMSECV)) primarily via the ratio of spectral uncertainty to net analyte signal. The influence of the measurement parameters was characterized by mathematical equations. Additionally, the application of thick windows was evaluated in the laboratory, and a procedure to reduce the undesirable effect of the resulting stray light was developed. Supported by theoretical considerations, the laboratory experiments were intended to allow various process variables to be adjusted in a combined way rather than focusing on each variable separately. In this way, an overall optimization of the inline monitoring of the liquid content in a rotating suspension dryer was to be achieved.

■ THEORY

In NIR spectroscopy, multivariate calibration leads to predictive models, which correlate analyte concentrations (or other target variables) with spectral responses. For inverse methods, only the concentrations of the components of interest need to be known. The associated calibration model can be expressed as

$$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{e} \tag{1}$$

where $\mathbf{y}$ (n × 1) is the vector of analyte concentrations, $\mathbf{X}$ (n × w) the matrix of spectral responses, $\mathbf{b}$ (w × 1) the regression vector, and $\mathbf{e}$ (n × 1) a vector of residuals, i.e., the unmodeled part. The number of wavelengths (w) often exceeds the number of samples (n). When the resulting under-determined system is solved, the information of many wavelengths can be projected onto a lower number of new orthogonal factors, which are subsequently used for the regression. PLS regression is one such method.

Various sources of uncertainty propagate into the prediction error (PE) of analytical measurements, i.e., into the difference between the predicted and true value. Faber and Kowalski derived an approximation for the variance of the PE, which corresponds to the expected squared prediction error $E[\mathrm{PE}^2]$, for linear relationships, random and independent noise, and mean-centered data:11,19

$$V(\mathrm{PE}) = E[\mathrm{PE}^2] \approx (n^{-1} + h)\left(\sigma_e^2 + \sigma_{\Delta y}^2 + \lVert\mathbf{b}\rVert^2 \sigma_{\Delta X}^2\right) + \sigma_e^2 + \lVert\mathbf{b}\rVert^2 \sigma_{\Delta X}^2 \tag{2}$$

where $\sigma_e^2$ is the variance of the residuals, $\sigma_{\Delta y}^2$ the variance of calibration concentrations, and $\sigma_{\Delta X}^2$ the variance of the spectra. The leverage h describes the relation of the test sample to the calibration space. The Euclidean norm of the regression vector, $\lVert\mathbf{b}\rVert$, can be expressed as the reciprocal of the sensitivity, i.e., the scalar of the net analyte signal (NAS) generated by an analyte concentration equal to unity.10 Hence, according to the equation, the expected squared PE increases with the uncertainty in the calibration model (first term), the variance of the residual of the sample (second term), and the ratio of the uncertainty in the sample spectra to NAS (last term).

During this study, an equation was derived based on theoretical considerations and the approximation of Faber and Kowalski to describe the influence of the measurement parameters (integration time, light intensity, and number of averaged spectra) on the prediction error, namely, on the root-mean-square error (RMSE) of prediction or cross-validation. Under the assumption that no bias is included in the RMSE, it was approximated by the square root of the variance of the PE (eq 2):

$$\mathrm{RMSE} \approx \sqrt{V(\mathrm{PE})} \approx \sqrt{(n^{-1} + h)\left(\sigma_e^2 + \sigma_{\Delta y}^2\right) + \sigma_e^2 + (n^{-1} + h + 1)\left(\frac{\sigma_{\Delta X}}{\mathrm{NAS}}\right)^2} \tag{3}$$

The equation was simplified presuming that (regarding the application at the suspension dryer) the measurement parameters primarily influence the spectral uncertainty. Other effects, such as a larger sample volume of moving material at higher integration times, were neglected. Accordingly, all uncertainties that affect the RMSE independent of the investigated measurement parameter were summarized in the constant $F_0$. In order to cover common noise characteristics, a term was introduced for spectral noise components whose ratio of standard deviation to NAS decreases proportionally with the reciprocal of the measurement parameter (par) to the power of the variable p. Factors multiplied by this term were summarized in the factor $F_{par}$:

$$\mathrm{RMSE} \approx \sqrt{F_0^2 + F_{par}\left(\frac{1}{par^{p}}\right)^2} \tag{4}$$

Specific terms for several noise sources ($\sigma_{\Delta X}^2 = \sigma_{\Delta X,1}^2 + \sigma_{\Delta X,2}^2 + \dots$) would allow a more accurate representation of the spectral noise characteristics. However, these were not introduced in order to keep the equation convenient to use.
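The relation between eqs 2, 3, and 4 can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes that eq 4 reads RMSE ≈ sqrt(F0² + Fpar·(1/par^p)²), and all numerical values (n, h, the variances, NAS, F0, Fpar, p) are hypothetical and chosen only for illustration. It confirms that eq 3 is the square root of eq 2 once the norm of b is written as 1/NAS, and that the RMSE of eq 4 falls toward the plateau F0 as the parameter increases.

```python
import math


def variance_pe(n, h, var_e, var_dy, var_dx, b_norm):
    """Eq 2: approximate variance of the prediction error."""
    spectral = b_norm ** 2 * var_dx  # ||b||^2 * sigma_dX^2
    return (1.0 / n + h) * (var_e + var_dy + spectral) + var_e + spectral


def rmse_eq3(n, h, var_e, var_dy, var_dx, nas):
    """Eq 3: square root of eq 2 with ||b|| written as 1/NAS."""
    ratio_sq = var_dx / nas ** 2  # (sigma_dX / NAS)^2
    return math.sqrt((1.0 / n + h) * (var_e + var_dy) + var_e
                     + (1.0 / n + h + 1.0) * ratio_sq)


def rmse_eq4(par, f0, f_par, p):
    """Eq 4 (as read here): plateau F0 plus a term falling as 1/par**p."""
    return math.sqrt(f0 ** 2 + f_par * (1.0 / par ** p) ** 2)


# Hypothetical values, chosen only for illustration.
n, h = 27, 0.1
var_e, var_dy, var_dx, nas = 0.04, 0.01, 0.0009, 0.05

# Both routes give the same RMSE.
print(math.sqrt(variance_pe(n, h, var_e, var_dy, var_dx, 1.0 / nas)))
print(rmse_eq3(n, h, var_e, var_dy, var_dx, nas))

# The RMSE decreases with the parameter and levels off near F0.
for par in (1, 4, 16, 64, 256):
    print(par, round(rmse_eq4(par, f0=0.4, f_par=4.0, p=0.5), 3))
```

With p = 0.5 the parameter behaves like a number of averaged spectra acting on white noise; other exponents simply change how quickly the plateau is approached.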

■ EXPERIMENTAL SECTION

Variation of Measurement Variables. Radiation of a high-power tungsten halogen HL-2000-HP light source (Mikropack, Ostfildern, Germany) and a CLH 600 light source (Carl Zeiss, Jena, Germany) was coupled into a 6 or 7 fiber bundle of a specially designed process reflection probe (Solvias, Basel, Switzerland). The total intensity of sample illumination was adjusted to six levels by using different combinations of source and bundle (Table 1). The light was transmitted through a 50 mm thick window (1st test series, borosilicate; 2nd test series, quartz GE124) to the sample, which was placed in a repository. The light was scattered back and guided by a single fiber of the probe to a MCS611NIR2.0 diode-array spectrometer (Carl Zeiss, Jena, Germany). Spectra (1310-1950 nm, 10 nm spectral resolution, 229 data points) of two model substances, i.e., calcium carbonate (KSL Staubtechnik, Lauingen, Germany) and lactose monohydrate (Lactochem, Domo, Zwolle, Netherlands), with different amounts of water and ethanol (Karl-Josef Kost, Koblenz, Germany; denatured with methyl ethyl ketone) were collected. The integration time and the number of averaged spectra were varied, resulting in different sets of spectra for each sample set. The signal intensities covered a wide range depending on the parameters (maximum intensity of each spectrum between 20 and 51 000 counts).

Table 1. Measurement Variables and Data Pretreatments Used for Laboratory Experiments(a)

light intensity (normalized): 1st test series: 1.0, 2.8, 8.3, 11.1, 14.7, 15.8; 2nd test series: 14.7
integration time: 1st test series: 6, 10.5, 21, 42, 168, and 420 ms; 2nd test series: 2.5, 5, 10, 20, 100, and 200 ms
averaged spectra: 1st test series: 1, 2, 5, 20, 80, 140; 2nd test series: 1, 2, 5, 10, 20, 40
data pretreatment: 2nd derivative (Savitzky-Golay: 2nd polynomial order, window size 9), multiplicative scatter correction (MSC), mean normalization

(a) The italic numbers of averaged spectra were not collected at all integration times.

The sample spectra were referenced by the ratio to the spectrum of a poly(tetrafluoroethylene) (PTFE) disk (final form, 1/R). The reference spectrum was updated approximately every hour. Subsequently, different common data pretreatments were applied (Table 1) and partial least squares (PLS) regressions of the liquid content were calculated (Unscrambler 8.0 and 9.8, Camo, Oslo, Norway). The influence of the different measurement variables on the ratio of the root-mean-square noise to the average signal (NSR) and on the prediction error was examined and characterized mathematically (eq 4). The root-mean-square noise of the unreferenced spectra was calculated analogously to the standard deviation, using at each wavelength the difference to the intensity of a reference spectrum of 20-140 averaged spectra (depending on the set of compared spectra). The root-mean-square error of cross-validation (full cross-validation; sporadic outliers were omitted) was used as an indicator for the prediction error.

Stray Light. Because of the 50 mm thick windows, a high percentage of stray light was expected to occur. Its effect was evaluated in the laboratory. The stray light was approximated by spectra of a water-filled container and by spectra, collected in a dark room, of the empty container with the back plane removed. To compensate for its effect, the spectrum of the water-filled container was subtracted from the sample spectra and the reference spectrum prior to further data processing.
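The compensation step described above can be sketched in a few lines. This is an illustrative sketch on synthetic four-point spectra, not the authors' code: the function names are hypothetical, 1/R is taken to mean the reference intensity divided by the sample intensity, and a partially filled illuminated area is modeled, as an assumption, by scaling the sample signal with a constant factor on top of a fixed stray-light spectrum.

```python
def subtract_stray(spectrum, stray):
    """Subtract the stray-light spectrum (water-filled container) pointwise."""
    return [v - w for v, w in zip(spectrum, stray)]


def reference_1_over_r(sample, reference):
    """Reference the sample against the PTFE spectrum; 1/R = reference / sample."""
    return [r / s for s, r in zip(sample, reference)]


def mean_normalize(spectrum):
    """Divide each value by the mean of the spectrum."""
    m = sum(spectrum) / len(spectrum)
    return [v / m for v in spectrum]


def process(sample, reference, stray, compensate):
    """Optionally compensate stray light, then reference and normalize."""
    if compensate:
        sample = subtract_stray(sample, stray)
        reference = subtract_stray(reference, stray)
    return mean_normalize(reference_1_over_r(sample, reference))


# Synthetic intensities in counts (hypothetical).
stray = [300.0, 280.0, 310.0, 290.0]
sample_signal = [1000.0, 600.0, 800.0, 900.0]
reference = [2300.0, 2380.0, 2210.0, 2290.0]  # PTFE signal plus stray light

# Same material, illuminated area half filled and completely filled.
half_fill = [0.5 * v + w for v, w in zip(sample_signal, stray)]
full_fill = [1.0 * v + w for v, w in zip(sample_signal, stray)]


def max_difference(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


without = max_difference(process(half_fill, reference, stray, False),
                         process(full_fill, reference, stray, False))
with_comp = max_difference(process(half_fill, reference, stray, True),
                           process(full_fill, reference, stray, True))
print(without, with_comp)
```

In this toy model the normalized spectra of the two fill levels differ when the stray light is left in, and coincide once it has been subtracted from both the sample and the reference spectrum.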
The proposed procedure was tested using laboratory spectra of the model substances at different fill levels of the illuminated area and process spectra collected in the rotating suspension dryer.

Process Measurements in Suspension Dryer. On the basis of the laboratory experiments, a near-infrared system for the measurement of the liquid content in the suspension dryer was set up and the parameters were optimized. The arrangement was similar to the laboratory experiments, using the HL-2000-HP high-power light source and the seven fiber bundle of the probe (Figure 1). The quartz window was embedded in the seal plate and rotated with the drum, while the fiber probe was mounted in a fixed position in front of the plate. Whenever the window moved past the probe, the collection of a spectrum (1310-1950 nm, 10 nm spectral resolution, 229 data points) was triggered by the programmable controller of the dryer. The speed of rotation applied during the process resulted in a low integration time of 14 ms. For test measurements with the model substances, 20 spectra were averaged after transfer to the computer. Reference spectra of a PTFE disk were collected before every process run. For calibration modeling, spectra of samples collected at 200 ms (40 averaged spectra) in the laboratory were used. After compensation for the stray light and application of the data pretreatments, PLS regressions were calculated. For each test measurement (i.e., model validation), the liquid content in the suspension dryer was predicted during rotation. Then, the process was stopped, a reference sample near the window was collected, and the liquid content was quantified gravimetrically; for water/ethanol mixtures both gravimetric measurements and Karl Fischer titration were used. In addition, various drying processes of the model substances in the suspension dryer were monitored.
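The noise measure defined in this section (root-mean-square noise of unreferenced spectra relative to a well-averaged reference spectrum, and its ratio to the average signal, NSR) can be sketched as follows. This is a hedged illustration: the function names are hypothetical, the divisor used for the mean square (the number of points rather than that number minus one) is an assumption, and the numbers are synthetic.

```python
import math


def rms_noise(spectra, reference):
    """RMS difference between repeated raw spectra and a reference spectrum.

    The squared differences are pooled over all spectra and wavelengths.
    Dividing by the number of points (not that number minus one) is an
    assumption of this sketch.
    """
    squared = 0.0
    count = 0
    for spectrum in spectra:
        for value, ref in zip(spectrum, reference):
            squared += (value - ref) ** 2
            count += 1
    return math.sqrt(squared / count)


def noise_to_signal_ratio(spectra, reference):
    """NSR: RMS noise divided by the average signal of the reference."""
    average_signal = sum(reference) / len(reference)
    return rms_noise(spectra, reference) / average_signal


# Synthetic example: two noisy repeats around a two-point reference (counts).
reference = [100.0, 200.0]
repeats = [[101.0, 199.0], [99.0, 201.0]]

print(rms_noise(repeats, reference))
print(noise_to_signal_ratio(repeats, reference))
```

Here the reference stands in for the spectrum obtained from 20-140 averaged spectra; in practice it would carry a small residual noise of its own.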

■ RESULTS AND DISCUSSION

Spectral Uncertainty. The laboratory experiments revealed high-frequency fluctuations in addition to slight baseline-offsets. On visual inspection, the fluctuations appeared homoscedastic. The corresponding root-mean-square noise decreased linearly with the reciprocal of the square root of the number of averaged spectra. The noise changed only marginally with increasing light intensity. It increased strongly with the integration time (e.g., the ratio of the mean values at 2.5 and 200 ms was 2.6 with a standard deviation of 0.3), with a nonzero y-intercept. The observations indicated that read noise and thermal noise were major sources of these high-frequency fluctuations. Shot noise, which would increase with the signal intensity, was only a minor source.

Additionally, a baseline-offset with a constant absolute value occurred randomly between the spectra of the same sample, even when several spectra had been averaged by the spectrometer unit before transfer to the computer (Figure 2). Except for high signal levels, the offset and its absolute value were independent of the light intensity and the integration time. Hence, its value was below 0.1% of high signal levels, but at low signal levels the percentage increased. The origin of this discrete baseline-offset is currently unknown but is suspected to result from instrumental effects. It may also occur with other spectrometers. During the laboratory experiments, several spectra of one sample were collected without any alteration of the experimental setup. Therefore, the baseline-offset could be identified and quantified by calculating the difference between these spectra. Furthermore, the offset could be corrected with an algorithm implemented in an Aspect Plus (Carl Zeiss, Jena, Germany) macro. After the algorithm had subtracted the mean of one spectrum from the means of the others, the resulting values had two different levels. The predefined constant offset was assigned to the spectra accordingly.
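The two-level assignment described above might look roughly like the sketch below. The Aspect Plus macro itself is not reproduced in the text, so everything here is an assumption made for illustration: the function names, the choice of the first spectrum as the comparison spectrum, and the rule that a difference of means larger than half the predefined offset counts as "offset present".

```python
import math


def assign_offsets(spectra, offset, anchor=0):
    """Assign 0 or +/-offset to each repeated spectrum of one sample.

    The mean of the anchor spectrum is subtracted from the mean of every
    spectrum. The differences are expected to fall on two levels (near zero
    and near the predefined offset); half the offset serves as the decision
    threshold (an assumption of this sketch).
    """
    means = [sum(s) / len(s) for s in spectra]
    assigned = []
    for m in means:
        diff = m - means[anchor]
        if abs(diff) > offset / 2.0:
            assigned.append(math.copysign(offset, diff))
        else:
            assigned.append(0.0)
    return assigned


def remove_offsets(spectra, assigned):
    """Subtract the assigned offset from every point of each spectrum."""
    return [[v - a for v in s] for s, a in zip(spectra, assigned)]


# Synthetic repeats of one sample; the second carries an offset of 5 counts.
repeats = [
    [10.0, 12.0, 14.0],
    [15.0, 17.0, 19.0],
    [10.2, 11.9, 14.1],
]
assigned = assign_offsets(repeats, offset=5.0)
corrected = remove_offsets(repeats, assigned)
print(assigned)
print(corrected[1])
```

A comparison of means only yields offsets relative to the anchor spectrum; deciding which of the two levels is the offset-free one needs additional information that this sketch does not model.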
For the subsequent processing of the spectra, three strategies were distinguished.

Strategy I: Averaging of spectra was performed by the spectrometer; the spectra were not changed by the algorithm. Hence, the constant baseline-offset occurred randomly and independent of the number of averaged spectra (Figure 2).

Strategy II: With the use of the algorithm, the baseline-offset of each spectrum was partially corrected. Each new baseline-offset was calculated as the arithmetic mean of the assigned offset values of unprocessed spectra, whose number corresponded to the number of “averaged spectra”. During processing of the spectra by the algorithm, each old offset was subtracted and the new one added. It is believed that the spectral noise will show the same characteristics if complete spectra are averaged after transfer to the computer.

Strategy III: With the use of the algorithm, the assigned constant offset was completely corrected by subtracting it from each spectrum, resulting in mainly high-frequency fluctuations in all spectra. It is believed that the spectral noise will show the same characteristics if the offset does not occur at all.

Figure 2. Five spectra of the same sample at 2.5 ms integration time: An averaging between 1 and 20 single spectra was performed by the spectrometer unit for each of them.

For the prediction error, the ratio of spectral uncertainty to net analyte signal is important. Its characteristics were approximated by the ratio of the root-mean-square noise to the average signal (NSR). The signal intensity increased proportionally with the integration time and the light intensity but not with the number of averaged spectra. Hence, given the noise characteristics described above, the NSR decreased less with the number of averaged spectra than with the light intensity or the integration time (Table 2).

Table 2. Mathematical Characterization of the Influence of Measurement Parameters on NSR(a)

strategy I: averaged spectra 0.0; integration time 1.0; light intensity 0.8
strategy II: averaged spectra 0.5; integration time 0.9; light intensity 0.9
strategy III: averaged spectra 0.4; integration time 1.0 (low values), 0.6 (high values); light intensity 1.0

(a) By curve fitting, a linear relationship with the expression 1/parameter^d was observed. The values of the variable d are given in the table. For high-frequency fluctuations (strategy III) the d-value of the integration time was smaller at high parameter values than at low values, where the total noise was dominated by the y-intercept. For the noise calculation, the constant baseline-offset of the reference spectra was corrected.

Effect of Measurement Variables on RMSECV. The prediction error, i.e., the RMSECV with the same unit as the sample values, decreased strongly with both increasing integration time (Figure 3) and light intensity. It leveled off to a plateau in a region of higher values of the parameters. The effect of an increased number of averaged spectra was less pronounced, especially when the averaging was performed by the spectrometer (strategy I). Considering two parameters simultaneously, two-dimensional plateaus were observed (Figure 4).

Figure 3. Influence of integration time on RMSECV of PLS regressions of ethanol content of lactose at a different number of averaged spectra (strategy II, 3 PCs, 27 samples, 0-20% liquid content, normalization; assuming normally distributed prediction errors and therefore a χ² distribution of the mean square error of prediction, the relative uncertainty in the RMSECV can be estimated to be 14%19).

Figure 4. Influence of light intensity and integration time on RMSECV of PLS regressions of water content of calcium carbonate after normalization without averaging of spectra using a borosilicate window (3 PCs, 2 sets of 13 samples, 0.6-22.6% water content; the relative uncertainty in the RMSECV can be estimated to be 20%19).

In most cases, the lowest RMSECV in the plateau region was observed after MSC as data pretreatment (regions of high absorption at wavelengths 1377-1596 nm and above 1836 nm for water and 1670-1781 nm for ethanol were omitted; first overtone and combination of water, first overtone of C-H bonds20); at lower values of light intensity, integration time, and number of averaged spectra (Figure 5), RMSECVs were lowest after MSC and mean normalization. With the use of Savitzky-Golay differentiation, the overall RMSECV was higher and the decrease with the integration time was less pronounced.

Figure 5. Influence of integration time and number of averaged spectra on RMSECV of PLS regressions of water content of calcium carbonate after normalization (left), differentiation (middle), and MSC (right) using a quartz window (strategy II, 3 PCs, set of 14 samples at each data point, 0.2-5.1%; the relative uncertainty in RMSECV can be estimated to be 19%19).

The occurrence of the discrete baseline-offset led to an increase of the RMSECV with a reduced plateau area (data not shown). The effect of the offset could not be fully compensated by mean normalization or MSC. Generally, the latter allows a correction of additive effects that have a constant value over all wavelengths. However, the offset values of the wavelengths, which were identical for each spectrum, were altered by referencing with a PTFE spectrum before pretreatment by MSC. For other applications with a low signal level, the occurrence of an offset should be investigated and the measurement procedure adapted, e.g., by external averaging of the spectra (strategy II).

The occurrence of a plateau of the prediction error was consistent with experimental results of other applications.17 A more detailed analysis of the relation between the RMSECV and the parameters studied revealed that it could be approximated by eq 4 (Figure 3). Additionally, the values of the variable d for characterization of the influence of the measurement parameters on the NSR (Table 2) generally correlated with the values of the variable p of eq 4 for characterization of their influence on the RMSECV after normalization or MSC (Table 3). This indicated the accordance of the experimental results with the theoretical considerations and with the presumption that the measurement parameters influence the prediction error primarily via the ratio of spectral uncertainty to net analyte signal.

Table 3. Mathematical Characterization of Influence of Parameters on RMSECV(a)

strategy I, normalization/MSC: averaged spectra 0.1(b); integration time 1.1(b)
strategy I, SG: averaged spectra 0.3; integration time 0.5
strategy II, normalization/MSC: averaged spectra 0.6(b); integration time 1.3(b)
strategy II, SG: averaged spectra 0.5; integration time 0.5
strategy III, normalization/MSC: averaged spectra 0.4(b); integration time 1.2(b)
strategy III, SG: averaged spectra 0.3; integration time 0.5

(a) By curve fitting, values of the variable p of eq 4 were found. The water content in calcium carbonate was predicted after normalization, MSC, and SG differentiation. (b) The values after normalization and MSC were similar and averaged.

The plateau area was likely dominated by sources of uncertainty or error that affected the RMSECV independent of the investigated parameter (eq 4, F0). An example is the inaccuracy of the reference data. At low parameter values, the spectral uncertainty dominated and the RMSECV was governed by the characteristics of the NSR. The dominance of the different sources of uncertainty in the respective regions was reinforced by the quadratic addition of the contributions in error propagation (squares effect).
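The squares effect can be made concrete with a short calculation. The sketch below assumes that the parameter-independent contribution (F0 in eq 4) and the spectral contribution add in quadrature, as in eqs 3 and 4; the function name and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def combined_rmse(f0, spectral):
    """Quadrature sum of the constant and the spectral contribution."""
    return math.sqrt(f0 ** 2 + spectral ** 2)


f0 = 0.40  # hypothetical plateau level, in units of the liquid content

# Spectral contribution half as large, then twice as large as F0.
for spectral in (0.20, 0.80):
    total = combined_rmse(f0, spectral)
    dominant = max(f0, spectral)
    excess = 100.0 * (total / dominant - 1.0)
    print(f"spectral={spectral:.2f}  total={total:.3f}  "
          f"excess over dominant term={excess:.1f}%")
```

In both cases the combined RMSE exceeds the larger contribution by only about 12%, although the smaller contribution is half as large. This is why the plateau region looks as if it were governed by F0 alone, and the low-parameter region as if it were governed by the spectral noise alone.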

Figure 6. Spectra of different amounts of lactose and of the water-filled container (left), the same spectra after mean normalization (right, gray), and after compensation of stray light before mean normalization (right, black).

Data pretreatments have different impacts on undesirable spectral variations, targeted spectral features, and finally also on the prediction error.21,22 For mathematical reasons, differentiation likely increased the ratio of spectral noise to analyte signal. Hence, the prediction errors were higher after differentiation. In addition, the importance of the high-frequency fluctuation increased relative to the baseline offset. The less pronounced decrease of the RMSECV with the integration time after differentiation can be explained by its occurrence in a region of high parameter values, where the slope of the NSR was also lower (Table 2). Additionally, high values of the RMSECV after differentiation were limited in their further increase because they approached an RMSECV level that was maximal for the sample set.

The noise in particular affected the calculation of the second and third principal component (PLS component) of the regression for water content determination (data not shown). Hence, the PLS model, which is based on linear assumptions, probably lost its ability to reflect nonlinear relations by using additional principal components.23 These nonlinearities can arise from different bonding interactions of water with the sample matrix and/or additional effects. Accordingly, the bias of the samples with low and high moisture content increased with increasing noise level.

Stray Light. A high attenuation occurred due to the borosilicate windows (>97%) or quartz windows (>92%). Therefore, a high percentage of the detected light did not interact with the sample but was scattered back by other optical inhomogeneities within the path of the light. The spectra of the empty and water-filled repository, which were used to approximate this light, were similar to each other and changed little during the experiments. The laboratory spectra of the same sample material revealed different overall intensities due to the varying amount of material (Figure 6). Hence, their spectral features represented a different percentage of the spectra due to the stray light. The spectra were biased after mean normalization. Since the stray light did not have a constant value over the wavelengths, the effect could not be fully compensated by MSC. Application of the proposed stray light correction strongly reduced the difference between the spectra (Figure 6). Accordingly, the root-mean-square error of prediction (RMSEP) of water/ethanol content of calcium carbonate process test samples improved on average by 12% (relative value) after MSC and by 17% (relative value) after mean normalization. The proposed procedure using the spectrum of a water-filled sample container or process area is convenient to perform. It is expected to be robust, since defined mathematical operations and spectra with low spectral noise are used.

Process Measurements in Suspension Dryer. Facilitated by the results of the laboratory experiments, a combined adjustment of the measurement variables could be performed to optimize process monitoring in the suspension dryer. Despite the challenging requirements of the application, the combination (see the Experimental Section) allowed a low prediction error to be achieved due to a small effect of the spectral uncertainty. At the same time, rotation velocities of the filter drum of up to 120 rpm, a high measuring frequency, and the usage of commercial equipment were feasible. Light intensity could have been increased only at higher costs for another light source or another process window.

Table 4. Test Measurements in the Suspension Dryer(a)

CaCO3 (MSC, 3 PCs, 1 outlier); samples: 23 (0-6%); no. of drying runs: 12
ethanol: laboratory calibration RMSEP 0.48%, corr 0.96; process calibration RMSECV 0.37%, corr 0.94
water: laboratory calibration RMSEP 0.43%, corr 0.98; process calibration RMSECV 0.42%, corr 0.97

lactose (EMSC, 2 PCs); samples: 25 (0-23%); no. of drying runs: 8
ethanol: laboratory calibration RMSEP 2.11%, corr 0.93; process calibration RMSECV 2.43%, corr 0.91

(a) Prediction error and correlation coefficient (corr) of test measurements at 14 ms illumination time using a calibration with samples of laboratory or process measurements.

Figure 7. Predicted vs reference liquid content of calcium carbonate test samples after MSC: Spectra were collected at 14 ms illumination time. Error bars indicate the deviation, which is calculated as a function of the sample’s leverage, its x-residual variance, and the global model error.

Figure 8. Monitoring of a drying process of calcium carbonate with a mixture of water and ethanol. Error bars indicate the deviation, which is calculated as a function of the sample’s leverage, its x-residual variance, and the global model error.

The water and ethanol content of the test measurements during several drying runs could reproducibly be predicted with low prediction error (Table 4). For the prediction of the water and ethanol content of calcium carbonate test samples, the best results were obtained after MSC (wavelengths 1373-1631, 1675-1786, and above 1843 nm were omitted) as the data pretreatment. This was attributed to a stronger ability of MSC for compensating physical effects compared to normalization. For prediction of the ethanol content of lactose, the best results were obtained after extended MSC. Regression models based on process samples performed better than laboratory calibrations only after differentiation and normalization. Similar prediction errors were observed after MSC, with one exception, namely, the prediction of the ethanol content of calcium carbonate. In this case, the higher predicted liquid content of most samples compared to the reference values (Figure 7) can be explained by a high drying rate of ethanol and a long duration for the sample collection resulting in biased reference values. Hence, in association with similar arrangements for the laboratory and process measurements and the usage of a data pretreatment with a strong compensation for physical effects, the laboratory calibration could be used including a less laborious and more precise sampling. Drying processes in the rotating suspension dryer were monitored successfully (Figure 8; additional data has been published8). The inline measurements could be used to control the process and to improve process understanding.
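The figures of merit used in Table 4 and in the figure captions can be written down compactly. The sketch below uses standard textbook definitions (RMSEP, bias, Pearson correlation coefficient); whether the evaluation software applies exactly these divisors is an assumption. The function relative_rmse_uncertainty uses the common large-sample approximation 1/sqrt(2n), which is stated here as an assumption and appears consistent with the values quoted in the captions of Figures 3-5 (27, 13, and 14 samples giving 14%, 20%, and 19%). The example numbers are synthetic.

```python
import math


def rmsep(predicted, reference):
    """Root-mean-square error of prediction."""
    n = len(predicted)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)


def bias(predicted, reference):
    """Mean difference; positive if predictions exceed the reference values."""
    return sum(p - r for p, r in zip(predicted, reference)) / len(predicted)


def pearson(predicted, reference):
    """Pearson correlation coefficient between predictions and references."""
    n = len(predicted)
    mp = sum(predicted) / n
    mr = sum(reference) / n
    cov = sum((p - mp) * (r - mr) for p, r in zip(predicted, reference))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    sr = math.sqrt(sum((r - mr) ** 2 for r in reference))
    return cov / (sp * sr)


def relative_rmse_uncertainty(n):
    """Approximate relative standard uncertainty of an RMSE from n samples."""
    return 1.0 / math.sqrt(2.0 * n)


# Synthetic liquid contents in percent (hypothetical).
reference = [0.5, 1.8, 3.1, 4.4, 5.7]
predicted = [0.9, 2.0, 3.6, 4.5, 6.2]

print(rmsep(predicted, reference), bias(predicted, reference),
      pearson(predicted, reference))
for n in (27, 13, 14):
    print(n, round(100 * relative_rmse_uncertainty(n)))
```

A positive bias, as in this synthetic example, corresponds to the situation described for ethanol on calcium carbonate, where most predicted values lay above the reference values.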

■ CONCLUSIONS

The study provided experimental proof that the influence of the spectral uncertainty at low SNR on the prediction error of NIR measurements can be dominant. An increase in the integration time, the light intensity, and the number of averaged spectra strongly reduced the prediction error down to a value where it leveled off. The spectral noise characteristics revealed that, as long as the prediction error was above this threshold value, the light intensity and the integration time were both means for optimizing the prediction error. Once the threshold value was reached, increasing these variables led only to a marginal improvement, since other sources of uncertainty were dominant. The knowledge gained about the interactions of the different measurement variables and their effect on the prediction error can facilitate their combined adjustment in order to find an advantageous compromise between the analytical aspects of an application, related costs, and process modifications. This has been shown successfully during inline monitoring of the liquid content of two model substances in a rotating suspension dryer. Despite challenging conditions, a high measuring frequency and a low prediction error could be obtained by a combined adjustment of the measurement parameters, which resulted in a minor effect of the spectral uncertainty. Disadvantages for the application, such as a lower rotation velocity of the suspension dryer, could be avoided. Thus, an overall optimization of the application was achieved. The presented approach may be transferred to other applications. Furthermore, the experimental results can facilitate the understanding and improvement of the prediction accuracy of multivariate models.
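The leveling-off behavior summarized above can be pictured with a simple quadrature model. This is a hedged illustration and not the equation derived in the paper: it assumes uncorrelated spectral noise, so that averaging n spectra reduces the spectral contribution by a factor of √n, plus a fixed contribution from all other sources of uncertainty. The function name and both sigma values are hypothetical.

```python
import numpy as np

def expected_error(n_avg, sigma_spectral, sigma_other):
    """Quadrature sum of an averaged spectral-noise term and a fixed term."""
    n_avg = np.asarray(n_avg, dtype=float)
    return np.sqrt(sigma_spectral ** 2 / n_avg + sigma_other ** 2)

sigma_spectral = 2.0  # hypothetical error contribution of a single noisy spectrum
sigma_other = 0.5     # hypothetical contribution of reference method, sampling, etc.

n = np.array([1, 4, 16, 64, 256, 1024])
errs = expected_error(n, sigma_spectral, sigma_other)

for n_i, e_i in zip(n, errs):
    print(f"n = {n_i:5d}  expected error = {e_i:.3f}")
```

Under these assumptions the error drops steeply for the first few averaged spectra and then approaches `sigma_other`, so further averaging, and with it a lower measuring frequency, buys almost nothing once the spectral term is no longer dominant.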

■ AUTHOR INFORMATION

Corresponding Author
*E-mail: [email protected]. Fax: +49(6221)545971.

■ ACKNOWLEDGMENT

We acknowledge the collaboration with FIMA Maschinenbau GmbH (Obersontheim, Germany) and the kind support of Solvias AG (Basel, Switzerland) and Carl Zeiss MicroImaging GmbH (Jena, Germany). We thank the AiF for financial support.

■ REFERENCES

(1) Workman, J. J. Appl. Spectrosc. Rev. 1999, 34, 1–89.
(2) Chalmers, J. M., Ed. Spectroscopy in Process Analysis; Sheffield Academic Press Ltd.: Sheffield, England, 2000.
(3) Roggo, Y.; Chalus, P.; Maurer, L.; Lema-Martinez, C.; Edmond, A.; Jent, N. J. Pharm. Biomed. Anal. 2007, 44, 683–700.
(4) Ciurczak, E. W.; Drennen, J. K., III. Pharmaceutical and Medical Applications of Near-Infrared Spectroscopy; Marcel Dekker: New York, 2002.
(5) Reich, G. Adv. Drug Delivery Rev. 2005, 57, 1109–1143.
(6) Walsh, K. B.; Golic, M.; Greensill, C. V. J. Near Infrared Spectrosc. 2004, 12, 141–148.
(7) Subedi, P.; Walsh, K. Potato Res. 2009, 52, 67–77.
(8) Schneider, H.; Reich, G. Pharm. Ind. 2011, in press.
(9) Nadler, B.; Coifman, R. R. J. Chemom. 2005, 19, 107–118.
(10) Olivieri, A. C.; Faber, N. M.; Ferre, J.; Boque, R.; Kalivas, J. H.; Mark, H. Pure Appl. Chem. 2006, 78, 633–661.
(11) Faber, K.; Kowalski, B. R. J. Chemom. 1997, 11, 181–238.
(12) Leger, M. N.; Vega-Montoto, L.; Wentzell, P. D. Chemom. Intell. Lab. Syst. 2005, 77, 181–205.
(13) Wolthuis, R.; Tjiang, G. C. H.; Puppels, G. J.; Schut, T. C. B. J. Raman Spectrosc. 2006, 37, 447–466.
(14) Bhatt, N. P.; Mitna, A.; Narasimhan, S. Chemom. Intell. Lab. Syst. 2007, 85, 70–81.
(15) Wentzell, P. D.; Vega Montoto, L. Chemom. Intell. Lab. Syst. 2003, 65, 257–279.
(16) Lu, J.; McClure, W. F.; Barton, F. E., II; Himmelsbach, D. S. J. Near Infrared Spectrosc. 1998, 6, 77–87.
(17) Greensill, C. V.; Walsh, K. B. Appl. Spectrosc. 2000, 54, 426–430.
(18) Hu, Y.; Li, B.; Wang, H.; Osaki, K.; Ozaki, Y. J. Near Infrared Spectrosc. 2006, 14, 103–110.
(19) Faber, N. M.; Duewer, D. L.; Choquette, S. J.; Green, T. L.; Chesler, S. N. Anal. Chem. 1998, 70, 2972–2982.
(20) Siesler, H. W., Ozaki, Y., Kawata, S., Heise, M. H., Eds. Near-Infrared Spectroscopy: Principles, Instruments, Applications; Wiley-VCH: Weinheim, Germany, 2002.
(21) Boelens, H. F. M.; Kok, W. T.; de Noord, O. E.; Smilde, A. K. Anal. Chem. 2004, 76, 2656–2663.
(22) Faber, N. M. Anal. Chem. 1999, 71, 557–565.
(23) Blanco, M.; Coello, J.; Iturriaga, H.; Maspoch, S.; Pages, J. Chemom. Intell. Lab. Syst. 2000, 50, 75–82.

dx.doi.org/10.1021/ac103032w |Anal. Chem. 2011, 83, 2172–2178