Ind. Eng. Chem. Res. 2008, 47, 4917–4923


Modelling of the Batch Sucrose Crystallization Kinetics Using Artificial Neural Networks: Comparison with Conventional Regression Analysis

K. Vasanth Kumar,† P. Martins,‡ and F. Rocha*,†

Departamento de Engenharia Química, Faculdade de Engenharia, and Grupo de Estrutura Molecular, IBMC-Instituto de Biologia Molecular e Celular, Universidade do Porto, Portugal

A three-layer feed-forward artificial neural network (ANN) was constructed and tested to analyze the crystal growth rate of sucrose under different operating conditions. The operating variables studied (supersaturation, temperature, agitation speed, and seed crystal diameter) were used as inputs to predict the corresponding crystal growth rate. The constructed ANN was determined to be precise in modeling the crystal growth rate over the operating conditions studied. The network was also found to be precise in predicting the crystal growth rate for new input data that were withheld from training, showing its applicability to determine the growth rate for operating conditions of interest. The ANN-predicted crystal growth rates were compared to those from conventional nonlinear regression analysis. The ANN was observed to be more accurate in predicting the crystal growth rate, irrespective of the operating conditions studied. The correlation coefficients between the experimentally determined crystal growth rate and the crystal growth rates determined by the ANN and by multiple nonlinear regression (MNLR) were 0.999 and 0.748, respectively. The correlation coefficient between the experimentally determined crystal growth rates and those determined by the ANN for new inputs was >0.98.

1. Introduction

The kinetics of crystal growth from aqueous solution is very complex because of the multiple steps involved. Crystal growth proceeds by mass transfer of solute from the bulk solution to the film by diffusion, followed by the surface integration (reaction) step. The kinetics of the crystal growth process is usually modeled by a two-step crystal growth model or by a semi-empirical expression known as the overall growth kinetics.
The two-step crystal growth model was observed to represent the kinetics of some crystal growth processes very well. However, using this model to simulate the crystallizer for any of the operating conditions studied is not easy. Although the overall growth kinetics can be useful in simulating the crystal growth process, it is not possible to generalize a single expression that correlates all the operating variables involved, because the order of the reaction kinetics changes when the overall growth kinetic expression is used. In addition, the empirical overall growth kinetic model can lead to large errors, especially at lower supersaturation.

Artificial neural networks (ANNs) are currently regarded as an excellent option for solving such complex problems. ANNs have been successfully applied in many fields, including character recognition, speech recognition, image processing, and stock performance prediction.1 In chemical engineering, ANNs have been successfully applied to predict the adsorption equilibrium of solid–liquid systems,2 the activity coefficients of aromatic organic compounds,1 the water content of natural gas,3 and the solubility of proteins.4 Only a few studies have focused on using ANNs to predict the crystal growth kinetics from solution. Yang and Wei5 successfully applied an ANN to predict the crystallization kinetics and the agglomeration coefficient during the crystallization of ciprofloxacin. They further reported that the ANN model is better than the multiple linear regression technique. Noever et al.6 reported the applicability of the ANN technique to predict the (110) face growth rate of tetragonal lysozyme. Georgieva et al.7 applied a hybrid model that combined a partial mechanistic model with a neural network to an industrial-scale batch evaporative crystallization process in cane sugar refining. They found better agreement between the experimental data and the hybrid model predictions than that observed with the mechanistic model.

The growth of sucrose crystals in solution is a complex function of different operating variables such as the seed crystal diameter, agitation speed, temperature, and supersaturation. In the present investigation, the applicability of artificial neural networks in predicting the growth rate of pure sucrose crystals was studied, taking into account the effect of the operating variables of the crystallization system. ANNs are used to correlate the complex relationship between the input and output of any process, irrespective of the physical meaning of the system. ANNs consist of input and output layers connected by several nodes. In the present study, a multiple-layer feed-forward (back-propagation) network was constructed. A Levenberg–Marquardt optimization algorithm was used to train the ANN. The feed-forward ANN adjusts the transfer functions associated with the inputs and outputs until convergence is reached. The constructed network was tested with new inputs that were withheld from the network during training, to check its applicability in predicting the crystal growth rate.

* To whom correspondence should be addressed. Tel.: +351 22 508 1678. Fax: +351 22 508 1449.
† Departamento de Engenharia Química, Faculdade de Engenharia.
‡ Grupo de Estrutura Molecular, IBMC-Instituto de Biologia Molecular e Celular, Universidade do Porto.

2. Experimental Section

The growth of sucrose crystals was conducted in a 4 L batch agitated crystallizer at three different temperatures: 30, 40, and 50 °C.
The operating variables studied were the agitation speed, temperature, supersaturation, and seed crystal size. The crystallizer was connected to an online monitoring system for continuous monitoring of brix (defined as the mass percentage of dissolved solids in solution) and temperature, as shown in Figure 1. The temperature inside the crystallizer was maintained by a crystallizer jacket connected to a thermostatic water bath. Unless otherwise specified, the agitation speed inside the crystallizer was held constant at 250 rpm.

Figure 1. Schematic of the batch crystallizer. Legend: R, refractometer; s, seeds; A, agitator; and t, thermocouple.

Ind. Eng. Chem. Res., Vol. 47, No. 14, 2008. 10.1021/ie701706v CCC: $40.75 © 2008 American Chemical Society. Published on Web 06/11/2008

Sucrose solutions were prepared by dissolving sucrose crystals in ultrapure water at Tw + 20 °C, where Tw is the working temperature. Supersaturation was obtained by cooling the sucrose solution to the working temperature. All the experiments started at an initial supersaturation of 20 g of sucrose/100 g of water. After the crystallizer temperature was stable, an accurately weighed 16 g of sucrose seed crystals was added to the crystallizer. Unless otherwise specified, the crystal growth experiments were performed with seed crystals 0.05362 cm in mean size. To study the effect of seed crystal diameter on the rate of crystal growth, four size fractions (mean particle sizes of 0.01885, 0.05362, 0.05498, and 0.07088 cm) were used. The crystal growth experiments were performed for 24–72 h, depending on the working temperature. After 24 h, the solution reaches a supersaturation of 7 g of sucrose/100 g of water. The mass of the crystals inside the crystallizer at any time was calculated from a mass balance. The crystal growth kinetics was estimated from the change in mass of crystals (∆m) over a time interval ∆t. For any time interval ∆t, the rate of crystal growth (Rg) can be given by8,9

$$R_g = 3\,\frac{(\rho_c \alpha)^{2/3}}{N^{1/3}\beta}\,\frac{\Delta m^{1/3}}{\Delta t} \tag{1}$$

where α and β are the volume and surface area shape factors, N is the number of crystals, and ρc is the crystal density. The kinetic parameters were estimated based on the Rg values corresponding to the supersaturation changing from 20 g of sucrose/100 g of water to 7 g of sucrose/100 g of water. The physical constants of sucrose crystals used in eq 1 were given by9,10

$$\frac{\beta}{(\rho_c \alpha)^{2/3}} = 4.4856\ \mathrm{cm^2\,g^{-2/3}} \tag{2}$$

The number of growing crystals was assumed to be constant and was calculated from the mass of seeds introduced into the crystallizer (m0) and their mean size (L0):10

$$N = \frac{m_0}{0.75\,\rho_c L_0^3} \tag{3}$$
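As a rough illustration of how eqs 1–3 combine, the sketch below computes the number of seed crystals from eq 3 and then a growth rate from two crystal-mass readings via eq 1, using the lumped constant of eq 2. The crystal density (1.588 g/cm3 for sucrose), the function names, and the mass readings are assumptions made for illustration and are not taken from the paper. The term ∆m^(1/3) is read here as the change in m^(1/3) over the interval, which is also an assumption.

```python
# Sketch of eqs 1-3; density and sample masses are illustrative assumptions.
BETA_OVER_RHO_ALPHA_23 = 4.4856  # beta / (rho_c * alpha)^(2/3), cm^2 g^(-2/3), eq 2
RHO_C = 1.588                    # g/cm^3, assumed sucrose crystal density


def number_of_crystals(seed_mass_g, seed_size_cm, rho_c=RHO_C):
    """Eq 3: N = m0 / (0.75 * rho_c * L0^3)."""
    return seed_mass_g / (0.75 * rho_c * seed_size_cm ** 3)


def growth_rate(m_start_g, m_end_g, dt, n_crystals):
    """Eq 1, with Delta m^(1/3) read as m_end^(1/3) - m_start^(1/3)."""
    delta_cbrt_mass = m_end_g ** (1 / 3) - m_start_g ** (1 / 3)
    return 3.0 * delta_cbrt_mass / (
        BETA_OVER_RHO_ALPHA_23 * n_crystals ** (1 / 3) * dt
    )


# Seed charge from the experimental section: 16 g of 0.05362 cm seeds.
n = number_of_crystals(16.0, 0.05362)
# Hypothetical mass readings taken 600 s apart (not measured values).
rg = growth_rate(16.0, 16.5, 600.0, n)
print(f"N = {n:.0f} crystals, Rg = {rg:.3e} g cm^-2 s^-1")
```

With the time interval in seconds and masses in grams, the result is in g cm^-2 s^-1. Substituting eq 2 into eq 1 means the individual shape factors are never needed, only their lumped ratio.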

Table 1. Range of Operating Variables Used To Train the Network

operating variable                                    range
supersaturation, ∆c (g of sucrose/100 g of water)     20 to 7
seed crystal diameter (cm)                            0.01885, 0.05362, 0.05498, and 0.07088
agitation speed (rpm)                                 400, 350, 300, 250, 200, and 150
temperature (°C)                                      30, 40, and 50

The range of operating variables studied in the present investigation to train and test the ANN is given in Table 1.

3. Regression Analysis

In the literature, the operating variables that affect the crystal growth rate were usually modeled using empirical expressions determined by conventional regression analysis.5,6 However, the error between the crystal growth rate calculated from the empirical correlation and the crystal growth rate from the experiments was very high.5,6 In the case of the crystallization of ciprofloxacin, the error between the crystal growth rate from the experiments and that from the regressed empirical expression varied by 20%.5 The correlation coefficient between the experimentally determined face growth rate of lysozyme and the empirical expression was 0.76.6 In the present study, the experimental crystal growth data were correlated with the input variables using conventional multiple nonlinear regression (MNLR) analysis. For MNLR, a trial-and-error procedure suited to computer operation was used to determine the empirical expression that correlated the crystal growth rate with the operating variables studied. This was done by minimizing the error distribution between the experimental crystal growth rate data and the crystal growth rate determined from the empirical expression. The error distribution was minimized by maximizing the coefficient of determination (r2), using the Solver add-in of the Microsoft Excel spreadsheet program. This coefficient was defined as

$$r^2 = \frac{\sum \left(R_{g,\mathrm{empirical\ expression}} - \overline{R}_{g,\mathrm{experimental}}\right)^2}{\sum \left[\left(R_{g,\mathrm{empirical\ expression}} - \overline{R}_{g,\mathrm{experimental}}\right)^2 + \left(R_{g,\mathrm{empirical\ expression}} - R_{g,\mathrm{experimental}}\right)^2\right]} \tag{4}$$

where the overbar denotes the mean of the experimental values.

Figure 2 shows the crystal growth rate determined from the experiments plotted against the crystal growth rate determined by the trial-and-error MNLR analysis. The data in Figure 2 refer only to the dataset that is used to train the network. From Figure 2, it can be observed that the MNLR predicts the crystal growth rate of sucrose very poorly, with an r2 value of 0.748. The data predicted by MNLR analysis in Figure 2 fit eq 5:


$$R_g = 1.85 \times 10^{-7}\,\Delta c^{2.09}\,V^{0.8388}\,d_p^{0.3323}\,\exp\left(\frac{9674.4}{RT}\right) \tag{5}$$

where ∆c is the supersaturation, V the agitation speed, dp the initial seed size, R the gas constant, and T the absolute temperature. The low r2 value of 0.748 obtained by MNLR suggests that this approach would be inaccurate in predicting the growth rate of sucrose crystals. Thus, in an attempt to improve the correlation between the operating variables and the growth of sucrose crystals, a new ANN model was constructed, and it was trained and tested using the data obtained in crystal growth experiments for different operating variables.

Figure 2. Parity plot between the experimentally obtained crystal growth rate and crystal growth rate predicted by conventional regression analysis.

4. Neural Network Modeling

In the present study, a multiple-layer feed-forward network with four input neurons, one hidden layer, and one output layer was constructed initially, to simulate the growth of sucrose crystals. Multiple-layer feed-forward networks allow signals to flow in only one direction (i.e., from input to output). They can perform any linear or nonlinear computation and can approximate any function reasonably well for the given input conditions. The detailed structure of the network and the training strategy of the constructed neural network are shown in Figures 3 and 4, respectively, where P1 is the input vector to the hidden layer, and W1 and b1 represent the weight and bias of the hidden layer. The information from the hidden layer is transferred to the output layer, as shown in Figure 3. The term P2 represents the output vector and can be determined from the weight W2 and bias b2 of the output layer. In the present study, a tansig function and a purelin function were used as the propagation functions in the hidden layer and in the output layer, respectively.

Figure 3. Structure of the constructed two-layer network and the flow of information within the network.

The training strategy of the network is shown in Figure 4. As shown in Figure 4, the input vectors and the corresponding output vectors are used to train the network until it approximates the propagation function. The proposed network with a tansigmoid hidden layer and a linear output layer was determined to be capable of approximating the crystal growth process. Thus, the biases and the weights can be obtained from the training procedure, which is based on the experimental data. In the present study, the supersaturation, temperature, agitation speed, and initial seed crystal diameter were used as input vectors, whereas the growth rate was defined as the output vector. The neural network toolbox Version 4 of MATLAB (Mathworks, Inc.) was used for simulation. (A text file that contains the Matlab data is given in the Supporting Information.) The experimental conditions and the corresponding experimentally determined crystal growth rates were set as the input and the target vectors, respectively. The input vectors (experimental conditions) and the target vector were normalized before the training process, such that they fall in the interval of 0–1, so that their standard deviation and mean will be below the value of 1. The neural network was trained in batch mode, using the Levenberg–Marquardt training strategy. The incorporation of Marquardt's algorithm into the back-propagation algorithm to train feed-forward networks is explained elsewhere.11 The training of neural networks by the Levenberg–Marquardt algorithm is sensitive to the number of neurons in the hidden layer. The greater the number of neurons, the better the performance of the neural network in fitting the data. During the training process, the number of neurons in the hidden layer was changed while optimizing the transfer function for the given input and output vectors.
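The network described above (four inputs, a tansig hidden layer, and a purelin output) can be sketched in a few lines. The example below is only an illustration in Python, not the MATLAB toolbox code used in the study: the weights are random placeholders rather than trained values, the input ranges are taken from Table 1, the hidden-layer size of 10 is used purely as an example, and the helper names are hypothetical. MATLAB's tansig is mathematically equal to the hyperbolic tangent, so math.tanh is used here.

```python
import math
import random

# Input ranges (min, max) from Table 1: supersaturation, temperature,
# agitation speed, seed crystal diameter.
RANGES = [(7.0, 20.0), (30.0, 50.0), (150.0, 400.0), (0.01885, 0.07088)]


def normalize(x, ranges=RANGES):
    """Min-max scale each input to the interval 0-1."""
    return [(v - lo) / (hi - lo) for v, (lo, hi) in zip(x, ranges)]


def forward(p1, w1, b1, w2, b2):
    """Hidden layer: tanh(W1 p1 + b1); output layer: linear W2 a1 + b2."""
    a1 = [math.tanh(sum(w * p for w, p in zip(row, p1)) + b)
          for row, b in zip(w1, b1)]
    return sum(w * a for w, a in zip(w2, a1)) + b2


# Placeholder (untrained) weights; the hidden-layer size is an example value.
rng = random.Random(0)
n_in, n_hidden = 4, 10
w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b2 = rng.uniform(-1, 1)

# One operating point: 20 g/100 g water, 40 degC, 250 rpm, 0.05362 cm seeds.
p1 = normalize([20.0, 40.0, 250.0, 0.05362])
print(p1, forward(p1, w1, b1, w2, b2))
```

Because the weights are untrained, the printed output has no physical meaning; training would replace w1, b1, w2, and b2 with values fitted to the normalized growth-rate targets.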
The main problem encountered when optimizing the propagation function during training is overfitting. Overfitting mainly occurs when there are too many neurons in the hidden layer. It can be detected by large deviations between the experimental and the ANN-predicted crystal growth rates for new input data. Overfitting refers to exceeding some optimal ANN size, which may ultimately reduce the performance of the ANN in predicting the targets. To avoid overfitting, the ANN was trained using the learning data set with different numbers of neurons in the hidden layer, starting from a minimum of one neuron. Several trials were made, increasing the number of neurons in the hidden layer, to find a network good enough to determine the targets that correspond to the learning and testing data sets. Several trials were needed because no method is available to determine the size of an ANN for a specific application. In the present study, the ANN with the minimum number of hidden-layer neurons that predicts the Rg values of the testing and training data sets very well is taken as the optimum network size.

Another problem encountered when optimizing the propagation function during training is overtraining. Overtraining refers to training the ANN for so long that its performance in predicting the targets is reduced. Overtraining will sometimes lead to poor prediction of the targets, because the network memorizes the training examples but does not generalize to new experimental conditions. The process of improving the generalization is called regularization. The regularization step modifies the performance function of the network and reduces noise, thus avoiding the problems of overtraining and overfitting.

Figure 4. Training strategy of the constructed feed-forward artificial neural network.

In the present study, to avoid the problems due to overfitting, a Bayesian regularization technique, in combination with the Levenberg–Marquardt algorithm, was used during the ANN training process. The Bayesian algorithm works best when the network's input and output are scaled within the range of –1 to +1.12 This method stops the training process automatically when the algorithm has truly converged. The Bayesian regularization provides a measure of how many weights and biases are effectively used by the network. The Bayesian algorithm thus determines how many network parameters are effectively used, thereby eliminating the guesswork required in selecting the optimum network size.

In the hidden layer, three types of transfer functions (namely, the exponential sigmoid, tangent sigmoid, and linear functions) were initially tested while training the neural network. The linear function was used at the output layer. A tansigmoid function in the hidden layer and a linear function in the output layer were observed to be excellent in predicting the growth rate of sucrose crystals, irrespective of the initial operating variable conditions. After several trials, the neural network with 10 neurons in the hidden layer was determined to be excellent in representing the sucrose crystal growth rate for the range of operating variables studied. Increasing the number of neurons to >10 does not affect the performance of the ANN in predicting the crystal growth kinetics of sucrose crystals. For ANNs with more than 10 neurons in the hidden layer, the Bayesian algorithm automatically uses the same number of weights and biases, irrespective of the total number of network parameters. Thus, the effective number of weights and biases in the constructed ANN is constant, irrespective of the size of the ANN, for ANNs greater than the 4-10-1 network.
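To make the network size concrete, the weights and biases of a fully connected network with one hidden layer can be counted directly. The short sketch below (the function name is hypothetical) counts them for a 4-h-1 architecture; for the 4-10-1 network it gives 4·10 + 10 + 10·1 + 1 = 61.

```python
def count_parameters(n_inputs, n_hidden, n_outputs):
    """Weights plus biases of a fully connected one-hidden-layer network."""
    hidden = n_inputs * n_hidden + n_hidden      # W1 and b1
    output = n_hidden * n_outputs + n_outputs    # W2 and b2
    return hidden + output


# Total parameter count for a few hidden-layer sizes.
for h in (1, 5, 10, 15, 20):
    print(f"4-{h}-1 network: {count_parameters(4, h, 1)} parameters")
```

This is the total parameter count only; the effective number of parameters reported by Bayesian regularization can be smaller.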
The present network thus uses 61 network parameters to fit 114 data points to simulate the growth kinetics of sucrose crystals for the wide range of operating variables. Although the number of parameters in the ANN seems large, it is smaller than the total number of data points in the training set. The Bayesian algorithm works well when the number of parameters involved in the ANN is kept lower than the number of data points. The effective number of parameters in the ANN remains more or less the same, because the Bayesian algorithm is independent of the number of degrees of freedom of the ANN. Furthermore, the Bayesian algorithm will always use a constant number of weights and biases for a specific application. However, guesswork is still required to determine the minimum optimum size of the ANN for the Bayesian algorithm to train the network effectively. In the case of sucrose crystallization kinetics, ANNs with >61 network parameters do not affect the performance of the ANN in predicting the targets. However, the ANNs with 0.98) for the wide range of the experimental conditions. Figure 6b shows that the ANN was successful in simulating the growth rate of sucrose crystals for the wide range of supersaturation under the range of experimental conditions used during the training process. This shows that the developed neural network model can be precise in predicting the sucrose crystallization kinetics for any experimental conditions within the studied range.

The kinetics of sucrose crystallization is complex. It is expected to be diffusion-controlled at higher temperatures and surface-integration-controlled at lower temperatures. In addition, the present investigation and some of our previous works8,9 suggested a strong influence of operating variables such as the temperature, supersaturation, agitation speed, and seed diameter on the crystallization kinetics. Furthermore, it would be highly complicated to propose a generalized expression correlating the operating variables involved in the system with the crystal growth rate. In the present investigation, the trained neural network model was determined to be highly precise in predicting the growth rate of the sucrose crystallization process, showing its advantage over the traditional multidimensional regression approach. Although the constructed network was trained for only a few operating conditions, it is always possible to introduce new inputs to train the network. Thus, the constructed network can be retrained at any time, whenever new experimental data are available. The intent of future work is to predict the crystallization process of sucrose in the presence of impurities and additives. We also plan to extend this idea to other crystallization systems.

5. Conclusion

The application of an artificial neural network (ANN) to predicting the growth rate of sucrose crystals in pure solutions was discussed. A back-propagation ANN with a hidden layer using a hyperbolic tangent propagation function and with a linear output layer was developed and trained to model the growth kinetics of sucrose crystals. The ANN trained with extensive experimental data was determined to be successful in representing the experimental growth rate of sucrose crystals, irrespective of the operating conditions studied. The coefficient of determination (r2) between the experimentally determined crystal growth rate and the crystal growth rate determined by the neural network was determined to be 0.999 (∼1). The constructed ANN was successful in simulating the growth rate of sucrose crystals for new data.
The coefficient of determination between the experimentally determined growth rate of sucrose crystals and the crystal growth rate determined by the trained neural network for a wide range of experimental conditions was observed to be >0.98. Compared to conventional multiple nonlinear regression analysis, the ANN was determined to be highly precise in representing the kinetics of the batch sucrose crystallization process and to be a reliable method for predicting the growth of sucrose crystals. The sucrose crystallization kinetics is limited by diffusion or surface integration, depending on the temperature. Regardless of the limiting step, the ANN successfully simulates the kinetics of the sucrose crystallization process at the studied temperatures.

Supporting Information Available: The Matlab file of the constructed ANN. (TXT file.) This material is available free of charge via the Internet at http://pubs.acs.org.

Literature Cited

(1) Chow, H.; Chen, H.; Ng, T.; Myrdal, P.; Yalkowsky, S. H. Using back propagation networks for the estimation of aqueous activity coefficients of aromatic compounds. J. Chem. Inf. Comput. Sci. 1995, 35, 723–728.
(2) Aber, S.; Daneshvar, N.; Soroureddin, S. M.; Chabk, A.; Zeynali, K. S. Study of acid orange 7 removal from aqueous solutions by powdered activated carbon and modeling of experimental results by artificial neural network. Desalination 2007, 211, 87–95.
(3) Mohammadi, A. H.; Richon, D. Use of artificial networks for estimating the water content of natural gases. Ind. Eng. Chem. Res. 2007, 46, 1431–1438.
(4) Naik, A. D.; Bhagwat, S. S. Optimization of an artificial neural network for modelling protein solubility. J. Chem. Eng. Data 2005, 50, 460–467.
(5) Yang, M.; Wei, H. Application of neural network for the prediction of crystallization kinetics. Ind. Eng. Chem. Res. 2006, 45, 70–75.
(6) Noever, D.; Pusey, M. L.; Forsythe, E.; Baskaran, S. Artificial neural network prediction of tetragonal lysozyme face growth rates. J. Cryst. Growth 1996, 167, 221–236.
(7) Georgieva, P.; Meireles, M. J.; de Azevedo, S. Knowledge-based hybrid modelling of a batch crystallisation when accounting for nucleation, growth and agglomeration phenomena. Chem. Eng. Sci. 2003, 58, 3699–3713.
(8) Guimaraes, L.; Sa, S.; Bento, L. S. M.; Rocha, F. Investigation of crystal growth in a laboratory fluidized bed. Int. Sugar J. 1995, 97, 199–204.
(9) Martins, P. M.; Rocha, F. The role of diffusional resistance on crystal growth: Interpretation of dissolution and growth data. Chem. Eng. Sci. 2006, 61, 5686–5695.
(10) Bubnik, Z.; Kadlec, P. Sucrose crystal shape factors. Zuckerindustrie (Berlin, Ger.) 1992, 117, 345–350.
(11) Hagan, M. T.; Menhaj, M. B. Training feedforward networks with the Marquardt algorithm. IEEE Trans. Neural Networks 1994, 5 (6), 989–993.
(12) Neural Network Toolbox, User Guide; The MathWorks, Inc.: Natick, MA, 2000.
(13) Tetko, I. V.; Livingstone, D. J.; Luik, A. I. Neural network studies. 1. Comparison of overfitting and overtraining. J. Chem. Inf. Comput. Sci. 1995, 35, 826–833.
(14) Zhang, X.; Zhang, S.; He, X. Prediction of solubility of lysozyme-NaCl-H2O system with artificial neural network. J. Cryst. Growth 2004, 264, 409–416.

Received for review December 14, 2007
Revised manuscript received March 5, 2008
Accepted March 6, 2008
IE701706V