Ind. Eng. Chem. Res. 2007, 46, 8607-8613
Data-Driven Dynamic Modeling and Control of a Surface Aeration System

Ankit B. Gandhi,† Jyeshtharaj B. Joshi,† Valadi K. Jayaraman,‡ and Bhaskar D. Kulkarni*,‡

Institute of Chemical Technology, Matunga, Mumbai-400 019, India, and Chemical Engineering & Process Development Division, National Chemical Laboratory, Pune-411 008, India
In this study we have developed a support vector regression (SVR) based data-driven model for predicting two important design parameters of surface aerators, namely, the volumetric mass transfer coefficient (kLa) and the fractional gas hold-up (εG). The dynamical state of the surface aerator system was captured by acquiring pressure fluctuation signals (PFSs) at various design and operating conditions. The most informative features of the PFS were extracted using chaos analysis, which includes estimation of the largest Lyapunov exponent, the correlation dimension, and the Kolmogorov entropy. At the same conditions, the values of kLa and εG were also measured. Two SVR models predicting the volumetric mass transfer coefficient (kLa) and the overall gas hold-up (εG) as functions of the chaotic invariants and the design and operating parameters were developed, showing test accuracies of 98.8% and 97.1%, respectively. Such SVR based models for the surface aerator can be potentially useful on a commercial scale for online monitoring and control of the desired process output variables.

1. Introduction

Data-driven modeling (DDM) employing artificial intelligence and machine learning methods is finding increasing relevance and importance in chemically reacting systems.1-5 The goal of DDM is to build a system that can adapt and learn from practical data. In this study we have developed a data-driven model for predicting important design parameters of surface aerators. The surface aerator is an industrially important system that finds application mainly in wastewater treatment and in chemical reactions such as hydrogenation, alkylation, and so forth.6 Parameters such as the volumetric mass transfer coefficient kLa and the fractional gas hold-up εG are central to the design and to rating the performance of surface aerators.
During actual operation of such a reactor, it is desirable to have a model that can predict or help maintain kLa and εG values on the basis of the corresponding design and operating conditions. For this reason, the dynamical state of the reactor system was captured by acquiring pressure fluctuation signals (PFSs) at various design and operating conditions. Figure 1 shows a typical PFS at particular design and operating conditions. To avoid intensive computational work, the most informative features must be extracted from the PFS before building a data-driven model. In addition to frequency-domain and statistical approaches for analyzing PFSs, the concepts and techniques of chaos analysis have been successfully applied to extract information from PFSs.5,7-10 More specifically, the chaotic invariants used in the present analysis of the PFS are the largest Lyapunov exponent, the correlation dimension, and the Kolmogorov entropy. The acquired PFSs were shown to be chaotic by the positive value of the largest Lyapunov exponent (λ) and the finite, nonzero value (0 < K < ∞) of the Kolmogorov entropy (K).9 Such behavior of the PFS obtained from the surface aerator supports the dynamic modeling strategy. At the corresponding design and operating conditions, the values of kLa and εG were also measured.

Figure 1. PFSs of surface aerator for T = 1.5 m, D = 0.5 m, Nimp = 2 rps, and DT type of impeller.

An attempt has been made to generate a data-driven model by establishing a nonlinear relationship between the chaotic invariants, design parameters, operating parameters, and kLa and εG. In the past decade, several machine learning and artificial intelligence based methodologies1-5 have emerged as attractive tools for nonlinear modeling. DDM based on the recently developed support vector machines has been found to be very attractive in chemical process engineering applications.1,3 For surface aerators, application of the support vector regression (SVR) technique makes it possible to obtain a nonlinear relationship between the sensed measurements (process inputs) and the quantities of interest (process outputs), that is, (kLa, εG) = f(λ, K, CD, design and operating parameters). The nonlinear relationship thus obtained works as a model for any unseen combination of dynamic and geometric input values. Deviations of these values from ideality can be identified with such a model, enabling online corrective steps to maintain the performance of the reactor.

* To whom correspondence should be addressed. Tel: +91-205893095. Fax: +91-20-5893041. E-mail: [email protected]. † Institute of Chemical Technology. ‡ National Chemical Laboratory.

2. Experimental Section
The schematic of the experimental setup is shown in Figure 2. Experiments were carried out in flat-bottomed, cylindrical, acrylic tanks (diameters T = 0.5 and 1.5 m) fitted with four 10% strip baffles. The system fluids were air and tap water. The ratio of impeller clearance to tank diameter (C/T) was held constant at 0.58 throughout. Experiments were performed at three liquid levels (HL/T = 0.62, 0.66, and 0.70). Impeller speed was varied from 2 to 6 rps for the 0.5 m diameter tank and from 0 to 2 rps for the 1.5 m diameter tank. Three impeller designs, namely, pitched blade turbine up flow (PBTU), pitched blade turbine down flow (PBTD), and disc turbine (DT), were employed in both tanks. The impeller diameters were D = 0.15 m for the 0.5 m tank and D = 0.5 m for the 1.5 m tank.

10.1021/ie0700765 CCC: $37.00 © 2007 American Chemical Society. Published on Web 05/15/2007.

Figure 2. Schematic of the experimental setup: (a) surface aerator, (b) transmitter, (c) battery, (d) channel box, (e) A/D converter, and (f) personal computer.

To acquire the dynamic PFSs, a strain-gauge pressure sensor was mounted in line with the impeller, flush with the wall. A transmitter supplied the excitation voltage for the sensor and also amplified the signal from the sensor. The analog signals from the transmitter were digitized using an Advantech ADC card at a sampling rate of 2000 Hz; 40 000 data points were collected for each measurement over a period of 20 s. The pressure time series were acquired using LabVIEW software and LabVIEW-compatible Advantech 32-bit DLL drivers.

3. Chaotic Time-Series Analysis

A chaotic system is one that is sensitive to initial conditions yet predictable, to some extent, over a restricted time interval. Chaos theory rests on a few basic notions such as attractors, fractals, correlation dimensions, and Lyapunov exponents. Multiphase reactors exhibit nonlinear hydrodynamics as a result of the chaotic motion of bubbles and the highly turbulent interaction between phases, resulting in complex spatio-temporal flow structures. The central task is therefore to analyze the chaos in a complex time series so as to unfold the hidden chaotic attractor.

Because the chaotic attractor can be unfolded from a single pressure-time series, several chaotic applications such as prediction,7,10 monitoring and control,10 and scale-up5,11 have been developed and implemented for real-life chaotic systems. Here an attractor (phase space) represents the time evolution of a hydrodynamical system and constitutes a unique fingerprint of its dynamics. The attractor of a chaotic system is neither a fixed point nor a limit cycle; the trajectory never returns to the same state again. Such attractors are called strange attractors. Unfolding the time series into an attractor depends on proper selection of two parameters, namely, the time delay and the embedding dimension (the number of degrees of freedom of the system in the true state space).

3.1. Time Delay.
The first step involves reconstructing the attractor dynamics from a single time series p(n), n = 1, 2, ..., P (the number of recorded data points). Time-delayed phase-space reconstruction transforms the one-dimensional signal p(n) to an m-dimensional phase vector y(n) via

y(n) = (p(n), p(n − τ), p(n − 2τ), ..., p(n − (m − 1)τ))   (1)
where τ denotes the time delay and m is the embedding dimension. With proper selection of the time delay and the embedding dimension, the chaotic attractor is revealed in the m-dimensional phase space spanned by the m axes, which are the components of y(n) in eq 1. At the very beginning of the reconstruction process, the choice of the time delay is crucial: τ cannot be too small, or the axes become strongly correlated; on the other hand, τ cannot be too large, or information between the axes is lost. Here, the time delay was chosen as the first local minimum of the mutual information function.12

3.2. Embedding Dimension.
Even after a suitable time delay has been found, choosing an appropriate embedding dimension m remains a subtle decision: m must be large enough that the attractor is fully unfolded in the state space with no crossing of attractor orbits or trajectories. If m is too small, the delayed phase space cannot completely unfold the attractor, and false nearest neighbors occur; that is, two distant points behave like neighboring points because of the false projection. By sequentially increasing the embedding dimension and computing the corresponding percentage of false neighbors, the minimal embedding dimension is readily obtained.13 The detailed methodology for estimating the chaotic invariants, namely, the largest Lyapunov exponent, the correlation dimension, and the Kolmogorov entropy, from the strange attractor is described below.

3.3. Largest Lyapunov Exponent.
For a dynamical system, sensitivity to initial conditions, that is, chaos, is quantified by the Lyapunov exponents (bits/s), or the Lyapunov spectrum. For example, consider two trajectories of a chaotic attractor with nearby initial conditions on an attracting manifold. When the attractor is chaotic, the trajectories diverge, on average, at an exponential rate characterized by the largest (positive) Lyapunov exponent.
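In practice, the invariants below are computed from phase vectors reconstructed per eq 1. A minimal NumPy sketch of that reconstruction is shown here, with τ and m assumed already chosen by the criteria of sections 3.1 and 3.2 (an illustrative sketch, not the authors' code; it uses forward delays, which yield an attractor equivalent to the backward-delay convention of eq 1):

```python
import numpy as np

def delay_embed(p, m, tau):
    """Map a 1-D signal p(n) to m-dimensional phase vectors,
    one row per vector (p(n), p(n + tau), ..., p(n + (m - 1) * tau))."""
    p = np.asarray(p, dtype=float)
    n_vec = len(p) - (m - 1) * tau          # number of reconstructable vectors
    if n_vec <= 0:
        raise ValueError("signal too short for this (m, tau)")
    # Column j holds the signal shifted by j * tau samples
    return np.column_stack([p[j * tau : j * tau + n_vec] for j in range(m)])

# Example: embed a short periodic "pressure" record
t = np.linspace(0.0, 10.0, 2000)
signal = np.sin(2.0 * np.pi * t)
Y = delay_embed(signal, m=3, tau=25)
print(Y.shape)   # (1950, 3): 2000 - (3 - 1) * 25 phase vectors
```

For a periodic signal the resulting three-dimensional orbit is a closed loop; for a chaotic pressure record it traces out the strange attractor from which λ, CD, and K are estimated.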
λ_i = lim_{t→∞} (1/t) log [δ_i(t) / δ_i(0)]   (2)
In eq 2, δ_i(t) and δ_i(0) are the distances between the two nearby orbits along the principal axis i at time t and at t = 0, respectively. Although the motion of nearby trajectories on a strange attractor combines divergence and convergence, the overall dynamics is dissipative, that is, globally stable: the total rate of contraction outweighs the total rate of expansion.

3.4. Kolmogorov Entropy.
Systems with a strange attractor have limited predictability, and the degree of predictability varies from system to system. It can be quantified as follows. Consider two nearby points that lie on different orbits of the attractor. These points represent two states of the system, separated in time, with almost equal conditions. The rate at which the orbits separate expresses the degree of unpredictability and is quantified by the Kolmogorov entropy K (bits/s), a characteristic invariant proportional to the rate of separation of the two orbits. The sensitivity to small disturbances arises because the system creates information during its time evolution, making long-term prediction almost impossible. The limiting values of K are ∞ for a completely random system and 0 for a completely periodic system. The amount of information needed to specify the evolution in time of the system during the interval (t1, t2), given the information I(t1), is
I(t1, t2) = I(t1) + K(t2 − t1)   (3)
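The meaning of a positive largest Lyapunov exponent (eq 2) can be made concrete with a textbook example, not taken from the paper: for the fully chaotic logistic map x_{n+1} = 4x(1 − x), λ is the orbit average of ln|f′(x)| and is known analytically to equal ln 2 ≈ 0.693.

```python
import math

def logistic_lyapunov(x0=0.1234, n_transient=1000, n_iter=100_000):
    """Largest Lyapunov exponent of x -> 4x(1-x) as the orbit average
    of ln|f'(x)| = ln|4 - 8x|, a discrete-time analogue of eq 2."""
    x = x0
    for _ in range(n_transient):            # discard the transient
        x = 4.0 * x * (1.0 - x)
    acc = 0.0
    for _ in range(n_iter):
        acc += math.log(abs(4.0 - 8.0 * x))  # local stretching rate
        x = 4.0 * x * (1.0 - x)
    return acc / n_iter

lam = logistic_lyapunov()
print(lam)   # ≈ 0.693 = ln 2 > 0, i.e., chaotic
```

A positive value of λ computed this way plays the same role as the positive largest Lyapunov exponent extracted from the PFS attractor.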
Computation of the Lyapunov spectrum not only determines whether a system is chaotic but also provides other chaotic invariants. K is the sum of the positive Lyapunov exponents and is dominated by the largest positive exponent:

K = Σ_{i=1}^{k} λ_i   (4)
where k is the number of positive Lyapunov exponents.

3.5. Fractal Dimension (Correlation Dimension).
The other frequently used chaotic measure is the fractal dimension, which quantifies the complexity, or number of degrees of freedom, of a chaotic attractor. The fractal dimension is defined in several ways, owing to different viewpoints: the box-counting dimension, the Lyapunov dimension, and the correlation dimension are among the most frequently encountered, and fortunately they are extremely close to each other. The most popular of these, the correlation dimension CD, was therefore used in this study for its computational efficiency. The idea is to construct a function C(r), known as the correlation integral, giving the probability that two arbitrary points on the orbit of a strange attractor are closer together than a radius r:
C(r) = lim_{Np→∞} (1/Np²) Σ_{i,j=1}^{Np} H(r − |y_i(n) − y_j(n)|)   (5)

where H is the Heaviside function, returning unity if the distance between the two points is less than the hypersphere radius and zero otherwise, and r is the radius of an m-dimensional hypersphere centered on each sampled point of the trajectory, y_i(n), i = 1, 2, ..., Np. The correlation sum scales with the hypersphere radius according to a power law of the form

C(r) ∝ r^CD   (6)

where the exponent CD is the correlation dimension, defined as the slope of the curve of log C(r) versus log r. A noninteger correlation dimension indicates that the data are probably fractal. For too small or too large r the results are inaccurate, so the slope must be measured in the midrange of the curve.

4. SVR Based Modeling

SVR is a DDM tool based rigorously on statistical learning theory. This machine learning formalism is gaining popularity because of its many attractive features and promising empirical performance. The salient feature of SVR is that the inputs are nonlinearly mapped into a high-dimensional feature space with the aid of a kernel function, wherein they are correlated linearly with the output. Such linear regression in the high-dimensional feature space reduces the algorithmic complexity while enabling high predictive capability on both training and unseen test examples.

4.1. Mathematical Formulation.
Consider a training data set D = {(x1, y1), (x2, y2), ..., (xN, yN)}, where xi is a vector of input variables and yi ∈ R is the corresponding scalar output (target) value. In most practical cases the input data cannot be correlated linearly to the required output. One option in such a case is a nonlinear regression paradigm such as nonlinear least-squares regression or an ANN; however, as mentioned before, these techniques often suffer from being trapped in poor local minima, resulting in poor prediction accuracy. The SVR formulation overcomes this problem by first mapping the data into a high-dimensional feature space, F (x → φ(x)), and subsequently regressing linearly. The basic linear SVR function is

f(x, w) = (w, φ(x)) + b   (7)

where f is the regression function, w the weight vector, b a constant (bias), and (w, φ(x)) the dot product in the feature space F.

Figure 3. A schematic illustration of SVR using the ε-insensitive loss function, after Nandi et al.3

The objective here is to build an ε-SVR model16,17 to fit a regression function that accurately predicts the outputs corresponding to a new set of input examples. ε is a precision parameter representing the radius of the tube located around the regression function (Figure 3). The region enclosed by the tube is known as the "ε-insensitive" zone: the regression function to be established should deviate by at most ε (with some allowance) from the measured outputs yi and at the same time should be as flat as possible. This allowance is represented by slack variables ξi and ξi*, which take nonzero values outside the [−ε, +ε] band and measure the deviation (penalty, yi − f(xi)) from the boundaries of the ε-insensitive zone. To achieve a flat function (eq 7), one minimizes the weight vector w through its squared Euclidean norm ‖w‖², together with the slack-variable penalty. Mathematically,

min f(w, ξi(*)) = (1/2)‖w‖² + C Σ_{i=1}^{N} (ξi + ξi*)

s.t.
(w, φ(xi)) + b − yi ≤ ε + ξi*
yi − (w, φ(xi)) − b ≤ ε + ξi
ξi, ξi* ≥ 0   (8)
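The ε-insensitive loss driving eq 8 can be sketched directly (an illustrative sketch, not the LIBSVM internals): deviations inside the ε-tube cost nothing, and deviations beyond it are penalized linearly through the slack variables.

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps):
    """max(0, |y - f(x)| - eps): zero inside the eps-tube, linear outside."""
    return np.maximum(0.0, np.abs(y_true - y_pred) - eps)

def primal_objective(w, slack_penalties, C):
    """(1/2)||w||^2 + C * (sum of slack penalties), as in eq 8."""
    w = np.asarray(w, dtype=float)
    return 0.5 * float(w @ w) + C * float(np.sum(slack_penalties))

# Deviations of 0.05, 0.5, and 1.0 against a tube of radius eps = 0.1:
y = np.array([1.0, 2.0, 3.0])
f = np.array([1.05, 2.5, 2.0])
losses = eps_insensitive_loss(y, f, eps=0.1)
# Only the two points outside the tube are penalized (by 0.4 and 0.9)
obj = primal_objective([0.3, -0.4], losses, C=10.0)
```

The cost function C in `primal_objective` is the same tradeoff parameter discussed next: larger C punishes tube violations more heavily at the expense of flatness.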
Table 1. Process Data Used for Development of SVR Based kLa and εG Models

| expt no. | T (m) | impeller | D (m) | C (m) | HL (m) | Nimp (rps) | λ (bits/s) | CD | K (bits/s) | kLa (s⁻¹) | εG |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1-9 | 0.5 | PBTU | 0.15 | 0.29 | 0.31 | 2.0-6.0 | 0.15-0.33 | 2.5-4.2 | 87-378 | 0.001-0.024 | 0.001-0.025 |
| 10-18 | 0.5 | PBTU | 0.15 | 0.29 | 0.33 | 2.0-6.0 | 0.17-0.31 | 1.7-3.8 | 133-380 | 0.001-0.046 | 0.001-0.023 |
| 19-27 | 0.5 | PBTU | 0.15 | 0.29 | 0.35 | 2.0-6.0 | 0.10-0.26 | 2.1-3.4 | 112-297 | 0.001-0.048 | 0.001-0.016 |
| 28-36 | 0.5 | PBTD | 0.15 | 0.29 | 0.31 | 2.0-6.0 | 0.15-0.29 | 2.5-4.1 | 72-371 | 0.002-0.017 | 0.012-0.039 |
| 37-45 | 0.5 | PBTD | 0.15 | 0.29 | 0.33 | 2.0-6.0 | 0.09-0.31 | 2.5-4.2 | 104-320 | 0.001-0.022 | 0.011-0.056 |
| 46-54 | 0.5 | PBTD | 0.15 | 0.29 | 0.35 | 2.0-6.0 | 0.10-0.22 | 2.1-3.4 | 54-146 | 0.001-0.006 | 0.001-0.008 |
| 55-63 | 0.5 | DT | 0.15 | 0.29 | 0.31 | 2.0-6.0 | 0.18-0.39 | 1.8-3.9 | 93-428 | 0.002-0.016 | 0.004-0.039 |
| 64-72 | 0.5 | DT | 0.15 | 0.29 | 0.33 | 2.0-6.0 | 0.12-0.31 | 1.8-4.6 | 84-399 | 0.001-0.022 | 0.006-0.049 |
| 73-79 | 0.5 | DT | 0.15 | 0.29 | 0.35 | 2.0-6.0 | 0.18-0.27 | 1.9-3.4 | 180-326 | 0.001-0.021 | 0.008-0.035 |
| 80-86 | 1.5 | PBTU | 0.50 | 0.87 | 0.93 | 1.0-2.0 | 0.12-0.15 | 1.7-3.0 | 78-204 | 0.001-0.019 | 0.011-0.044 |
| 87-93 | 1.5 | PBTU | 0.50 | 0.87 | 0.97 | 1.0-2.0 | 0.11-0.15 | 1.4-2.8 | 74-175 | 0.001-0.014 | 0.011-0.042 |
| 94-100 | 1.5 | PBTU | 0.50 | 0.87 | 1.02 | 1.0-2.0 | 0.10-0.13 | 1.4-2.5 | 85-94 | 0.001-0.014 | 0.010-0.038 |
| 101-107 | 1.5 | PBTD | 0.50 | 0.87 | 0.93 | 1.0-2.0 | 0.11-0.13 | 1.8-2.3 | 76-162 | 0.001-0.011 | 0.009-0.044 |
| 108-114 | 1.5 | PBTD | 0.50 | 0.87 | 0.97 | 1.0-2.0 | 0.11-0.13 | 1.9-2.3 | 80-131 | 0.001-0.014 | 0.013-0.058 |
| 115-119 | 1.5 | PBTD | 0.50 | 0.87 | 1.02 | 1.3-2.0 | 0.11-0.12 | 1.7-2.3 | 72-101 | 0.001-0.006 | 0.022-0.037 |
| 120-126 | 1.5 | DT | 0.50 | 0.87 | 0.93 | 1.0-2.0 | 0.10-0.15 | 1.5-2.7 | 54-298 | 0.002-0.008 | 0.007-0.045 |
| 127-133 | 1.5 | DT | 0.50 | 0.87 | 0.97 | 1.0-2.0 | 0.10-0.14 | 1.8-3.5 | 65-274 | 0.001-0.015 | 0.016-0.055 |
| 134-140 | 1.5 | DT | 0.50 | 0.87 | 1.02 | 1.0-2.0 | 0.12-0.13 | 1.8-2.4 | 112-124 | 0.002-0.012 | 0.010-0.052 |

Table 2. Parameter Selection for SVR Based kLa and εG Models

| parameter | model for mass transfer coefficient | model for fractional gas hold-up |
|---|---|---|
| C | 10 | 35 |
| γ = 1/(2σ²) | 1.1 | 1.54 |
| ε | 0.0001 | 0.0001 |

This penalty is regulated with the aid of the cost function C, which sets a tradeoff between the flatness of f and the extent to which deviations larger than ε are tolerated. Figure 3 depicts the situation graphically: the points outside the tube contribute to the cost insofar as their deviations are penalized in a linear fashion. After certain algebraic manipulations, an equivalent dual optimization problem is obtained:

max L(αi, αi*) = Σ_{i=1}^{N} yi(αi − αi*) − ε Σ_{i=1}^{N} (αi + αi*) − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} (αi − αi*)(αj − αj*)(φ(xi), φ(xj))   (9)

s.t.
0 ≤ αi, αi* ≤ C
Σ_{i=1}^{N} (αi − αi*) = 0

It can be proved that only some of the coefficients αi − αi* are nonzero; the corresponding input vectors xi are called support vectors (SVs). The SVs can be thought of as the most informative data points, compressing the information content of the training set. Substituting the SVs xi and their corresponding nonzero coefficients gives the weight vector w as

w = Σ_{i=1}^{Nsv} (αi − αi*) φ(xi)   (10)

The transformation to the high-dimensional feature space enables us to convert a nonlinear regression problem into a linear one. However, because the dimensionality increases many fold, the problem becomes computationally intractable. This difficulty is overcome by defining an appropriate kernel function in place of the dot product of the input vectors in the high-dimensional feature space:

K(xi, xj) = (φ(xi), φ(xj))   (11)

The advantage of a kernel function is that the dot product in the feature space can be computed without actually mapping the vectors xi into the high-dimensional space. Thus, using a suitable kernel function, all the necessary computations are performed implicitly in the input space instead of the feature space, and the aforementioned quadratic programming (QP) problem (eq 9) can be solved in the input space itself to compute the unknown coefficients α and α*. The final regression model can therefore be defined with the aid of a relatively small number of input vectors, and the regression function (eq 7) can be written in its final form as14

f(x, α, α*) = (w, φ(x)) + b = Σ_{i=1}^{Nsv} (αi − αi*) K(x, xi) + b   (12)

The bias parameter b is computed by applying the Karush-Kuhn-Tucker (KKT) conditions, which state that at the optimal solution the product between the dual variables and the constraints must vanish:

b = yi − Σ_{j=1}^{Nsv} (αj − αj*) K(xi, xj) − ε   for αi ∈ (0, C)
b = yi − Σ_{j=1}^{Nsv} (αj − αj*) K(xi, xj) + ε   for αi* ∈ (0, C)   (13)

where xi and yi denote the ith SV and the corresponding target output. A kernel function must satisfy Mercer's condition,14,16,17 which states that any positive semidefinite, symmetric kernel function K can be expressed as a dot product in a high-dimensional space. Several choices of kernel function K exist, such as the linear, polynomial, and Gaussian radial basis function (RBF) kernels. The most commonly used is the Gaussian RBF, defined as14
K(xi, xj) = (φ(xi), φ(xj)) = exp(−|xi − xj|² / (2σ²)),   i, j = 1, ..., N   (14)

where σ denotes the width of the RBF.

4.2. Procedure for Estimating the Regression Function.
The final regression function is estimated as follows:
(i) Choose an appropriate training set of input and output features.
(ii) Select a suitable kernel function; in our case the RBF kernel was chosen.
(iii) Fix the values of the algorithm parameters C, σ, and ε (the best values are obtained by the k-fold cross-validation procedure explained in the next section). For the selected value of σ, the kernel function is computed from the training data set.
(iv) Solve the QP problem to obtain the nonzero Lagrange multipliers (weights, αi − αi*) and the corresponding SVs (see Supporting Information), which define the final regression function (eq 12).
(v) Use the final regression function (eq 12) to obtain the target output for any unknown input test vector, xtest.

5. Results and Discussion

Results are based on experiments performed in tanks of 0.5 and 1.5 m diameter. Besides impeller speed, the variation in kLa and εG for surface aerators is known to depend strongly on the impeller location below the liquid surface, that is, the impeller submergence. The effect of varying the impeller submergence on the chaotic invariants was therefore obtained for different impeller types. Table 1 shows the effect of impeller speed on the chaotic invariants for the 0.5 and 1.5 m diameter tanks for various impeller types and submergence levels, along with the values of kLa and εG. These kLa and εG values had been reported previously, using the dissolved oxygen probe method and visual observation, respectively, for the above design and operating conditions at the UICT, India, laboratory.
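The procedure of section 4.2 maps naturally onto off-the-shelf tools. The sketch below uses scikit-learn's SVR as an assumed stand-in for the LIBSVM ε-SVR actually used, with synthetic data in place of the Table 1 measurements: inputs are scaled to [0, 1], the RBF kernel (eq 14) is selected, and C, γ = 1/(2σ²), and ε are chosen by fivefold cross-validated grid search.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in for the 140-point data set: 9 input features, one target
X = rng.uniform(size=(140, 9))
y = np.sin(2 * np.pi * X[:, 0]) + 0.5 * X[:, 6] + 0.05 * rng.normal(size=140)

# 125 training points, 15 held out for generalization testing, as in the paper
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=15, random_state=0)

pipe = Pipeline([
    ("scale", MinMaxScaler()),           # scale inputs to [0, 1]
    ("svr", SVR(kernel="rbf")),          # Gaussian RBF kernel, eq 14
])
grid = {
    "svr__C": [1.0, 10.0, 35.0, 100.0],
    "svr__gamma": [0.1, 1.0, 1.54, 10.0],   # gamma = 1/(2 sigma^2)
    "svr__epsilon": [1e-4, 1e-3, 1e-2],
}
search = GridSearchCV(pipe, grid, cv=5, scoring="neg_root_mean_squared_error")
search.fit(X_tr, y_tr)

print(search.best_params_)
print("test R^2:", search.best_estimator_.score(X_te, y_te))
```

The grid values shown echo the optima of Table 2 but are otherwise arbitrary; with real PFS-derived invariants as inputs, the same pipeline would reproduce the model-selection loop described in the next paragraph.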
Here, two SVR models for predicting the volumetric mass transfer coefficient (kLa = y1) and the overall gas hold-up (εG = y2) were developed using the data set given in Table 1, comprising in all 140 data points with 9-parameter input vectors x = (T, impeller type, D, C, HL, Nimp, λ, CD, K). Of the available data points, 125 were used as the training set, while the generalization performance of the SVR models was evaluated on a test set comprising the remaining 15 points. In the present study, the ε-SVR implementation in the LIBSVM software library18 was used to develop the two SVR based models. Prior to model building, the data sets were scaled to [0, 1] to avoid any discrepancy arising from the wide differences in their absolute values. Standard fivefold cross-validation3,14,15 was carried out to select the model parameters (C, σ, and ε) giving the least root-mean-square error (RMSE) and maximum prediction accuracy: the training set was split into five equal parts, each part in turn serving as a test set against the model built on the remaining four parts. Various combinations of C, σ, and ε (within a prespecified range) were tried in a grid search, and the combination with the best cross-validation accuracy and least RMSE was picked. The RBF kernel gave the least RMSE and maximum correlation coefficient (CC) values for the training and test sets of the outputs y1 and y2. The optimal values of the three SVR-specific parameters, namely, the width of the RBF kernel (σ), the cost coefficient (C), and the loss function parameter (ε), that minimized ETrn and ETst for the kLa and εG models are listed in Table 2. The corresponding CC and RMSE values for the training and test set predictions of both models are listed in Table 3.

Figure 4. Parity plot for SVR (solid symbols) and MLR (open symbols) based models (train data): (A) mass transfer coefficient model and (B) fractional gas hold-up model.

Table 3. Performance Indicators for kLa and εG Models

| model | performance parameter | kLa (training) | kLa (testing) | εG (training) | εG (testing) |
|---|---|---|---|---|---|
| MLR | RMSE | 0.0063 | 0.123 | 0.0081 | 0.172 |
| MLR | CC | 0.65 | 0.592 | 0.623 | 0.583 |
| ε-SVR | RMSE | 0.00056 | 0.0024 | 0.00076 | 0.0044 |
| ε-SVR | CC | 0.996 | 0.988 | 0.995 | 0.971 |

The SVR model parity plots for the training (seen) data for kLa and εG are shown in Figure 4A,B, respectively; those for the test (unseen) data are shown in Figure 5A,B. To check for the nonlinearity associated with the data set, a simple multiple linear regression (MLR) technique was also employed to generate linear regression models for the surface aerator; the corresponding MLR parity plots for kLa and εG appear in the same panels of Figures 4 and 5.

Figure 5. Parity plot for SVR (solid symbols) and MLR (open symbols) based models (test data): (A) mass transfer coefficient model and (B) fractional gas hold-up model.

On the basis of the results in Table 3, a distinct difference in prediction performance (train and test) between the SVR based and MLR models is evident. This clearly indicates that kLa and εG depend nonlinearly on the input parameters. The results further indicate that SVR based data-driven models can be employed for reliable and robust prediction of target parameters in multiphase reactor modeling.

6. Conclusion

SVR, a robust machine learning based nonlinear modeling paradigm possessing several desirable features, was used to establish a nonlinear relation between the chaotic invariants and kLa and εG at various design and operating conditions. The proposed models exhibit reasonably high prediction accuracy and can be used for real-time prediction of kLa and εG values for various combinations of desired input (design and dynamic) parameters. Such SVR based models for the surface aerator can be potentially useful on the commercial scale for online monitoring and control of desired process output variables.

Acknowledgment

A.B.G. sincerely thanks the Council of Scientific and Industrial Research (CSIR), Government of India, New Delhi, for a Junior Research Fellowship (JRF). V.K.J. acknowledges the Department of Science and Technology (DST), India, for its financial support.

Supporting Information Available: Lists of SVs and corresponding Lagrange multipliers (weights) for the SVR based kLa and εG models (PDF). This material is available free of charge via the Internet at http://pubs.acs.org.

Nomenclature

b = bias term
C = impeller clearance (distance between the impeller and the reactor bottom)
C = cost function
CC = correlation coefficient
C(r) = correlation integral
CD = correlation dimension
D = impeller diameter
DT = disc type impeller
ETrn = RMSE for training set
ETst = RMSE for test set
F = high-dimensional feature space
f(x) = regression function
H = Heaviside function
HL = liquid height
k = number of positive Lyapunov exponents
K(xi, xj) = kernel function
K = Kolmogorov entropy
kLa = volumetric mass transfer coefficient
L = Lagrangian function (dual form)
m = embedding dimension
Nimp = impeller speed
N = number of training data points
Np = number of vector data points constituting the attractor
Nsv = number of support vectors
P = number of recorded pressure data points
PBTU = pitched blade turbine, up flow
PBTD = pitched blade turbine, down flow
p(n) = one-dimensional pressure data
r = radius of the m-dimensional hypersphere
rps = impeller revolutions per second
R = input space
RMSE = root-mean-square error
T = tank diameter
xi = ith input vector
w = weight vector
yi = target output corresponding to the ith vector
y(n) = m-dimensional pressure vector

Greek Letters

εG = overall gas hold-up
τ = time delay
λ = Lyapunov exponent
φ(xi) = feature-space image of the ith input vector xi
ξi(*) = slack variables
αi(*), αj(*) = Lagrange multipliers
ε = precision parameter
σ = width of the radial basis function (RBF) kernel
γ = 1/(2σ²)
∞ = infinity
Subscripts

G = gas phase

Literature Cited

(1) Agrawal, M.; Jade, A. M.; Jayaraman, V. K.; Kulkarni, B. D. Support Vector Machines: A Useful Tool for Process Engineering Applications. Chem. Eng. Prog. 2003, 98 (1), 57.
(2) Lemoine, R.; Fillon, B.; Behkish, A.; Smith, A. E.; Morsi, B. I. Prediction of Gas-Liquid Volumetric Mass Transfer Coefficient in Surface-Aeration and Gas-Inducing Reactors Using Neural Networks. Chem. Eng. Process. 2003, 42, 621.
(3) Nandi, S.; Badhe, Y.; Lonari, J.; Sridevi, U.; Rao, B. S.; Tambe, S. S.; Kulkarni, B. D. Hybrid Process Modeling and Optimization Strategies Integrating Neural Networks/Support Vector Regression and Genetic Algorithms: Study of Benzene Isopropylation on Hbeta Catalyst. Chem. Eng. J. 2004, 97, 115.
(4) Supardan, M. D.; Maezawa, A.; Uchida, S. Determination of Local Gas Holdup and Volumetric Mass Transfer Coefficient in a Bubble Column by Means of an Ultrasonic Method and Neural Networks. Chem. Eng. Technol. 2003, 26, 1080.
(5) Wei, C.; Tatsuya, H.; Atsushi, T.; Kentaro, O.; Yoshiki, S. Generalized Dynamic Modeling of Local Heat Transfer in Bubble Columns. Chem. Eng. J. 2003, 96, 37.
(6) Patwardhan, A. W.; Joshi, J. B. Design of Stirred Vessels with Gas Entrained from Free Liquid Surface. Can. J. Chem. Eng. 1998, 76, 339.
(7) Letzel, H. M.; Schouten, J. C.; Krishna, R.; Van den Bleek, C. M. Characterization of Regimes and Regime Transitions in Bubble Columns by Chaos Analysis of Pressure Signals. Chem. Eng. Sci. 1997, 52 (24), 4447.
(8) Lin, T.-J.; Juang, R.-C.; Chen, Y.-C.; Chen, C.-C. Predictions of Flow Transitions in a Bubble Column by Chaotic Time Series Analysis of Pressure Fluctuation Signals. Chem. Eng. Sci. 2001, 56, 1057.
(9) Van den Bleek, C. M.; Schouten, J. C. Deterministic Chaos: A New Tool in Fluidized Bed Design and Operation. Chem. Eng. J. 1993, 53, 75.
(10) Van den Bleek, C. M.; Schouten, J. C. Application of Chaos Analysis to Multiphase Reactors. Chem. Eng. Sci. 2002, 57, 4763.
(11) Yano, T.; Kuramoto, K.; Tsutsumi, A.; Otawara, K. Y. Scale-up Effects in Nonlinear Dynamics of Three-Phase Reactors. Chem. Eng. Sci. 1999, 54, 5259.
(12) Fraser, A. M.; Swinney, H. L. Independent Coordinates for Strange Attractors from Mutual Information. Phys. Rev. A 1986, 33, 1134.
(13) Kennel, M. B.; Brown, R.; Abarbanel, H. D. I. Determining Minimum Embedding Dimension Using a Geometrical Construction. Phys. Rev. A 1992, 45, 3403.
(14) Gunn, S. R. Support Vector Machines for Classification and Regression; Technical Report, Department of Electronics and Computer Science, University of Southampton, 1998 (available at http://trevinca.ei.uvigo.es/~cernadas/tc03/mc/SVM.pdf, accessed Jan 2005).
(15) Suykens, J. A (Short) Introduction to Support Vector Machines and Kernel Based Learning. http://www.esat.kuleuven.ac.be/sista/members/suykens.html, 2003 (accessed Nov 2004).
(16) Smola, A. J.; Scholkopf, B. A Tutorial on Support Vector Regression. Statistics and Computing 2004, 14, 199.
(17) Vapnik, V.; Golowich, S.; Smola, A. J. Support Vector Method for Function Approximation, Regression Estimation and Signal Processing. Advances in Neural Information Processing Systems 1996, 9, 281.
(18) Chang, C.-C.; Lin, C.-J. LIBSVM: A Library for Support Vector Machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm, 2001.
Received for review January 12, 2007
Revised manuscript received March 28, 2007
Accepted April 3, 2007

IE0700765