Ind. Eng. Chem. Res. 1998, 37, 2081-2085
Hydroxylation of Phenol to Dihydroxybenzenes: Development of Artificial Neural-Network-Based Process Identification and Model Predictive Control Strategies for a Pilot Plant Scale Reactor
Shilpa B. Tendulkar,† Sanjeev S. Tambe,† Ishwar Chandra,‡ P. V. Rao,‡ R. V. Naik,‡ and B. D. Kulkarni*,†
Chemical Engineering Division and Process Development Division, National Chemical Laboratory, Pune 411 008, India
Experiments in a pilot-scale fixed-bed reactor system have been conducted to obtain the process input-output data for the titanium-based zeolite-catalyzed hydroxylation of phenol to dihydroxybenzenes. An artificial neural-network-based strategy has been used for identifying the process model covering the range of experimental conditions. The identified neural network model has been used to design a model predictive controller that has been tested on the experimental rig.

Introduction

Catechol and hydroquinone are two of the many phenolic derivatives of high value. They are widely used as photography chemicals, antioxidants, and polymerization inhibitors and also used in pesticides, flavoring agents, and medicine. Hydroxylation of phenol using a hydrogen peroxide oxidant and titanium-zeolite catalyst to give catechol and hydroquinone has been studied by a large number of workers (viz., Jacobs, 1992; Martens et al., 1993). The mechanism of reaction deactivation and regeneration, role of pore size, and solvent effect have also been studied in detail (Reddy et al., 1992; Esposito et al., 1992; Ratnasamy and Sivasanker, 1996).

The National Chemical Laboratory (NCL) at Pune, India, has developed a titanium silicalite (a zeolite) catalyst for the hydroxylation of phenol to produce catechol and hydroquinone. The NCL process uses a continuously operated fixed-bed reactor system employing liquid-phase reactants. This process has an advantage in that it is free from tiresome operations such as filtration, catalyst makeup, and frequent catalyst regeneration.

The motivation for this study is to develop artificial neural-network-based (ANN-based) nonlinear process identification and model predictive control strategies for the phenol hydroxylation process. The paper is organized as follows. First, a broad outline of the ANN-based nonlinear control and identification techniques is provided.
Next, identification of the phenol hydroxylation process using an ANN as a forward process model is explained, followed by the description and implementation of the neural model predictive control (NMPC) strategy.

In the last decade, model predictive control (MPC) formalisms that employ an explicit model to predict the future process outputs, and minimize the error (difference between the model predicted output and its desired value) by calculating suitable control moves, have been extensively used to control processes with complex dynamics.

* Author to whom correspondence should be addressed. Telephone: 91 (212) 367295. Fax: 91 (212) 333941. E-mail: [email protected]. † Chemical Engineering Division. ‡ Process Development Division.

The specific examples of MPC strategies are internal model control (IMC) (Garcia and Morari, 1982), dynamic matrix control (DMC) (Cutler and Ramaker, 1980), and model algorithmic control (MAC) (Rouhani and Mehra, 1982). These control formalisms utilize linear/linearized models even when systems behave nonlinearly and thus become vulnerable to modeling errors that eventually lead to suboptimal control performance. Also, linearized models by their very nature are valid over a limited range of operating conditions. To alleviate these problems, several nonlinear model predictive and related control strategies have been developed (viz., Brengel and Seider, 1989; Patwardhan et al., 1990; Sistu and Bequette, 1991; Kulkarni et al., 1991; Kumar et al., 1991). Process models developed from the knowledge of fundamental principles, such as the conservation laws of mass, energy, and momentum, often turn out to be complex since most chemical processes exhibit nonlinear characteristics and, thus, require simplifying assumptions for their solution. A lack of sufficient understanding of the involved physical phenomena compounds the problem further. Owing to such difficulties, process identification techniques are used to develop nonparametric [e.g., autoregressive moving average (ARMA)] process models which are constructed exclusively from the experimental process input-output data. The drawback of such techniques, however, is that the model structure must be specified a priori, a difficult task since it involves selecting by trial and error a reasonable model from the numerous alternatives. Use of ANNs for the model development in recent years has overcome many of the drawbacks alluded to above.
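As a hedged illustration of what specifying the model structure a priori involves, the sketch below fits a linear input-output model whose form, y(k+1) = a·y(k) + b·u(k), is fixed before any data are seen. It is not taken from this work: the first-order structure (a simple autoregressive form with an external input), the function name `fit_first_order_arx`, and the synthetic data are all assumptions chosen for brevity.

```python
def fit_first_order_arx(u, y):
    """Least-squares estimate of (a, b) in y(k+1) = a*y(k) + b*u(k).

    The model structure (first order, one input lag) is an assumption
    fixed in advance; only the two coefficients are estimated from data.
    """
    # Accumulate the entries of the 2x2 normal equations.
    syy = syu = suu = s1y = s1u = 0.0
    for k in range(len(y) - 1):
        syy += y[k] * y[k]
        syu += y[k] * u[k]
        suu += u[k] * u[k]
        s1y += y[k + 1] * y[k]
        s1u += y[k + 1] * u[k]
    det = syy * suu - syu * syu
    if abs(det) < 1e-12:
        raise ValueError("input-output data are not informative enough")
    # Solve the normal equations by Cramer's rule.
    a = (s1y * suu - s1u * syu) / det
    b = (syy * s1u - syu * s1y) / det
    return a, b


# Synthetic, noise-free data from a hypothetical process with a = 0.8, b = 0.5.
u_data = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0]
y_data = [0.0]
for k in range(len(u_data) - 1):
    y_data.append(0.8 * y_data[k] + 0.5 * u_data[k])

a_hat, b_hat = fit_first_order_arx(u_data, y_data)
print(round(a_hat, 6), round(b_hat, 6))
```

If the true process were of higher order or nonlinear, this fixed structure would fit poorly and another structure would have to be tried, which is the trial-and-error burden noted above.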
ANNs, such as the error back propagation network (EBPN), approximate complex nonlinear functional relationships, and this ability has led to their application in nonlinear process identification and control of chemical processes (see, for example, Nahas et al., 1992; Selvaraj et al., 1995; Ramasamy et al., 1995; and reviews by Hunt et al., 1992; Agarwal, 1997). Typically, an ANN is used as a forward process model wherein the current and lagged values of the process state and the manipulated variable serve as the network inputs. The network predicts as its output the process state at the next
S0888-5885(97)00509-5 CCC: $15.00 © 1998 American Chemical Society Published on Web 01/07/1998
2082 Ind. Eng. Chem. Res., Vol. 37, No. 6, 1998
Figure 1. Experimental setup for hydroxylation of phenol.
sampling instant. Once an ANN-based process model is developed, it can be readily used in the MPC framework. The ANNs, being nonparametric models, have the advantage that the model coefficients can be estimated directly from the samples of process input-output data. The EBPN (Rumelhart et al., 1986) is a multilayer feed-forward structure usually comprising three (input, hidden, and output) layers of processing elements (termed “nodes”). An EBP network learns the relationship between the process inputs and outputs via a training procedure, wherein these are used as the network input and the desired output, respectively. Network training involves minimization of an error function using a steepest descent strategy known as the generalized delta rule (GDR), wherein the network outputs are compared with their desired values and the difference (error) is used to modify the interlayer connection weights. The description of the GDR can be found in numerous places (e.g., Freeman and Skapura, 1991; Tambe et al., 1996). In the following, we present results wherein an EBPN has been utilized for the identification and control of the phenol hydroxylation to dihydroxybenzenes process.

Process Description

The phenol hydroxylation reaction described by
$$\mathrm{C_6H_5OH + H_2O_2 \rightarrow C_6H_4(OH)_2 + H_2O}, \qquad \Delta H_R\ (25\ ^{\circ}\mathrm{C}) = 66\ \text{kcal/gmol}$$
was carried out in an aqueous medium at 80 °C. Apart from DHBs (dihydroxybenzenes) and water, the reaction products also contain small amounts of benzoquinone and tar. The experimental objective was to optimize conditions with respect to the formation of dihydroxybenzene isomers, particularly p-DHB (hydroquinone) and o-DHB (catechol). The experimental unit (Figure 1) essentially consists of a 2.2-m-tall jacketed tubular reactor made of 25-mm nominal diameter SS 316 pipe, two diaphragm-type metering pumps, a thermostatically controlled water bath, and associated equipment. The reactor is filled
with catalyst particles in the extrudate form to a height of 2 m and cooled by means of circulating water. Two axially placed thermocouple assemblies comprising a total of 10 thermocouples are fitted to the reactor from each end. The two reactants are fed separately using diaphragm metering pumps. The reactants flow through the reactor upward and can be pumped with a high degree of precision to effect changes in the weight-hourly space velocity (WHSV) and mole ratio. The phenol stream is a saturated solution of water-in-phenol (24 wt % water) and the peroxide solution is 16% H2O2. The reactor operating conditions were (i) phenol (74 wt %) flow rate = 944 g/h (7.43 gmol/h); (ii) H2O2 (16 wt %) flow rate = 160 g/h (0.753 gmol/h); (iii) catalyst weight = 625 g; (iv) WHSV based on phenol = 1.5 h⁻¹; (v) bulk density as stacked = 0.65 g/cm³; and (vi) coolant temperature = 80-82 °C. Under a fixed-bed mode of operation, the maximum conversion based on H2O2 consumption was to the extent of 95-96% and the hydroquinone to catechol yield was in the 1.5:1 ratio. The phenol hydroxylation is an exothermic reaction and the reactor temperature specifically at the position corresponding to thermocouple no. 1 (shown as TT-1 in Figure 1) exhibits a sensitive behavior with respect to the input flow rates. If temperature at this spot exceeds 100 °C, it adversely affects the product yield. For identification purposes, it is possible to use the flow rate of either a phenol-water mixture, H2O2, or a coolant as the manipulated variable and examine its effect on any one of the process output variables such as phenol conversion, temperature at TT-1, product concentration, etc. In this study, the flow rate of phenol and the temperature at TT-1 have been chosen as the manipulated and controlled variables, respectively.
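The molar flow rates and the WHSV quoted in the operating conditions can be checked with a few lines of arithmetic. The sketch below assumes standard molar masses for phenol and hydrogen peroxide, which the text does not give. It also suggests that the stated WHSV of 1.5 h⁻¹ corresponds to the total phenol-stream mass flow divided by the catalyst weight; that reading is an inference from the numbers, not something the text states.

```python
# Standard molar masses in g/mol (assumed; not given in the text).
M_PHENOL = 94.11
M_H2O2 = 34.01

# Operating conditions as listed in the text.
phenol_stream = 944.0      # g/h, phenol-water feed
phenol_wt_frac = 0.74      # phenol mass fraction in that feed
peroxide_stream = 160.0    # g/h, aqueous H2O2 feed
peroxide_wt_frac = 0.16    # H2O2 mass fraction in that feed
catalyst_weight = 625.0    # g

# Molar flow rates of the pure reactants (gmol/h).
phenol_mol = phenol_stream * phenol_wt_frac / M_PHENOL
h2o2_mol = peroxide_stream * peroxide_wt_frac / M_H2O2

# Space velocity computed from the total phenol-stream mass flow (1/h).
whsv_stream = phenol_stream / catalyst_weight

# Phenol-to-peroxide mole ratio in the combined feed.
mole_ratio = phenol_mol / h2o2_mol

print(f"phenol: {phenol_mol:.2f} gmol/h")
print(f"H2O2:   {h2o2_mol:.3f} gmol/h")
print(f"WHSV:   {whsv_stream:.2f} 1/h")
print(f"phenol/H2O2 mole ratio: {mole_ratio:.1f}")
```

Under these assumptions the computed values agree with the quoted 7.43 gmol/h, 0.753 gmol/h, and 1.5 h⁻¹ to within rounding, and phenol is in roughly tenfold molar excess over the peroxide.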
ANN-Based Process Identification

The dynamic EBPN-based forward process model was identified off-line using the open-loop input-output data for which the phenol flow rate (up) was varied randomly and the reactor temperature at location TT-1 (T) was monitored. The extent of variations in the phenol flow rate was based on a pseudorandom sequence wherein the probability of variation was 0.25. The data so gathered comprise a discrete time series of T as a function of the process input up. A total of 123 data points separated by 5-min intervals were collected (see Figure 2). Prior to the network training, the input-output data were partitioned into two sets, namely, the training set (80 points) and the test set (43 points). While the former was used to update the network weights, the latter (test set) was used to evaluate the generalization ability of the network model. To develop an EBPN-based forward process model, the following expression relating the one-step-ahead process output with the current and single-lagged values of the process input (up) and the output (T) was assumed:
$$T(k+1) = f[u_p(k),\, u_p(k-1);\; T(k),\, T(k-1)]; \qquad 2 \le k \le N-1 \qquad (1)$$
where k represents discrete time, N denotes the number of points in the training set, and f refers to the function to be approximated by the EBPN. Each input vector utilized for the network training comprises four elements described by [up(k), up(k - 1); T(k), T(k - 1)]. The
Figure 2. Plots depicting phenol flow rate (up) and reactor temperature (T) as a function of time. For the generation of data, the phenol flow rate was varied using pseudorandom number sequence.
output corresponding to each input is a scalar representing the reactor temperature at the (k + 1)th instant (i.e., T(k + 1)). During training, the GDR minimizes the sum-squared-error (SSE) function (E) defined as
$$E = \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} 0.5\,(y_i^{m} - t_i)^2 \qquad (2)$$
where n denotes the number of output layer nodes and $t_i$ and $y_i^{m}$ respectively represent the desired and actual outputs of the ith output layer neuron. Since for identification purposes only one process output is considered, the EBPN network comprises a single output neuron (n = 1) and, thus, $y_i^{m}$ and $t_i$ respectively signify the one-step-ahead prediction of T and its desired value, T(k + 1). The input layer of the EBPN consisted of four nodes representing the current and lagged values of the process input and output (see eq 1). To obtain optimal network weights that result in the least SSE for the test set, several independent training runs were performed by varying the GDR parameters, namely, the learning rate (η0) and momentum coefficient (α). The number of hidden layer nodes (NH) was also optimized in a similar fashion. The optimal values for NH, η0, and α were found to be 5, 0.3, and 0.1, respectively. The magnitudes of the correlation coefficient for the actual and model-fitted process outputs corresponding to the training and test sets were 0.984 and 0.986, respectively, and suggest that the network model possesses good approximation and generalization properties. Figure 3a,b depicts the actual and EBPN-predicted temperature profiles for the training and test sets.

Neural Model Predictive Control (NMPC)

To test the utility of the process model, it was used in the framework of NMPC (Ishida and Zhan, 1995). In this control scheme (Figure 4), the optimizer block, using the model output and set point as the inputs, computes an optimal control action. In particular, it uses the steepest descent methodology for adjusting iteratively
Figure 3. (a) Experimental and EBPN-predicted reactor temperature (T) profiles for the training set and (b) same as in panel (a) but for the test set.
Figure 4. The block diagram of the neural model predictive control (NMPC).
the network inputs representing the manipulated variable(s) until the error between the network-computed process outputs (after appropriate compensation for plant/model mismatch) and the set point falls below a prespecified small threshold. After convergence, the steepest descent search is terminated and the converged value of the manipulated variable is applied to the process. This procedure is repeated at the subsequent sampling instants. It is to be noted that while the input adjustments are carried out, the network weights are kept unchanged. The advantage of NMPC methodology is that all the control computations are performed in the domain of a single network serving as the forward process model, thus avoiding the need of training a
separate neural network to act as a controller. In the following, the generalized NMPC strategy for a multiple input-multiple output (MIMO) system is described, for which the set point error E is redefined as
$$E = \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} 0.5\,[y_i^{pred}(k+1) - y_i^{sp}(k+1)]^2 \qquad (3)$$
where $y_i^{sp}(k+1)$ represents the set point of the ith controlled variable at the (k + 1)th instant and $y_i^{pred}(k+1)$ denotes the network predicted output duly compensated for plant/model mismatch. The mismatch compensation expression is given as
$$y_i^{pred}(k+1) = y_i^{m}(k+1) + d_i(k); \qquad i = 1, 2, \ldots, n \qquad (4)$$
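For a single controlled variable, eqs 3 and 4 reduce to a few arithmetic steps. The following minimal sketch takes the mismatch as d(k) = c[yp(k) - ym(k)], with yp the measured plant output and ym the model output. The function names and the numerical temperatures are hypothetical, chosen only for illustration; c = 3.5 is the value quoted in the caption of Figure 5.

```python
def compensated_prediction(y_model_next, y_plant_now, y_model_now, c):
    """Eq 4: model prediction for k+1 plus the scaled plant/model mismatch at k."""
    d = c * (y_plant_now - y_model_now)
    return y_model_next + d


def set_point_error(y_pred, y_sp):
    """Eq 3: half the sum of squared deviations of the predictions from the set points."""
    return sum(0.5 * (yp - ys) ** 2 for yp, ys in zip(y_pred, y_sp))


# Hypothetical temperatures in deg C, for illustration only.
y_model_next = 98.0   # model output predicted for instant k + 1
y_plant_now = 97.0    # measured plant output at instant k
y_model_now = 96.5    # model output at instant k
c = 3.5               # tuning parameter (value from the Figure 5 caption)

y_pred = compensated_prediction(y_model_next, y_plant_now, y_model_now, c)
E = set_point_error([y_pred], [100.0])
print(y_pred, E)
```

In this example the plant runs 0.5 °C above the model, so the compensation raises the prediction by c times that gap before it is compared with the set point.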
Here, $y_i^{m}(k+1)$ represents the model output at the (k + 1)th instant and the mismatch, $d_i(k)$, is defined as $d_i(k) = c[y_i^{p}(k) - y_i^{m}(k)]$, where $y_i^{p}(k)$ signifies the plant output of the ith controlled variable. The tunable parameter c improves controllability; as its magnitude is increased, the system shows a faster response toward the set point. The steepest descent expression used by the optimizer for updating the manipulated variable $u_j$ is given as
$$u_j^{new}(k) = u_j(k) - \eta\, \frac{\partial E}{\partial u_j(k)} \qquad (5)$$
where η denotes the step size. The partial differential term on the right-hand side (rhs) can be evaluated by making use of eqs 3 and 4 as in the following:
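$$\frac{\partial E}{\partial u_j(k)} = \sum_{i=1}^{n} [y_i^{pred}(k+1) - y_i^{sp}(k+1)]\, \frac{\partial y_i^{pred}(k+1)}{\partial u_j(k)} = \sum_{i=1}^{n} [y_i^{pred}(k+1) - y_i^{sp}(k+1)]\, \frac{\partial y_i^{m}(k+1)}{\partial u_j(k)} \qquad (6)$$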
where the partial derivative term on the rhs represents the sensitivity of the ith network output with respect to the jth input and is evaluated by applying the chain rule to the neural network as given by
$$\frac{\partial y_i^{m}(k+1)}{\partial u_j(k)} = O_i(1 - O_i) \sum_{l=1}^{m_0} w_{li}\, \delta_l\, w_{jl} \qquad (7)$$
where $O_i$ refers to the output of the ith node in the output layer, $m_0$ denotes the number of nodes in the hidden layer, $w_{li}$ is the weight between the lth hidden node and ith output node, and $w_{jl}$ is the weight connecting the jth input node and lth hidden node. Equation 7 assumes the use of the logistic sigmoid transfer function at the output nodes. Taking the same transfer function also at the hidden nodes, $\delta_l$, representing the transfer function derivative, is computed as $\delta_l = x_l(1 - x_l)$, where $x_l$ represents the output of the lth hidden node. It may be recalled that the neural network model identified earlier is of a multiple input-single output (MISO) type whereas the NMPC approach described
Figure 5. (a) Time profiles of model and process outputs during NMPC implementation and (b) phenol flow rate given by NMPC (η = 0.3, c = 3.5).
above is for MIMO systems. For an MISO system, the parameter n representing the number of output nodes assumes unit magnitude and, therefore, the subscript i in eqs 3-7 can simply be dropped. The equivalence between the NMPC notation and the phenol hydroxylation process variables can now be given as follows: $u_j(k)$ is the phenol flow rate at the kth instant (i.e., up(k)); $y^{m}(k+1)$ is the one-step-ahead value of the controlled variable (T) predicted by the ANN model; $y^{p}$ is the reactor temperature T at position TT-1; $y^{pred}$ is the model-predicted reactor temperature after plant/model mismatch compensation; $y^{sp}$ is the set point for the controlled variable, T. For NMPC implementation, a simple servo control task is selected (i.e., to change the temperature T from 96 to 100 °C) using the phenol flow rate as the manipulated variable. The control is activated right at the beginning when the phenol and H2O2 flow rates were 944 and 160 g/h, respectively. At each 5-min interval the temperature was monitored and the phenol flow rate to reach the set point was computed. The time profiles of the model and process outputs are shown in Figure 5a whereas the phenol flow rate profile given by NMPC is depicted in Figure 5b. As can be seen, the NMPC action is smooth and the process reaches the set point without any offset.

Conclusion

This paper reports results of an artificial neural-network-based strategy for identifying the phenol hydroxylation to dihydroxybenzenes process. The ANN model covers the range of experimental conditions and
predicts one-step-ahead process output. Next, the identified process model has been used to design a neural model predictive controller. The process identification and NMPC strategies exemplified here are simple to implement and based only on the process input-output data.

Dedication

We dedicate this paper to our beloved teacher Prof. L. K. Doraiswamy who during his tenure at NCL strived hard to exploit modern tools for the benefit of chemical engineering practice.

Acknowledgment

The authors gratefully acknowledge the financial support by the Department of Science and Technology (DST), Government of India, New Delhi.

Literature Cited

Agarwal, M. A Systematic Classification of Neural-Network-Based Control. IEEE Control Sys. 1997, April, 75-93.
Brengel, D. D.; Seider, W. D. Multi-step Nonlinear Predictive Controller. Ind. Eng. Chem. Res. 1989, 28, 1812-1822.
Cutler, C. R.; Ramaker, B. L. Dynamic Matrix Control: A Computer Control Algorithm. AIChE 86th National Meeting, Houston, Texas, April, 1980.
Esposito, E.; Maspero, F.; Romano, U. Selective Oxidations with Titanium Silicalite-1 (TS-1). DGMK-Conference Report, 1992; Erdgas und Kohle e.v. (DGMK): Hamburg, Germany, 1992; p 195.
Freeman, J. A.; Skapura, D. M. Neural Networks Algorithm, Application, and Programming Techniques; Addison-Wesley: Reading, MA, 1991.
Garcia, C. E.; Morari, M. Internal Model Control. 1. A Unifying Review and Some New Results. Ind. Eng. Chem. Process Des. Dev. 1982, 21, 308-323.
Hunt, K. J.; Sbarbaro, D.; Zbikowski, R.; Gawthrop, P. J. Neural Networks for Control Systems: A Survey. Automatica 1992, 28, 1083-1112.
Ishida, M.; Zhan, J. Neural Model Predictive Control of Distributed Parameter Crystal Growth Process. AIChE J. 1995, 41, 2333-2336.
Jacobs, P. A. Titanium Silicalites and Their Catalytic Properties for Oxyfunctionalisation of Hydrocarbons with Hydrogen Peroxide. DGMK-Conference Report, 1992; Erdgas und Kohle e.v. (DGMK): Hamburg, Germany, 1992; p 171.
Kulkarni, B. D.; Tambe, S. S.; Shukla, N. V.; Deshpande, P. B. Nonlinear pH Control. Chem. Eng. Sci. 1991, 46, 995-1003.
Kumar, V. R.; Kulkarni, B. D.; Deshpande, P. B. On the Robust Control of Nonlinear Systems. Proc. R. Soc. London 1991, 433, 711-722.
Martens, J. A.; Buskens, Ph.; Jacobs, P. A. Hydroxylation of Phenol with Hydrogen Peroxide on EUROTS-1 Catalyst. Appl. Catal. A: General 1993, 99, 71-84.
Nahas, E. P.; Henson, M. A.; Seborg, D. E. Nonlinear Model Control Strategy for Neural Network Models. Comput. Chem. Eng. 1992, 16, 1039-1057.
Patwardhan, A. A.; Rawlings, J. B.; Edgar, T. F. Nonlinear Model Predictive Control. Chem. Eng. Commun. 1990, 87, 123-130.
Ramasamy, S.; Deshpande, P. B.; Paxton, G. E.; Hajare, R. P. Consider Neural Networks for Process Identification. Hydrocarbon Process. 1995, June, 59-62.
Ratnasamy, P.; Sivasanker, S. Process for the Conversion of Phenol to Hydroquinone and Catechol. U.S. Patent 5,493,061, 1996.
Reddy, S. J.; Sivasanker, S.; Ratnasamy, P. Hydroxylation of Phenol over TS-2, a Titanium Silicate Molecular Sieve. J. Mol. Catal. 1992, 71, 373-381.
Rouhani, R.; Mehra, R. K. Model Algorithmic Control (MAC) Basic Theoretical Properties. Automatica 1982, 18 (4), 401.
Rumelhart, D. E.; Hinton, G. E.; Williams, R. J. Learning Internal Representations by Error Propagation. Parallel Distributed Processing: Explorations in the Microstructure of Cognition; MIT Press: Cambridge, 1986; Vol. 1.
Selvaraj, R.; Deshpande, P. B.; Tambe, S. S.; Kulkarni, B. D. Neural Networks for the Identification of MSF Desalination Plants. Desalination 1995, 101, 185-193.
Sistu, P. B.; Bequette, B. W. Nonlinear Predictive Control of Uncertain Processes: Application to a CSTR. AIChE J. 1991, 37, 1711-1723.
Tambe, S. S.; Kulkarni, B. D.; Deshpande, P. B. Elements of Artificial Neural Networks with Selected Applications in Chemical Engineering, and Chemical and Biological Sciences; Simulation and Advanced Controls, Inc.: Louisville, KY, 1996.
Received for review July 22, 1997 Revised manuscript received November 21, 1997 Accepted November 24, 1997 IE970509B