Prediction of Solvent Activity in Polymer Systems with Neural Networks

Ind. Eng. Chem. Res. 1995, 34, 3974-3980


Ramesh Keshavaraj,* Richard W. Tock, and Raghu S. Narayan
Polymer Processing and Testing Laboratory, Department of Chemical Engineering, Texas Tech University, Lubbock, Texas 79401-3121

Richard A. Bartsch
Department of Chemistry, Texas Tech University, Lubbock, Texas 79401-1061

It is an arduous task to develop thermodynamic models or empirical equations which accurately predict solvent activities in polymer solutions. Even so, since Flory developed the well-known equation of state for polymer solutions, much work has been carried out in this area. Consequently, extensive experimental data have been published in the literature by various researchers on different polymer binary systems. When such data are available, modeling solvent activity in polymer solutions can be simplified by the use of the artificial neural network (ANN) technique. The neural network technique has been in existence since 1969, when Grossberg introduced a network algorithm that could learn, remember, and reproduce any number of complicated space-time patterns. Nevertheless, the use of ANNs to predict thermodynamic and fluid properties is still rather limited. In this paper, an attempt has been made to predict the activity of several polymers in different solvents. We present a simple feed-forward neural network architecture with a nonlinear optimization training routine for this purpose. The predictions of the proposed training routine were then compared with those of other traditional training routines such as error-back-propagation and Madaline III. The predictions generated by all three algorithms were good, but the proposed algorithm was much faster and yielded better results.

Background

In many process design applications, such as polymerization and plasticization, knowledge of the thermodynamics of polymer systems can be very useful. For example, nonideal solution behavior strongly governs the diffusion phenomena observed for polymer melts and concentrated solutions. Hence, accurate modeling of thermodynamic parameters, like the solvent activity, is a necessary requisite for proper design of many polymer processes.
The well-known Flory treatment of the entropic contribution to the Gibbs energy of mixing of polymers with solvents is still the simplest and most reliable theory developed. It is quite apparent, however, that the Flory-Huggins theory was established on the basis of the experimental behavior of only a few mixtures investigated over a very narrow range of temperatures. Strict applications of Flory-Huggins require the interaction parameter χ to be constant, yet the temperature dependence of χ can be substantial even in athermal systems. Although this parameter was initially introduced to account for the energetic interactions between the polymer and the solvent, it was later shown that it is convenient to regard it as a free-energy term with both an enthalpic and an entropic contribution.
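For orientation, the classical Flory-Huggins expression for solvent activity can be evaluated in a few lines. The sketch below is not taken from the paper: it assumes the textbook form ln a1 = ln(1 - φ2) + (1 - 1/r)φ2 + χφ2², where φ2 is the polymer volume fraction, r is the ratio of polymer to solvent molar volumes, and χ is held constant. The function name and the values χ = 0.4 and r = 1000 are illustrative assumptions only.

```python
import math


def flory_huggins_activity(phi_polymer, chi, r):
    """Solvent activity a1 from the textbook Flory-Huggins expression.

    ln a1 = ln(1 - phi2) + (1 - 1/r) * phi2 + chi * phi2**2

    phi_polymer : polymer volume fraction phi2, 0 <= phi2 < 1
    chi         : interaction parameter, assumed constant here
    r           : ratio of polymer to solvent molar volumes
    """
    if not 0.0 <= phi_polymer < 1.0:
        raise ValueError("polymer volume fraction must lie in [0, 1)")
    ln_a1 = (math.log(1.0 - phi_polymer)
             + (1.0 - 1.0 / r) * phi_polymer
             + chi * phi_polymer ** 2)
    return math.exp(ln_a1)


# Illustrative values only (chi and r are assumptions, not fitted parameters).
for phi2 in (0.2, 0.5, 0.8):
    print(phi2, flory_huggins_activity(phi2, chi=0.4, r=1000.0))
```

Because χ enters as a single constant, any temperature or composition dependence of χ has to be patched in separately, which is the limitation discussed above.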

While this approach can quantitatively describe a number of phenomena occurring in many polymer systems, it has some deficiencies. For more than two decades, researchers have attempted to overcome the inadequacies of Flory's treatment in order to establish a model which will provide accurate predictions. Most of these research efforts can be grouped into two categories, i.e., attempts at corrections to the enthalpic or noncombinatorial part and modifications to the entropic or combinatorial part of the Flory-Huggins theory. The more complex relationships derived by Huggins, Guggenheim, Staverman, and others required so many additional and poorly determined parameters that this approach lacks practical application. A review of the more serious deficiencies in the existing models was given in a recent article by Kontogeorgis et al. (1993). Despite their differences, the equations which have been developed are all complex and sometimes not fully consistent. Also, the predictions of these modified Flory-Huggins equations are less satisfactory, especially for polymer solutions in which strong polar forces must be considered.

The purpose of this work was to develop a simple neural network based model with the capability to predict solvent activity in different polymer systems. The solvent activities were predicted by the ANN as a function of the binary type and the polymer volume fraction, φ. Three different categories of polymer-solvent binaries were considered: polar binaries, nonpolar binaries, and polar/nonpolar binaries. The proposed neural network based model resulted in good agreement with the experimental data available for the tested polymer-solvent systems. Without this approach, the estimation of a proper correction for the Flory-Huggins equation to give a solution model which will yield a reasonable prediction for all the polymer-solvent systems is time consuming. This is because considerable effort is needed to simultaneously optimize the different model parameters estimated from the experimental data. This latter effort could be substantially reduced, however, with the aid of artificial neural networks (ANNs). ANNs offer several advantages over traditional thermodynamic predictive tools as long as experimental data are available for training and testing.

* Corresponding author.

0888-5885/95/2634-3974$09.00/0 © 1995 American Chemical Society

Ind. Eng. Chem. Res., Vol. 34, No. 11, 1995

Figure 1. Microstructure of an arbitrary neuron. [Diagram: inputs x1, x2, x3 from other nodes enter the neuron, are summed (activation), and pass through a sigmoidal transfer function to give the output signal S sent to other nodes.]

Neural Network Theory

The ANN technique has been applied widely to subjects that include process fault diagnosis (Venkatasubramanian et al., 1990), process modeling and control (Bhat and McAvoy, 1990), and written character recognition (Rumelhart and McClelland, 1986). ANNs exhibit some properties of the human thinking process. ANNs are able to learn and to generalize from a few specific examples, and they are capable of recognizing patterns and signal characteristics in the presence of noise. ANNs are inherently nonlinear and, therefore, capable of modeling nonlinear systems. Nevertheless, studies pertaining to the use of ANNs for predicting thermodynamic properties have been rather limited.

The topology of neural networks is a logic structure in which multiple nodes communicate with each other through synapses that interconnect them. This topology is imitative of the structure and functions of biological nervous systems. The topology of interconnection and the rules employed by any neural network are generally lumped as the model of the network. The selected topology and the rules of operation are interrelated and are chosen by the experimenter to implement a particular paradigm. Figure 1 shows the structure of a single node or neuron from an arbitrary network. The neuron in the figure, designated as the jth neuron, occupies a position in its network that is quite general; that is, this neuron both accepts inputs from other neurons and sends outputs to other neurons. Any neuron in a totally interconnected network has this generality. Neural networks have at least three layers of neurons: an input layer, a hidden layer, and an output layer. In a layered network, some neurons are specialized for either input or output; in such a network, it is only the interior or hidden nodes that maintain generality.

The generalized neuron gets its inputs from interconnections leading from the outputs of other neurons. Utilizing the biological terminology for the connections between nerve cells, these interconnections are also known as synapses. The synaptic connections are weighted. That is, when the ith neuron sends a signal to the jth neuron, that signal is multiplied by the weighting on the i,j synapse. This weighting can be symbolized as $w_{ij}$. If the output of the ith neuron for pattern p is designated as $x_{i,p}$, then the input to the jth neuron from the ith neuron is $x_{i,p} w_{ij}$. Summing the weighted inputs to the jth neuron gives

$$u_{j,p} = \sum_{i} x_{i,p}\, w_{ij} + \theta_j\, w_{B,j}$$

where $\theta_j$ is a bias term and $w_{B,j}$ is the weight of the connection from the bias neuron to the jth neuron. The bias can be thought of as an input with a constant value of +1.0. The bias weight is adjusted in very much the same way as the other connection weights are adjusted during the network training. This summing of the weighted inputs is carried out by a processor within the neuron. The sum that is obtained is called the activation of the neuron. This activation can be positive, zero, or negative, since the synaptic weightings and the inputs can be either positive or negative. Any weighted input that makes a positive contribution to the activation represents a stimulus (tending to turn the jth neuron on), whereas one making a negative contribution represents an inhibition (tending to turn the jth neuron off).

Activation is an internal operation of the neuron. This is particularly true for biological neurons. In the case of artificial neurons, the experimenter has more flexibility. The internal state of a neuron can be displayed just as easily as its output signal. By definition, however, the other neurons in the network see only the output signals of other neurons. After summing inputs to determine its activation, the neuron's next job is to apply a signal-transfer function to that activation and to determine an output. There are various possibilities for what this transfer function should be. A common formula for determining a neuron's output is the logistic function:

$$S_{i,p} = \frac{1}{1 + \exp(-u_{i,p})} \tag{3}$$

where $u_{i,p}$ is the activation and $S_{i,p}$ is the transformed output. This function belongs to a class of S-shaped or sigmoidal functions and has characteristics that are advantageous within the context of many paradigms. These characteristics include the fact that the function is continuous, that it has a derivative at all points, and that it is monotonically increasing and asymptotic to its limits, with $0 \le S_{i,p} \le 1$, where $u_{i,p}$ is the summed total of the inputs ($-\infty \le u_{i,p} \le +\infty$) for pattern p. Any transformation that accepts inputs having an infinite range to produce outputs over a finite range is known as a squashing function. In the authors' experience with ANNs, this function is extremely fast in converging, and it also exhibits better stability than other types of triggering functions.

The type of neural network architecture used in this study is known as a feed-forward neural network; i.e., each layer must feed sequentially into the next layer, with no feedback (recurrent) connections. A general structure of a feed-forward neural network is shown in Figure 2.
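The single-neuron computation described above (a weighted sum plus a bias contribution, followed by the logistic transfer function) can be sketched as follows. This is a minimal illustration rather than the authors' code; the function names, the example inputs, and the weights are hypothetical, and the bias input is assumed to be the constant +1.0 mentioned in the text.

```python
import math


def logistic(u):
    """Logistic (sigmoidal) transfer function mapping any real u into [0, 1].

    The two branches avoid overflow in math.exp for large |u|.
    """
    if u >= 0.0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def neuron_output(inputs, weights, bias_weight, bias_input=1.0):
    """Return (activation, output) for one neuron.

    activation = sum_i x_i * w_i + bias_input * bias_weight
    output     = logistic(activation)
    """
    activation = sum(x * w for x, w in zip(inputs, weights))
    activation += bias_input * bias_weight
    return activation, logistic(activation)


# Hypothetical inputs and weights, chosen only to exercise the function.
u, s = neuron_output([0.2, 0.7, 0.1], [0.5, -1.2, 2.0], bias_weight=0.3)
print(u, s)
```

A negative activation, as in this example, gives an output below 0.5, which corresponds to the "inhibition" case in the text.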

Figure 2. Schematic diagram of a feed-forward neural network architecture.

Neural networks are organized in layers and typically consist of at least three layers: an input layer, one or more hidden layers, and an output layer. The function of the input/output layer is to perform the appropriate scaling between the actual data and the network-processed data.

Training a Selected Neural Network. Once a network architecture has been selected, the network has to be taught to assimilate the functional dependencies of the variables in a given data set. Learning corresponds to an adjustment of the weights in order to obtain a satisfactory input-output match. Hence, it is very important to have data that adequately represent the relationships between the network input and output. Several different learning rules have been proposed by various researchers (Zurada, 1992; Battiti, 1992), but the aim of every learning process is to adjust the weights in order to minimize the error between the output predicted by the network and the actual output. The nonlinear optimization training routine and two other training algorithms are presented in the following sections.

Nonlinear Optimization Training Routine. A faster training process is to search for the weights with the help of an optimization routine that minimizes the same objective function. The learning rule used in this work is common to a standard nonlinear optimization or least-squares technique. Moreover, the entire set of weights is adjusted at once instead of being adjusted sequentially from the output to the input layers. The weight adjustment was done at the end of each exposure of the entire training set to the network, and the sum of squares of all errors for all patterns was used as the objective function for the optimization problem. The nonlinear optimization routine developed by Fletcher (1971) was used here for solving the nonlinear least-squares problem. The optimization problem can be defined if the model to be fitted to the data is written as follows:

$$F(y) = f(a_1, a_2, \ldots, a_m;\ \beta_1, \beta_2, \ldots, \beta_k) = f(\mathbf{a}, \boldsymbol{\beta}) \tag{4}$$

where $a_1, a_2, \ldots, a_m$ are independent variables, $\beta_1, \beta_2, \ldots, \beta_k$ are the population values of k parameters, and F(y) is the expected value of the dependent variable y. Then the data points can be denoted by

$$(Y_i, X_{1i}, X_{2i}, \ldots, X_{mi}), \qquad i = 1, 2, \ldots, n$$

The problem is to compute those estimates of the parameters which will minimize the following objective function

$$E = \sum_{i=1}^{n} \left[ Y_i - \hat{Y}_i \right]^2 \tag{5}$$

where $\hat{Y}_i$ is the value of y predicted by the model at the ith data point. The parameters to be determined in our case are the strengths of the connections, i.e., the weights, $w_{ij}$. This algorithm shares with the gradient methods its ability to converge from an initial guess which may be outside the range of convergence of other methods. It shares with the Taylor series method the ability to close in on the converged values rapidly after the vicinity of the converged values has been reached. The optimization procedure updated the weights at every connection and yielded rapid and robust training. Specific details about this routine are given elsewhere (Fletcher, 1971). Fletcher's routine, like most steepest-descent methods, has the ability to converge from an initial guess that lies outside the region of convergence of most other methods. Other methods of minimizing an objective function (Booth, 1955; Box; Crockett, 1955) were explored, but these methods either were found to converge much more slowly or were overly sensitive to initial guesses.
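To make the batch least-squares view of training concrete, the sketch below fits all weights of a small 2-4-1 feed-forward network at once by minimizing the sum of squared errors of eq 5. Fletcher's routine itself is not reproduced here: a generic damped least-squares (Levenberg-Marquardt) iteration with a finite-difference Jacobian is swapped in as a stand-in, and no claim is made that it matches Fletcher's algorithm in detail. The flat weight layout, the damping schedule, the function names, and the synthetic training data are all assumptions made for illustration.

```python
import math
import random


def logistic(u):
    if u >= 0.0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def predict(w, x, n_hidden):
    """Forward pass of an n_in-n_hidden-1 network; w is one flat weight list."""
    n_in = len(x)
    out_u = w[-1]                                    # output bias weight
    for h in range(n_hidden):
        base = h * (n_in + 1)
        u = w[base + n_in]                           # hidden bias weight
        for i in range(n_in):
            u += w[base + i] * x[i]
        out_u += w[n_hidden * (n_in + 1) + h] * logistic(u)
    return logistic(out_u)


def residuals(w, data, n_hidden):
    return [predict(w, x, n_hidden) - y for x, y in data]


def sse(r):
    return sum(v * v for v in r)


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda k: abs(m[k][c]))
        m[c], m[p] = m[p], m[c]
        for k in range(c + 1, n):
            f = m[k][c] / m[c][c]
            for j in range(c, n + 1):
                m[k][j] -= f * m[c][j]
    x = [0.0] * n
    for c in reversed(range(n)):
        x[c] = (m[c][n] - sum(m[c][j] * x[j] for j in range(c + 1, n))) / m[c][c]
    return x


def train_lm(data, n_hidden=4, iters=50, seed=0, h=1e-6):
    """Damped least-squares training; returns (weights, SSE history)."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    n_w = n_hidden * (n_in + 1) + n_hidden + 1
    w = [rng.uniform(-0.5, 0.5) for _ in range(n_w)]
    lam = 1e-2
    r = residuals(w, data, n_hidden)
    history = [sse(r)]
    for _ in range(iters):
        # Finite-difference Jacobian: jac[i][j] = d r_i / d w_j
        jac = [[0.0] * n_w for _ in data]
        for j in range(n_w):
            w_pert = w[:]
            w_pert[j] += h
            r_pert = residuals(w_pert, data, n_hidden)
            for i in range(len(data)):
                jac[i][j] = (r_pert[i] - r[i]) / h
        # Damped normal equations: (J^T J + lam I) delta = -J^T r
        jtj = [[sum(row[a] * row[b] for row in jac) for b in range(n_w)]
               for a in range(n_w)]
        jtr = [sum(jac[i][a] * r[i] for i in range(len(data))) for a in range(n_w)]
        for a in range(n_w):
            jtj[a][a] += lam
        try:
            delta = solve(jtj, [-g for g in jtr])
        except ZeroDivisionError:
            lam *= 10.0
            continue
        w_new = [wi + di for wi, di in zip(w, delta)]
        r_new = residuals(w_new, data, n_hidden)
        if sse(r_new) < history[-1]:      # accept: all weights change at once
            w, r = w_new, r_new
            lam = max(lam / 10.0, 1e-8)
        else:                             # reject: increase the damping
            lam *= 10.0
        history.append(sse(r))
    return w, history


# Synthetic patterns ((type code, volume fraction), activity); not the paper's data.
data = [((t, p / 10.0), 1.0 - (0.3 + 0.2 * t) * (1.0 - p / 10.0) ** 2)
        for t in (0.0, 1.0) for p in range(1, 10)]
w, history = train_lm(data)
print(len(w), history[0], history[-1])
```

Large damping makes each step behave like a short gradient step, while small damping approaches a Gauss-Newton (Taylor series) step, which is the blend of behaviors the text attributes to the routine used in this work. Because a step is accepted only when it lowers the sum of squared errors, the recorded error never increases from one iteration to the next.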

Back-Propagation and Madaline III Training Algorithms. Both back-propagation and Madaline III are useful for training multilayered networks. A gradient-descent technique for minimizing a sum of squared errors is employed in both of these training routines. Gradient-descent techniques function by finding the lowest error point on any error contour in the weight space. This is accomplished by descending the slope or gradient of the contour plane. The publication of the back-propagation technique by Rumelhart and McClelland (1986) has unquestionably been the most influential development in the field of neural networks in the past decade. This approach is composed of two stages. First, the contributions to the gradient coming from the different patterns are calculated by back-propagating the sum of the squared error signal. Second, these contributions are then used to correct the weights accordingly. During each training presentation, the back-propagation technique requires only one forward computation through the network, followed by one backward computation in order to adapt all the weights of the entire network. Hence, it is called a two-pass training algorithm. During the forward pass, random numbers are assigned to the input and output weights of the system. The values of the hidden neurons are then calculated by taking the sigmoidal of the sum, which completes the forward pass. To achieve the back-propagation, an error value is determined for each output from the calculated output and the target output. This sum of squared error is then propagated back through the network. During this back-propagation, the initial weights are adjusted accordingly to reduce the error. This completes the reverse pass.
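The two-pass procedure described above can be sketched for a small network with one hidden layer and a single logistic output. This is a generic textbook back-propagation step, not the authors' implementation: weights are corrected after every pattern by the learning rate times the negative error gradient, and the network size (2-6-1), the learning rate of 0.1, the function names, and the synthetic patterns are illustrative assumptions.

```python
import math
import random


def logistic(u):
    return 1.0 / (1.0 + math.exp(-u))


def forward(x, w_hid, w_out):
    """Forward pass; every weight row ends with its bias weight."""
    hid = [logistic(sum(wi * xi for wi, xi in zip(row, x)) + row[-1])
           for row in w_hid]
    out = logistic(sum(wo * h for wo, h in zip(w_out, hid)) + w_out[-1])
    return hid, out


def gradients(x, target, w_hid, w_out):
    """Backward pass: gradient of E = (target - out)**2 for one pattern."""
    hid, out = forward(x, w_hid, w_out)
    d_out = -2.0 * (target - out) * out * (1.0 - out)   # dE/du at the output node
    g_out = [d_out * h for h in hid] + [d_out]          # last entry: bias weight
    g_hid = []
    for h, wo in zip(hid, w_out):
        d_h = d_out * wo * h * (1.0 - h)                # dE/du at a hidden node
        g_hid.append([d_h * xi for xi in x] + [d_h])
    return g_hid, g_out, (target - out) ** 2


def train_step(data, w_hid, w_out, eta=0.1):
    """One sweep over all patterns, updating weights by -eta * dE/dw."""
    total = 0.0
    for x, target in data:
        g_hid, g_out, err = gradients(x, target, w_hid, w_out)
        total += err
        for row, g_row in zip(w_hid, g_hid):
            for j, g in enumerate(g_row):
                row[j] -= eta * g
        for j, g in enumerate(g_out):
            w_out[j] -= eta * g
    return total


# Random initial weights for a 2-6-1 network and synthetic patterns (assumptions).
rng = random.Random(0)
w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(6)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(7)]
data = [((t, p / 10.0), 1.0 - (0.3 + 0.2 * t) * (1.0 - p / 10.0) ** 2)
        for t in (0.0, 1.0) for p in range(1, 10)]
errors = [train_step(data, w_hid, w_out, eta=0.1) for _ in range(200)]
print(errors[0], errors[-1])
```

Each sweep needs one forward and one backward computation per pattern, which is cheap per iteration; the cost of this approach lies in the large number of sweeps that a fixed small learning rate typically requires.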

Back-propagation uses a mathematical model of the network to estimate the weight vector gradient. Connection weights are changed by values proportional to the following error gradient:

$$\Delta w = -\eta \frac{\partial E}{\partial w} \tag{6}$$

where Δw is the change to the weight matrix, η is the learning rate, and E is the sum of the squared errors. While back-propagation uses an analytical equation to govern the entire training process, Madaline III, a network of multiple adalines, uses a more empirical approach (Andes et al., 1990). This approach reduces the system error at each instant in time. Thus, when an input is presented to the network, the resulting output is compared to a target using a sum of the squared error. This approach uses a perturbation method on each neuron's summation of inputs to determine the appropriate connection weight. A function called "perturb" is used in this algorithm. Perturb controls the size of the perturbation added to the pre-neuron sum of each neuron. The effect of this added value is propagated through the network. If the network error is reduced as a result of this change, then the perturbation is accepted, and the connection weights are changed. If the error is increased by the perturbation, then a change in the opposite sense is made. The gradient can be used to optimize the weight vector according to the method of steepest descent:

(7)

Here, Δs is the perturb, μ is a parameter which controls stability and the rate of convergence, and $\Delta(\epsilon_k)^2/\Delta s$ is the sample derivative. To some extent, the sample derivative $\Delta(\epsilon_k)^2/\Delta s$ in Madaline III is analogous to the analytical derivative $\partial(\epsilon_k)^2/\partial s_k$ used in back-propagation. Hence, these two training rules follow a similar instantaneous gradient and, thus, perform nearly identical weight updates. However, the back-propagation algorithm requires fewer operations to calculate gradients than does Madaline III, since it is able to take advantage of a priori knowledge of the sigmoidal nonlinearities and their derivative functions. Conversely, the Madaline III algorithm uses no prior knowledge about the characteristics of the sigmoid functions. Instead, it acquires instantaneous gradients from perturbation measurements.

Results and Discussion

Application of ANN for Predicting Polymer Activities in Polymer/Solvent Binaries. A list of the systems investigated in this work is given in Tables 1-3. These systems represent four nonpolar binaries, eight nonpolar/polar binaries, and nine polar binaries. These binary systems were recognized by Heil and Prausnitz (1966) as the ones which had been well-studied for a wide range of concentrations. With well-documented behavior, they represent a severe test for any proposed model. The experimental data used in this work have been obtained from Alessandro (1987). Alessandro's available experimental data were arbitrarily divided into two data sets: one for use in training the proposed neural network model and the remainder for validating the trained network.

Table 1. Experimental Solvent Activities in Nonpolar Binaries (a)

cyclohexane/polystyrene (PM = 25.900)
φ1: 0.5150 0.3630 0.3270 0.2340 0.1990 0.1960 0.1290
a1: 0.9980 0.9670 0.9696 0.8947 0.8480 0.8442 0.7651

cyclohexane/polystyrene (PM = 440.000)
φ1: 0.5170 0.5150 0.4320 0.3390 0.3290 0.2420 0.1980 0.1720 0.1410
a1: 0.6784 0.7479 0.7910 0.8466 0.9128 0.9183 0.9601 0.9823 0.9827

benzene/polystyrene
φ1: 0.9000 0.8005 0.8000 0.7500 0.7000 0.6500 0.6000
a1: 0.995 0.989 0.983 0.975 0.963 0.949 0.930

toluene/polystyrene
φ1: 0.9000 0.8500 0.8000 0.7500 0.7000 0.6500 0.6000
a1: 0.995 0.989 0.983 0.965 0.963 0.949 0.930

(a) Experimental data reported in Alessandro (1987).

Neural Network Configuration. No algorithms are available for selecting a specific network structure or determining the number of hidden nodes, although Zurada (1992) has discussed several heuristic-based techniques. We used a feed-forward neural network model with a nonlinear optimization training routine in this study. The advantages of this training routine were compared with the other two training routines. The comparisons are discussed in the following sections. What evolved was a network with two input nodes: (1) the binary type and (2) the volume fraction of the polymer, φ1, in the binary. The one output node was the solvent's activity, a1.

Predictions of ANN with the Nonlinear Optimization Training Routine. To evaluate the reliability of the proposed neural network model, we trained all the binaries in each of the three widely reported categories (as shown in Tables 1-3) together in one training run. A 2-4-1 feed-forward neural network architecture was used in the training for all three categories of polymer/solvent combinations. As mentioned earlier, the 2-4-1 architecture implies 2 input nodes, namely, the binary type and the volume fraction of the polymer, φ1, in the binary. There were 4 hidden nodes and 1 output node, namely, the solvent activity, a1, in the binary. A schematic diagram of the weights updating procedure is shown in Figure 3. The neural network model training and testing results are shown in Figures 4-6 for all three categories of polymer/solvent binaries. As shown, the predictions by the neural network model gave extremely good agreement with the experimental data. The average sum of the squared error for all three binaries was 8.0 × 10⁻⁴. From the tables and these figures, it can be concluded that the pure binaries exhibited higher activity than did the polar/nonpolar binaries.

Predictions of ANN with the Back-Propagation and Madaline III Training Routines. A 2-6-1 neural network architecture was needed for both of these training routines. Two additional hidden nodes were required with the gradient search based procedures. Since these training routines are extremely slow compared to the nonlinear optimization routine that was used, the error limits on the sum of squared error were fixed at 1 × ... for the termination conditions. The neural network model with the back-propagation routine required 29 000 iterations, while the Madaline III approach required 57 000 iterations. Compared to these two training routines, the nonlinear optimization routine required fewer than 100 iterations. The training and testing results for these two algorithms are shown in Figures 7 and 8 for the polar binaries only. A mean-square error progression during the training phase is shown in Figure 9 for the Fletcher nonlinear optimization routine and a traditional conjugate-gradient search routine. Convergence of the Fletcher routine was within 35 iterations compared to several thousand iterations required for the gradient search based back-propagation and Madaline III routines. Such comparisons of the proposed training routine were reported for predicting densities of high molecular weight esters in our earlier publication (Ramesh et al., 1995).

Table 2. Experimental Solvent Activities in Nonpolar/Polar Binaries (a)

rubber/acetone
φ1: 0.154 0.132 0.084 0.053 0.045
a1: 0.955 0.917 0.772 0.600 0.533

rubber/ethyl acetate
φ1: 0.3750 0.3030 0.2600 0.2060 0.1290 0.1200 0.0510
a1: 0.988 0.932 0.604 0.832 0.684 0.647 0.373

rubber/methyl ethyl ketone
φ1: 0.4210 0.3740 0.3240 0.2570 0.2000 0.1340 0.0900
a1: 0.990 0.973 0.955 0.603 0.838 0.709 0.577

polypropylene/diethyl ketone
φ1: 0.0070 0.0100 0.0110 0.0220 0.0260 0.0330 0.0350 0.0430 0.0700 0.1210 0.1960
a1: 0.090 0.167 0.209 0.289 0.344 0.374 0.395 0.412 0.565 0.734 0.887

polypropylene/diisopropyl ketone
φ1: 0.0150 0.0160 0.0350 0.0660 0.0970 0.1620 0.2300 0.3010 0.3540
a1: 0.134 0.149 0.236 0.421 0.521 0.699 0.812 0.876 0.923

polystyrene/acetone
φ1: 0.1099 0.2174 0.3226 0.4255
a1: 0.608 0.800 0.872 0.900

polystyrene/propyl acetate
φ1: 0.2174 0.3226 0.4255 0.5263 0.6250
a1: 0.672 0.792 0.864 0.896 0.700

polystyrene/chloroform
φ1: 0.2174 0.3226 0.4255 0.5263 0.6250 0.7216 0.8163
a1: 0.360 0.512 0.632 0.744 0.816 0.864 0.896

(a) Experimental data reported in Alessandro (1987).

Table 3. Experimental Solvent Activities in Polar Binaries (a)

poly(propylene glycol)/methanol (PM = 1955)
φ1: 0.8874 0.8315 0.6701 0.4899
a1: 0.997 0.996 0.985 0.958

poly(propylene glycol)/methanol (PM = 3350)
φ1: 0.8961 0.8087 0.7233 0.5746
a1: 0.999 0.996 0.993 0.979

poly(ethylene oxide)/chloroform
φ1: 0.7840 0.7140 0.6180 0.5200 0.4950 0.4240 0.3270 0.2010 0.1060 0.0660
a1: 0.912 0.833 0.687 0.526 0.486 0.433 0.431 0.437 0.418 0.407

cellulose acetate/acetone
φ1: 0.9409 0.9093 0.8761 0.8414 0.8049 0.7666 0.7262 0.6837 0.6388
a1: 0.997 0.995 0.993 0.991 0.989 0.987 0.985 0.981 0.977

cellulose acetate/methyl acetate
φ1: 0.9308 0.8945 0.8565 0.5992 0.7352 0.6916 0.6463 0.5992
a1: 0.998 0.998 0.996 0.994 0.992 0.990 0.987 0.984

cellulose acetate/dioxane
φ1: 0.8846 0.8440 0.8023 0.7594 0.7152 0.6698 0.6231 0.5749
a1: 0.996 0.993 0.990 0.986 0.983 0.979 0.974 0.968

cellulose acetate/pyridine
φ1: 0.8895 0.8504 0.8100 0.7683 0.7252 0.6807 0.6346
a1: 0.996 0.990 0.982 0.972 0.959 0.943 0.925

cellulose nitrate/acetone
φ1: 0.9495 0.9220 0.8930 0.8623 0.8296 0.7949 0.7579 0.7184
a1: 0.999 0.998 0.996 0.993 0.988 0.977 0.946 0.884

cellulose nitrate/methyl acetate
φ1: 0.9406 0.9088 0.8755 0.8406 0.8040 0.7655 0.7251 0.6824
a1: 0.999 0.997 0.993 0.987 0.977 0.955 0.916 0.842

(a) Experimental data reported in Alessandro (1987).

Figure 3. Schematic diagram of input/output variables in the neural network model and weights updating procedure during neural network training. [Legend: I, input layer; H, hidden layer; O, output layer; B, bias neuron.]

Figure 4. Feed-forward neural network model training and testing results for solvent activity predictions in nonpolar binaries.

Figure 5. Feed-forward neural network model training and testing results for solvent activity predictions in nonpolar/polar binaries.

Figure 6. Feed-forward neural network model training and testing results for solvent activity predictions in polar binaries.

Figure 7. Feed-forward neural network model training and testing results with back-propagation training for solvent activity predictions in polar binaries (learning parameter η = 0.1).

Figure 8. Feed-forward neural network model training and testing results with Madaline III training for solvent activity predictions in polar binaries (learning parameter η = 0.1 and perturb Δs = 0.1).

[Figures 4-8 plot the neural network model predicted activity (a1) against the experimental solvent activity, both on a scale of 0 to 1.]

Conclusions

ANNs are capable of handling complex and nonlinear problems, can process information very rapidly, and are capable of reducing the computational effort required in developing highly computation-intensive models or the time-consuming effort needed to find functional forms for empirical models. The proposed neural network model with the nonlinear optimization routine is computationally simple, and it does not require the choice of additional parameters such as η and α. Yet it is very efficient for predicting thermodynamic properties like the activity of the solvent in polymer/solvent binaries. An ANN as a predictive tool is effective only within the trained range of input variables. Those predictions which fall outside the trained range are considered questionable. Whenever experimental data are available for validation, neural nets can be put to effective use. The application of neural nets as a predictive tool for thermodynamic and other fluid properties is therefore very promising and deserves additional investigation. Investigations are underway which address prediction of vapor-liquid equilibrium in different types of systems.

Figure 9. Comparison of mean squared error progression for Fletcher nonlinear optimization and gradient search routine during neural network training. [Horizontal axis: number of iterations, 0 to 50; series: Fletcher routine and gradient search.]

Acknowledgment

We are indebted to a referee for detecting an error in the published literature data used in the original manuscript. We also extend our sincere appreciation to the other reviewers for many fruitful suggestions that helped make the proposed model an efficient approach.

Literature Cited

Alessandro, V. A simple modification of the Flory-Huggins theory for polymers in non-polar or slightly polar solvents. Fluid Phase Equilib. 1987, 34, 21-35.
Andes, B.; Widrow, B.; Lehr, M.; Wan, E. Proc. Int. Joint Conf. Neural Networks 1990, 1, 533-536.
Battiti, R. First and second-order methods for learning: between steepest descent and Newton's method. Neural Comput. 1992, 4, 141-166.
Bhat, N.; McAvoy, T. J. Use of neural nets for dynamic modeling and control of chemical process systems. Comput. Chem. Eng. 1990, 14, 573-583.
Booth, A. D. Numerical Methods; Academic: New York, 1955; pp 9, 158.
Box, G. E. P. Some notes on non-linear estimation. Statistical Technical Group Report, Princeton University, 1956.
Box, G. E. P.; Wilson, K. B. J. R. Stat. Soc. 1951, 13 (Series B, No. 1), 1-45.
Crockett, J. B.; Chernoff, H. Pac. J. Math. 1955, 5, 33-50.
Fletcher, R. AERE-R 6799; Theoretical Physics Division, Atomic Energy Research Establishment: Harwell, Berkshire, U.K., 1971.
Flory, P. J. J. Chem. Phys. 1942, 10, 51.
Flory, P. J. Principles of Polymer Chemistry; Cornell University: London, 1953.
Flory, P. J.; Ellenson, J. L.; Eichinger, B. E. Thermodynamics of mixing of n-alkanes with polyisobutylene. Macromolecules 1968, 1, 279-286.
Heil, J. F.; Prausnitz, J. M. Phase Equilibria in Polymer Solutions. AIChE J. 1966, 12, 678-685.
Kontogeorgis, G. M.; Fredenslund, A.; Tassios, D. P. Simple activity coefficient model for prediction of solvent activities in polymer solutions. Ind. Eng. Chem. Res. 1993, 32, 362-372.
Ramesh, K.; Tock, R. Wm.; Narayan, R. S.; Bartsch, R. A. Neural Network Based Model Approach For Density of High Molecular Weight Esters Used as Plasticizers. Adv. Polym. Technol. 1995, 14 (No. 3), 215-226.
Rumelhart, D. E.; McClelland, J. L. Parallel distributed processing: explorations in the microstructure of cognition. Psychological and Biological Models; MIT: Cambridge, MA, 1986; Vol. 2.
Venkatasubramanian, V.; Vaidyanathan, R.; Yamamoto, Y. Process fault detection and diagnosis using neural networks. Comput. Chem. Eng. 1990, 14, 699-712.
Watanabe, K.; Matsuura, I.; Abe, M.; Kubota, M.; Himmelblau, D. M. Incipient fault diagnosis of chemical processes via artificial neural networks. AIChE J. 1989, 35, 1803-1812.
Zurada, J. M. Introduction to Artificial Neural Systems; West: New York, 1992.

Received for review November 15, 1994
Revised manuscript received May 31, 1995
Accepted June 14, 1995 @
IE940679D

@ Abstract published in Advance ACS Abstracts, September 15, 1995.