Ind. Eng. Chem. Res. 1993, 32, 1951-1961


Predictive Control of Quality in a Batch Manufacturing Process Using Artificial Neural Network Models

Babu Joseph* and Frieda Wang Hanratty†

Chemical Engineering Department, Box 1198, Washington University, 1 Brookings Dr., St. Louis, Missouri 63130

Artificial neural networks (ANN) can be used to generate models of batch processes relating product quality to process input variables and the processing conditions used. Such models can be used in a nonlinear model predictive scheme to control product quality. Intermediate measurements taken while batch processing is in progress can provide feedback correction to compensate for modeling errors and unmeasured disturbances. This paper presents an architecture for shrinking horizon model predictive control of batch processes using ANN models, and its application to a simulated autoclave curing process for composite manufacturing. In addition, the concept of incremental learning using ANNs to provide on-line adaptation to changing processing conditions is explored.

Introduction

Quality control is an important issue for the manufacturing industry. It is particularly difficult in batch manufacturing processes, because the nonlinear dynamic nature of these processes renders linear model based control methods unsuitable. Often, batch processing is based on the concept of a fixed recipe that describes the sequence of steps taken in time to achieve a product with desired properties. Product quality variations occur because of changing raw material properties, gradual changes in processing equipment, and changes in environmental variables (such as air temperature, humidity, and cooling water temperature). Batch processes are often not well understood, in the sense that good phenomenological models may not be readily available. The operators have to rely upon secondary measurements to monitor and control the process, since the end product quality is not available until the batch is finished. Usually, an operator relies on his experience and understanding of the process to take corrective actions in the middle of a batch if he anticipates a quality problem at the end of the batch. Many batch processes are carried out in a sequence of discrete steps, and events taking place in each step have an impact on the final product yield and quality.

Recently, much has been written about statistical process control (SPC) and statistical quality control (SQC) (Klein, 1991; Levinson, 1992). These methodologies provide systematic ways of monitoring the operating status of the process. However, tools for helping the operator with his decision making process are still not fully developed. Usually, the feedback correction is left to statisticians, who must carefully analyze the operational data to determine the cause of quality deviation and then correct it at the source. This procedure assumes that the operational data base is complete enough to pinpoint the cause of the quality deviation. Lacking such data, one must often rely upon highly skilled operators or engineers to detect and correct the problem. Because of changes in operations from shift to shift and differences in the skill levels of operators, it becomes difficult to maintain consistent quality. Thus, there is strong incentive to develop control techniques that can provide automatic feedback correction for quality control.

* To whom correspondence should be addressed.
† Present address: DMC, Inc., Houston, TX.

The autoclave curing of composites is an example typical of many batch manufacturing processes. This process consists of a batch reactor in which the end product quality depends strongly on the operating recipe employed. The physical and chemical phenomena involved are complex and not yet well understood. Past experience provides a workable recipe for operation of the autoclave, and this fixed recipe is then used in production. Unfortunately, the recipe does not always work well, leading to significant economic losses from off-spec (unusable) end product. There has been a major thrust recently to improve the yield of good product through systematic process modeling and on-line feedback control techniques (Campbell et al., 1988).

In this article, we present a methodology for control of batch processes such as the autoclave curing process. The method is based on the model predictive control (MPC) architecture (Prett and Garcia, 1988). The models employed are derived by regressing past operational data using feedforward artificial neural network structures (Rumelhart et al., 1986). Such models offer the advantage of capturing the nonlinear dynamics inherent in batch processing. Together, the ANN models and the MPC algorithm offer a paradigm for imitating, at least in part, the role of skilled operators who learn from past operational history and use it to make feedback control decisions during batch processing.

Autoclave Curing of Composites

The autoclave process is used to manufacture fiber-reinforced polymer composite materials. In this process, fiber mats impregnated with unreacted thermosetting resin (called prepregs) are laid up on a tool and covered with a variety of materials, and the setup is put into a large autoclave for processing. A schematic of the layup is shown in Figure 1. By proper and timely manipulation of the pressure and temperature profiles in the autoclave, the resin is heated, melted, and allowed to flow into the desired shape. Excess resin is squeezed out of the laminate by applying pressure. By maintaining a high temperature for a period of time, the resin is cured to produce a hard, tough laminate. The quality of the final product is measured in terms of the fiber/resin ratio, thickness, the extent of resin cure, and the presence of voids in the laminate. Because of the strong interactions between the effects of the autoclave temperature, the

0888-5885/93/2632-1951$04.00/0 © 1993 American Chemical Society

Figure 1. Composite laminate cure setup. (Labeled components include the nylon vacuum bag, glass breather, caulk plate, porous release cloth, aluminum dam, and tape seal.)

Figure 2. Typical cure cycle used in the autoclave. (Plot titled "Standard Curing Cycle": temperature (K) and pressure (atm) versus time (seconds, E-3).)

autoclave pressure, and the raw material properties, it is difficult to maintain consistent quality in production.

Figure 2 shows a typical cure cycle used in the autoclave. The process can be divided into four phases as shown, namely, the first heating phase (phase 1), the first holding phase (phase 2), the second heating phase (phase 3), and the second holding phase (phase 4). The temperatures shown are the set points for the PID controller for the temperature measured at the bottom of the laminate. The pressure is another manipulated variable, and it is increased through a step function at some point during phase 2. The purpose of applying the pressure is to consolidate the fiber and squeeze out excess resin. It is important to apply the pressure when the viscosity of the resin is at its lowest point (after the resin softens and melts, but before it starts to react and cure). Unfortunately, on-line viscosity measurements are difficult to achieve, and one must rely on temperature measurements most of the time.

There are infinitely many cure cycles that can be employed. Typically, cure cycles are determined from past experience with a particular resin and fiber combination. The chosen cure cycle is then implemented using programmable logic controllers (PLCs). Operation, then, is based on fixed trajectories in time, regardless of variations in raw material properties and geometry. The operator is expected to take care of unforeseen variations. Variations in the properties of raw materials and in part geometry (i.e., the shape of the laminate being produced) would require modifications to the cure cycle. Some efforts have been made to incorporate a control system using qualitative processing knowledge in the form of rules (LeClair et al., 1987; Wu and Joseph, 1990). However, the major drawback of these "expert systems" approaches has been the acquisition of processing knowledge.

On the basis of past experience, the following variables are considered to be important in determining the quality of the final product:
(1) weight fraction of resin in the prepreg;
(2) initial temperature of the prepreg;
(3) impurities in the prepreg;
(4) the pressure applied in the autoclave;
(5) the first holding temperature;
(6) the second holding temperature;
(7) the time at which pressure is applied.
There are numerous other variables that have a correlation with the product quality, but for the work presented here, we use the above set of disturbances and manipulated variables.

Neural Network Models for Batch Processes

An approach that has evolved as a powerful computational paradigm in the past few years is artificial neural network models, or simply "neural nets" (NN). ANN models, also known as parallel distributed processing models, represent an important information abstraction in modeling intelligence in process control (Prett and Garcia, 1988). In chemical engineering, neural networks have been applied to solve problems in fault diagnosis (Hoskins and Himmelblau, 1988; Venkatasubramanian and Chan, 1989; Leonard and Kramer, 1990), sensor data interpretation (Whitely and Davis, 1990), and process control (Pao, 1988; Bhat and McAvoy, 1989; Ydstie, 1990; Psichogios and Unger, 1991). Among the many different architectures proposed for neural networks, the multilayer perceptron with the back propagation training algorithm (Rumelhart et al., 1986) has been used most frequently for modeling in process control applications. These networks have one input layer, two or more hidden layers, and an output layer. The number of neurons used in the hidden layers is one of the architectural parameters. For a summary of alternative ANN configurations and architectures that have been employed in control, see the recent book by White and Sofge (1992).

Batch Process Modeling. Batch processes are inherently dynamic systems, and in chemical engineering, batch processes tend to be nonlinear. A product made from a batch may go through processing in many units in a stepwise manner. From the point of view of quality control, the issue is to have a model that will predict the outcome of the batch (i.e., product quality) in terms of the input and processing variables. Let us focus our attention on a batch process carried out in a single processing unit (the ideas are easily extended to batch processes carried out in multiple processing units). The variables associated with the process can be categorized as follows:
(1) the initial state of the system, $x_0$;
(2) input disturbances, $d(t)$;
(3) manipulated inputs, $u(t)$;
(4) intermediate measurements, $y(t)$.
A processing model based on the physics and chemistry of the process may be written as

$$\frac{dx}{dt} = f(x(t), d(t), u(t)), \qquad x(0) = x_0$$

Here $x$ is the state variable vector and $Q$ represents the final product qualities of interest. Unfortunately, for many batch systems, such models are difficult to develop. Often, the system is not understood well enough to build a first principles model. Even when one can be built, the resulting model may be too complex to be suitable for on-line control application.
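To make the structure of such a first-principles model concrete, the following minimal sketch integrates dx/dt = f(x, d, u) from x(0) = x0 with a forward Euler step and reads off a quality measure at the end of the batch. The kinetics (a first-order cure reaction with an Arrhenius-type rate), the parameter values, the two-hold recipe, and all function names are illustrative assumptions; this is not the autoclave model used in this work.

```python
import math

def cure_rate(x, d, u, k0=5.0, ea_over_r=4000.0):
    """Hypothetical f(x, d, u): first-order cure kinetics.

    x: degree of cure in [0, 1]; d: temperature disturbance (K);
    u: temperature set point (K). All parameter values are illustrative.
    """
    temperature = u + d
    return k0 * math.exp(-ea_over_r / temperature) * (1.0 - x)

def simulate_batch(x0, recipe, disturbance, dt=1.0, t_final=6000.0):
    """Integrate dx/dt = f(x, d, u) from x(0) = x0 with forward Euler."""
    x, t = x0, 0.0
    while t < t_final:
        x += dt * cure_rate(x, disturbance(t), recipe(t))
        t += dt
    # Assumption: the quality Q is taken to be the final degree of cure.
    return x

def two_hold_recipe(t):
    """Hypothetical recipe: a first hold at 390 K, then a second at 450 K."""
    return 390.0 if t < 3000.0 else 450.0

# Same recipe, with and without a constant +5 K temperature disturbance.
final_cure = simulate_batch(0.0, two_hold_recipe, lambda t: 0.0)
hot_cure = simulate_batch(0.0, two_hold_recipe, lambda t: 5.0)
```

Even in this toy setting, a fixed recipe yields a different end quality when the disturbance changes, which is the behavior the control scheme is meant to counteract.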

Figure 3. Disturbance variable. (Sketch of a disturbance varying with time across phases 1, 2, and 3.)

For continuous process systems, one depends primarily on empirical models for control system development and implementation. For example, the widely employed model predictive control schemes use linear input/output models developed via empirical identification steps conducted on the actual plant. For batch processes, linear input/output models cannot be used because of the time-dependent nature of the process and the extreme variations in states that occur during the processing. Hence one must employ a nonlinear process model. Artificial neural networks are ideally suited for modeling nonlinear systems.

One issue that arises here is the type of ANN architecture to use. For modeling of dynamic systems, recurrent networks are usually suggested. Recurrent nets are useful if we want to follow the time trajectory of state variables. In the problem described here, the end product quality is the key variable, and the time trajectory of state variables is less important. For predicting the end product quality, we can use simpler feedforward network architectures. This simplification was crucial to obtaining a practical solution to this problem. A feedforward (multilayer perceptron) neural network model can be formulated to predict the product quality as follows:

$$\hat{Q} = H(x_0, d_1, d_2, \ldots, d_n, u_1, u_2, \ldots, u_m) \tag{2}$$

where $H$ is the abstract representation of the neural network model, $d_i$ is the disturbance value at some time during the batch ($i = 1, \ldots, n$), and $u_i$ is the manipulated input at some time during the batch ($i = 1, \ldots, m$). One approach would be to define $d_i = d(i\,\Delta t)$ and choose a suitable sampling time, $\Delta t$. However, such an approach can easily lead to a network with a large number of input nodes, especially if the batch processing time is long. Larger networks take longer to train, and hence it is desirable to reduce the size if possible. Often, such discretization in time is unnecessary, especially if the variable of interest is time-invariant for significant periods during the batch operation. For example, consider the trajectory of the pressure controller specified in the recipe for the autoclave curing process shown in Figure 2. Clearly, to characterize this path, one needs only to specify two inputs to the neural network model: the time at which the pressure is applied and the pressure applied. Similarly, one can reduce the number of parameters needed to specify the temperature trajectory. Next, consider a disturbance variable that may vary rather arbitrarily in time, as shown in Figure 3. In this case, if the variations are slow, we could employ a coarse grid in time for the disturbance variable. If it is known that the disturbance values at certain points in time during the processing are most important to (most correlated with) the product quality, then the disturbances at these points in time may be selected as the inputs to the neural network.

An important issue with many chemical processes is the presence of modeling errors and unmeasured disturbances. Unmeasured disturbances can affect the quality but are not included in the model proposed above. In model predictive control of continuous processes, the effect of modeling errors and unmeasured disturbances is accounted for by applying a feedback correction to the model prediction every time a measurement becomes available. In batch processes the quality can be measured only at the end of the batch. On the other hand, certain intermediate measurements provide an indirect way of assessing the extent of other external disturbances and modeling errors, and hence allow for the possibility of anticipating the effect of these disturbances and correcting for them.

Another issue here is the choice of measurements and disturbance variables that are used as inputs to the ANN model. A detailed study was conducted to select the variables and measurements that are most correlated with the product quality. The methodology used and the results are described in a related paper by Shieh et al. (1992). For the autoclave process the following measurements are highly correlated with the size of voids found in the composite laminate produced:
(i) temperature at the top of the laminate at the end of phase 1;
(ii) maximum temperature in the laminate during phase 2;
(iii) temperature at the top of the laminate at the end of phase 3.
These measurements are functions of the manipulated input variables, but they are also influenced by input disturbances that are not measured. Hence, by including them as inputs to the network, we have a way of incorporating a certain amount of feedback correction into the control system. Using this approach we can create another type of model for the batch process using the network

$$\hat{Q} = H(x_0, d_1, d_2, \ldots, d_n, u_1, u_2, \ldots, u_m, y_1, y_2, \ldots, y_l) \tag{3}$$

where $y_1$, $y_2$, etc., are the intermediate measurements. At the beginning of the batch, none of the disturbances $d_i$ or the measurements $y_i$ are available. After the batch has been in progress for a while, some of these measurements will become available. How can we use such a model when the measurements are not available? There are many alternative possibilities here. One could conceivably build multiple neural networks for different phases in the batch, with each network using only the measurements made prior to that phase. Another alternative is to use estimated or default values of these measurements (perhaps from some earlier batches) as inputs to the network until the actual measurements become available. The latter approach was employed in this work.

An evaluation of the capability of artificial neural networks to model the autoclave process was conducted. The work compared neural networks with quadratic regression models. The results were published in a related article (Shieh et al., 1992). The neural network models were found to be superior in terms of accuracy and tolerance to noise in the measurements.

Control Using Neural Network Models: Neurocontrol

Pao (1988), Bhat and McAvoy (1989), Psichogios and Unger (1991), and Ydstie (1990) are among many researchers who have investigated the use of ANN in process control. A recent handbook by White and Sofge (1992) and an article by Hunt et al. (1992) give good reviews of the field.

Figure 4. Architecture of supervised neurocontrol. (Signals shown: x(t), x(t-1), u*(t), u(t).)

There are four basic neurocontrol architectures that have been employed by researchers in the past. Each architecture has its advantages and disadvantages, and each is applicable in certain situations for solving certain types of problems. We briefly discuss each one below.

Supervised Control. In the supervised control architecture, the neural network is given a set of training examples in the form of an input vector of known situations and the corresponding output vector of best control actions to take in each situation. These examples may be generated by an expert operator or selected from past historical records. The network learns what control action is appropriate in each situation. Figure 4 shows the architecture. There is only one neural network in this architecture, and it does not require a separate model of the process. The disadvantage of this approach is that its performance is almost never better than that of the operators from whom the network "learned", since the performance of a neural network is only as good as its training set. There is also no feedback correction mechanism for unmeasured disturbances.

Inverse Dynamics. In inverse dynamics, the state of a process $x(t)$ is assumed to be a function of the current actions $u(t)$ and the prior state of the process $x(t-1)$. Therefore, the relationship can be written as

$$x(t) = F(u(t), x(t-1)) \tag{4}$$

It is not necessary to know what $F$ is; however, it must be assumed that $F$ is invertible as a function of $x(t)$. That is, $u(t)$ can be solved for as a function of $x(t)$, for any $x(t-1)$.

The training phase is usually distinct from the actual application of the trained network. The training set contains examples of the $x(t)$ that resulted from actual $u(t)$ that were tried out in some experiments. These examples are used to adapt the weights and thresholds of a neural network $H$ such that

$$u(t) = H(x(t), x(t-1)) \tag{5}$$

That is, $H$ is the inverse of $F$. After having been trained, the network can be applied to a process. In the application phase, the network is used in a different way. Suppose that $x^*(t)$ is the desired trajectory for the states of the process. The network $H$ takes $x^*(t)$ and $x(t-1)$ as its inputs and gives a control vector $u(t)$ as its output, which will lead the process to the desired trajectory. This approach is shown in Figure 5. Again, the dotted line indicates feedback used to adapt the weights and thresholds of the network. This approach is similar to the supervised control approach in that both are suitable for regulatory control of continuous processes.

The adaptive critic family of neurocontrollers includes all the designs that consider the problem of reinforcement learning over time. These designs include an action network and an evaluation network (see Figure 6). The action network learns to select control actions as a function of the process states. The evaluation network (also called the critic) evaluates the results produced by the action network, so as to permit adaptation of the action network. These architectures were proposed originally by Werbos (1974, 1992) and later used by Anderson (1989, 1990), Barto et al. (1983), and White and Sofge (1992). These adaptive critic approaches require on-line training. Thus it is useful

Figure 5. Architecture of inverse dynamics network: (a) training phase; (b) application phase.

Figure 6. Architecture of classical adaptive critic neurocontroller.

Figure 7. Structure of model predictive control. (Blocks: process model, control move calculations, process. Signals: predicted output trajectory, updated output trajectory, control action, current output measurement.)

if on-line training is feasible, fast, and inexpensive. In a production setting, however, such on-line training will disrupt the normal manufacturing environment and may even raise some safety concerns.

Over the past 10 years, model predictive control has become the standard procedure for control using a process model. In this architecture, an on-line model of the process is used to predict the outcome of future control actions. The future control actions are selected such that the predicted outcome is optimum in some sense (such as minimizing the square of the deviation from a set point trajectory over a finite horizon into the future, subject to process constraints). Although the methods were initially built around linear models, they can be extended to nonlinear models as well (Joseph et al., 1986, 1987). Figure 7 shows the structure of a model predictive controller. The controller stores the effect of past control actions on the controlled output. From the current measurement, it estimates the current disturbance effect. The process model is used to estimate the future trajectory given the estimate of the disturbance and the proposed control actions. The controller adjusts the control actions such that the predicted output follows a desired trajectory. Psichogios and Unger (1991) have adapted this structure to use ANN models. An application to continuous process control is also shown in the same article.

Among the four neurocontrol paradigms discussed in this section, supervised neurocontrol and inverse dynamics are both limited by the training sets available for learning and are primarily suited for continuous processes. The adaptive critic approach requires on-line training, which may upset the manufacturing process. The model predictive control strategy is capable of dealing with control situations not previously encountered. The ability to incorporate on-line measurements so that unmeasured disturbances can be taken into account is another advantage offered by this architecture. For these reasons, we selected the model predictive control strategy as the basis for building the architecture of the neurocontroller for batch processes.
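The model predictive control cycle described above (predict, correct with the latest measurement, then choose the moves that best track the set point) can be sketched for a single controller execution as follows. The scalar first-order model, its coefficients, the constant-move horizon, and the grid of candidate moves are all assumptions made for illustration; a practical controller would use the actual process model and a proper optimizer.

```python
def predict(y0, moves, a=0.8, b=0.5, bias=0.0):
    """Predicted output trajectory of a hypothetical first-order model."""
    trajectory, y = [], y0
    for u in moves:
        y = a * y + b * u + bias
        trajectory.append(y)
    return trajectory

def mpc_step(y_measured, y_predicted, set_point, horizon=5, candidates=None):
    """One controller execution: correct the model, then pick the best move."""
    if candidates is None:
        candidates = [i / 100.0 for i in range(-200, 201)]
    # Feedback correction: attribute the prediction error to a disturbance
    # that is assumed to persist over the prediction horizon.
    bias = y_measured - y_predicted

    def cost(u):
        # Squared deviation from the set point over the horizon, with the
        # same move u held constant (a simplifying assumption).
        trajectory = predict(y_measured, [u] * horizon, bias=bias)
        return sum((set_point - y) ** 2 for y in trajectory)

    return min(candidates, key=cost)

# No model mismatch: the move that holds the output at the set point.
u_nominal = mpc_step(y_measured=1.0, y_predicted=1.0, set_point=1.0)
# Measurement above prediction: the controller backs off the move.
u_corrected = mpc_step(y_measured=1.0, y_predicted=0.9, set_point=1.0)
```

The second call illustrates how an unmeasured disturbance enters only through the difference between measurement and prediction, which is the feedback mechanism that the supervised and inverse dynamics schemes lack.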

Architecture of Proposed Batch Neurocontroller

The objective of the batch neurocontroller (as we define it) is to make adjustments to the operational plan (or recipe) in order to achieve the specified product quality at the end of the batch. The intention is to nullify the effect of feed variations, unmeasured disturbances, and variations in processing equipment characteristics. Suppose that a neural network model of the batch process is available. Then the model predictive controller can be adapted to form a batch neurocontroller as follows:
(i) Model prediction: use the measurement of the initial state, past measurements, estimated values of future measurements, past control actions, and proposed future control actions to predict the expected quality of the product.
(ii) Optimization: if the predicted quality is not within the target, make adjustments to the proposed future control actions so that the predicted quality is within target and processing constraints are not violated.

Mathematically this may be stated as follows. Let $t = 0$ denote the start of the batch. At any time $t$, we can divide the measured disturbances, control actions, and process measurements each into two categories:

$$d = [d_1, d_2] \tag{6}$$

$$u = [u_1, u_2] \tag{7}$$

$$y = [y_1, y_2] \tag{8}$$

where $d_1$, $u_1$, and $y_1$ represent actual values of past measured disturbances, control actions, and process measurements. $d_2$ and $y_2$ are estimated values of future disturbances and process measurements. (In the absence of any data these may be set equal to their default values obtained from the last batch completed or to the expected mean value of these measurements.) $u_2$ are the as yet undetermined values of future control actions. Let

$$\hat{Q} = H[x_0, d_1, d_2, u_1, u_2, y_1, y_2] \tag{9}$$

represent the ANN model of the process. Then the control problem can be stated as

$$\min_{u_2} \; (Q_d - \hat{Q})^2 \tag{10}$$

subject to

$$\hat{Q} = H[x_0, \ldots]$$

where $Q_d$ denotes the desired (target) product quality.

If there are constraints present, these may be added to the optimization problem. Note that the optimization problem is constantly changing during the batch because of the arrival of additional measurements and because of the decreasing length of the vector u2. (As time progresses fewer variables

Figure 8. Structure of the batch neurocontroller.

can be changed to affect the quality outcome of the batch.) Thus we have a prediction horizon that is shrinking with time as the batch progresses. For this reason we have coined the term shrinking horizon model predictive control to describe this control strategy. From a practical standpoint it is convenient to divide the total batch cycle time into phases and then apply the control algorithm once in each phase. Figure 8 shows the schematic of this proposed batch neurocontroller.
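The shrinking horizon idea can be sketched as a loop that is executed once per phase: past moves and measurements are frozen, future measurements are filled with default estimates, and only the remaining moves are re-optimized. In the sketch below, the linear `quality_model` is a hypothetical stand-in for the trained ANN, the exhaustive grid search stands in for the optimizer, and the three-phase layout, weights, and names are assumptions chosen for illustration only.

```python
import itertools

WEIGHTS = [0.5, 0.3, 0.2]  # hypothetical sensitivities of quality to each move

def quality_model(x0, u, y):
    """Hypothetical stand-in for the trained ANN model H[x0, u, y]."""
    return x0 + sum(w * ui for w, ui in zip(WEIGHTS, u)) + 0.4 * sum(y)

def shrinking_horizon_control(x0, target, measure, y_default, grid):
    """Re-optimize the not-yet-implemented moves once per phase."""
    n_phases = len(y_default)
    u_done, y_seen = [], []
    for phase in range(n_phases):
        # Past values are fixed; future measurements use default estimates.
        y_est = y_seen + list(y_default[phase:])
        n_free = n_phases - phase  # the horizon shrinks with every phase
        best = min(
            itertools.product(grid, repeat=n_free),
            key=lambda u_future: (target - quality_model(
                x0, u_done + list(u_future), y_est)) ** 2,
        )
        u_done.append(best[0])                 # implement only the current move
        y_seen.append(measure(phase, u_done))  # measurement becomes available
    return u_done, y_seen

# Illustrative run: an unmeasured disturbance shifts the first measurement
# away from its default value of 0.0.
grid = [i / 10 for i in range(11)]
recipe, measurements = shrinking_horizon_control(
    x0=1.0, target=1.6,
    measure=lambda phase, u: 0.25 if phase == 0 else 0.0,
    y_default=[0.0, 0.0, 0.0], grid=grid)
achieved = quality_model(1.0, recipe, measurements)
```

In this run the first move is planned with default measurements, and the later moves absorb the deviation revealed by the first measurement. A disturbance that shows up only in the last measurement could no longer be corrected, which reflects the shrinking set of decision variables.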

Application to Autoclave Curing

In this application study, an autoclave simulation model developed by Wu (1989) was used to evaluate the performance of two model predictive neurocontroller architectures (on-line and off-line neurocontrollers). The control objectives in the autoclave processing are 2-fold:
(i) keep the final product thickness of the cured laminate as close as possible to the specified target thickness;
(ii) keep the maximum void size in the cured laminate below a specified maximum value.
The architectures of the on-line and off-line batch neurocontrollers are described next.

Off-Line Neurocontrol. Off-line neurocontrol is "off-line" in the sense that the control strategy is determined before the cure is started. In this implementation, the neural network model was generated using only the initial state, input disturbances (measured at time t = 0), and manipulated inputs as follows:

$$\hat{Q} = H[x_0, u] \tag{11}$$

Thus there were no intermediate secondary measurements incorporated into the model. Such a model does not require on-line feedback measurements. It is exercised only once, at the start of the batch. It predicts the optimum recipe (control inputs) for the given input conditions. No change is made in the recipe in the middle of the batch. The architecture of the off-line ANN model is shown in Figure 9. In this model the three manipulated inputs are the applied pressure, the first holding temperature, and the second holding temperature (see Figure 2). The other inputs indicate the initial state of the system and the disturbance variable.

The off-line control problem, with the laminate thickness $th$ as the controlled quality in Eq. (10), is

$$\min_{P_r,\,T_1,\,T_2} \; (th_{\mathrm{target}} - th)^2 \tag{12}$$

subject to

$$V_m < 0.001 \quad (V_m = \text{maximum void size})$$

$$360 \le T_1 \le 410$$

$$0 \le P_r \le 6$$

$$420 \le T_2 \le 460$$

$$[V_m, th] = H[W_t, T_0, Imp, P_r, T_1, T_2]$$

This optimization problem was solved using a standard reduced gradient optimization package. The optimization was performed at the beginning of each batch to determine the best cure cycle for that batch. The resulting cure cycle was then run using the autoclave simulator to generate the result for that batch.

On-Line Neurocontrol. On-line neurocontrol is "on-line" in the sense that the cure cycle is updated continuously using on-line measurements from the batch. In this case, the ANN model incorporates intermediate measurements:

$$\hat{Q} = H[x_0, u, y] \tag{13}$$

The batch processing cycle is divided into phases as shown earlier in Figure 2. In each phase, the control inputs for the remaining phases are calculated using the procedure described above. The measurements available from the prior phases are used as inputs to the ANN model. Estimates of measurements are used if the measurements are not yet available. In the application study reported here these estimates were obtained from a batch run using the standard cure cycle. The on-line implementation is more complex but has the ability to correct for unmeasured inputs and modeling errors.

The architecture of the on-line ANN model is shown in Figure 10. This network has four additional input nodes:
(1) $t_p$, the time at which pressure is applied;
(2) $T_{1end}$, the temperature at the top of the laminate at the end of phase 1;
(3) $T_{2max}$, the maximum temperature in the laminate during phase 2;
(4) $T_{3end}$, the temperature at the top of the laminate 600 s before the end of phase 3.
The first is an additional manipulated variable, since it alters the cure cycle used. The remaining three are intermediate measurements taken while the batch is in progress. These variables were selected through a combination of engineering knowledge about the system, sensitivity to input disturbances, and correlation with the final product properties. The on-line control problem at the beginning of the batch is stated as

$$\min_{P_r,\,t_p,\,T_1,\,T_2} \; (th_{\mathrm{target}} - th)^2$$

subject to

$$V_m \le 0.001$$

$$0 \le P_r \le 6$$

$$2000 \le t_p \le 6000$$

$$360 \le T_1 \le 410$$

$$420 \le T_2 \le 460$$

Since $T_{1end}$, $T_{2max}$, and $T_{3end}$ are not available at the beginning of the batch, default values were substituted. These default values were obtained from the results of a nominal batch run. At the end of phase 1 the optimization problem was run again using the measured value of $T_{1end}$ instead of the estimated value. At the end of phase 2, the problem was solved again after removing $T_1$ as an independent variable. A similar pattern was repeated at the end of each phase.
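The sequence of re-optimizations described above can be sketched as a bounded search that is repeated with fewer free variables and updated measurements. The bounds below are the ones quoted for the recipe variables; everything else is an assumption for illustration: `surrogate` is a made-up algebraic stand-in for the on-line ANN model (only the T1end measurement is included), the coarse grid search replaces the reduced gradient package, and the numerical values of the measurements are invented.

```python
import itertools

BOUNDS = {"Pr": (0.0, 6.0), "tp": (2000.0, 6000.0),
          "T1": (360.0, 410.0), "T2": (420.0, 460.0)}

def surrogate(recipe, t1_end):
    """Hypothetical stand-in for the on-line ANN: returns (void, thickness)."""
    thickness = (1.90 - 0.03 * recipe["Pr"]
                 - 0.001 * (recipe["T2"] - 420.0)
                 - 0.0005 * (recipe["T1"] - 360.0)
                 + 0.002 * (t1_end - 380.0))
    void = max(0.0, 0.004 - 0.001 * recipe["Pr"]
               + 1e-7 * (recipe["tp"] - 2000.0))
    return void, thickness

def optimise(free, fixed, t1_end, target=1.68, v_max=0.001, steps=6):
    """Grid search over the free variables inside their bounds."""
    axes = []
    for name in free:
        lo, hi = BOUNDS[name]
        axes.append([lo + (hi - lo) * i / steps for i in range(steps + 1)])
    best, best_cost = None, float("inf")
    for values in itertools.product(*axes):
        recipe = dict(fixed, **dict(zip(free, values)))
        void, thickness = surrogate(recipe, t1_end)
        if void > v_max:
            continue  # reject recipes that violate the void-size constraint
        cost = (target - thickness) ** 2
        if cost < best_cost:
            best, best_cost = recipe, cost
    return best

# Start of batch: all four variables free, default value for T1end.
plan0 = optimise(["Pr", "tp", "T1", "T2"], {}, t1_end=380.0)
# End of phase 1: same problem, but with the (invented) measured T1end.
plan1 = optimise(["Pr", "tp", "T1", "T2"], {}, t1_end=385.0)
# End of phase 2: T1 is no longer an independent variable.
plan2 = optimise(["Pr", "tp", "T2"], {"T1": plan1["T1"]}, t1_end=385.0)
```

Treating the void-size limit as a hard feasibility check and the thickness deviation as the cost mirrors the two control objectives; a gradient-based solver would handle the same constraints without the coarse grid.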

Results The current manufacturing procedure of the autoclave curing process is to follow a standard cure cycle, which does not take into account the different disturbances affecting each individual run. The neural network model, in either the off-line neurocontroller or the on-line neurocontroller, takes into account some of the disturbances within the model. For example, both network models ahve the raw material properties as an input, so the neural networks can predict the outcome when these properties change. However, the neural network models do not include every possible disturbance. In this section, performance of the neurocontrolled in situations with both measured and unmeasured disturbances will be discussed. The study was conducted as follows: (i) First two different ANN models were trained, one using no intermediate measurements and the second using intermediate measurements. The training set consisted of data from test runs made using the standard cure cycle. Details of the modeling results are given in the related publication by Shieh et al. (1992). (ii) New batches were run in three modes: (a) using a standard cure cycle; (b) using a cure cycle determined by the off-line neurocontroller; (c) using a cure cycle that is determined by the on-line neurocontroller. In this case the cure cycle is modified in each phase of the batch operation as identified in Figure 2. Performance with Measured Disturbances. To investigate the performance of the neurocontroller with the presence of measured disturbances, 37 sets of raw material properties were generated. Each set contains three values, one for each of the three properties. Values for each property are randomly generated within the feasible range for that property. The 37 sets of raw material properties are used as 37 cases to evaluate the performance of the neurocontroller when applied to the simulated autoclave curing process. 
The performance of the neurocontroller is evaluated by how well it controls the quality of the cured composite laminate. The goal is to keep the thickness of the laminate at 1.68 cm, and the maximum void size in the laminate below 0.001 cm in diameter. The results are shown in Figures 11 and 12. Figure 11 shows a comparison of the product thickness distribution using the standard cure cycle, the off-line neurocontroller alone, and the off-line followed by the on-line neurocontroller. Each point represents the resulting laminate thickness for one set of raw material properties. The mean and standard deviation of the product thickness for the 37 cases are also reported on the


[Figure 11. Comparison of product thickness distribution for batch runs with measured disturbances only. Panels: Standard Curing Cycle; With Off-line Neurocontroller; With On-line Neurocontroller. Axes: thickness (cm) versus batch number.]

[Figure 12. Comparison of maximum void size distribution for batch runs with measured disturbances only. Panels: Standard Curing Cycle; With Off-line Neurocontroller; With On-line Neurocontroller. Axes: maximum void size (cm) versus batch number.]

figure for each of the three cases. Figure 12 shows a similar comparison of the maximum void size distribution. It can be seen from Figure 11 that the laminate thickness obtained by using the standard cure cycle can deviate significantly from the target value. The off-line neurocontroller by itself greatly reduces the deviation, but the results are even better with the on-line neurocontroller. It is interesting to note that the only point that differs significantly from the target value in the graph with the on-line neurocontroller is the thickness for case 18. This thickness also deviates significantly from the target value in the other two graphs (with the standard cure cycle and with the off-line neurocontroller alone). This suggests that for this set of raw material properties, the conditions required to maintain the product thickness are significantly different from those used to obtain the model.

Figure 12 shows that the maximum void size in the laminate cured by the standard cure cycle can also deviate significantly from the target value. Both the off-line neurocontroller and the on-line neurocontroller reduce the deviation significantly. It is interesting to note that for case 29, the performance of the on-line neurocontroller is worse than the performance of the off-line neurocontroller. Possibly the error in prediction for this case was smaller when using the off-line model. When using empirical models such as ANN models it is possible that the off-line model may occasionally outperform the on-line model, but the selection should be based on the average performance rather than isolated incidents.

The reader may wonder why the error in thickness is always positive. This is because the set point for the thickness is set close to the lowest possible value for this laminate (reached when the fibers are compacted to the point of touching each other). Thus, all disturbances tended to give product thicknesses greater than the set point.

Figure 13 shows a comparison of the standard cure cycle and the cycles determined by the neurocontroller for case 16. The thickness and maximum void size values obtained from these cure cycles are also reported therein. The figure shows that the off-line neurocontroller lowers the product thickness and the maximum void size by applying a higher autoclave pressure, and the on-line neurocontroller lowers the two properties by applying a higher pressure earlier during the process and by significantly lowering the cure temperatures used. Figure 14 shows a series of the cure cycles issued by the on-line neurocontroller throughout the run for one of the batch runs. It is used as an example to show that the on-line neurocontroller revises the cure strategy at the end of each phase using newly available data. The new optimal cycle may or may not be the same as the one issued previously. In Figure 14, the cycle issued at the end of phase 1 reflects the changes introduced after T1end


[Figure 13. Comparison of cure cycles for batch 6 (with unmeasured disturbance). Panels: The Standard Cure Cycle; Cure Cycle Determined by the Off-line Neurocontroller; Cure Cycle Determined by the On-line Neurocontroller. Axes: temperature (K) and pressure versus time (seconds).]

[Figure 14. Cure cycles computed by the on-line neurocontroller for batch 16 (with unmeasured disturbance). Panels: Pressure Cycles; Temperature Cycles. Horizontal axis: time (seconds).]

became available, the cycle issued at the end of phase 2 reflects the availability of T2max, and the cycle issued at the end of phase 3 follows the availability of T3end. Here, T1end, T2max, and T3end are intermediate measurements. In this example, the optimal cure cycles do change throughout the run as a result of the intermediate measurements.

Performance in Presence of Unmeasured Disturbances. In addition to the measured disturbances mentioned in the last subsection, there are other disturbances that may also affect the autoclave curing process. These include raw material properties that are not taken into account in the models, failed or inaccurate sensors, errors in the adjustment of the manipulated variables, heat loss to the environment, etc. These disturbances are not considered in the neural network models, so they are called unmeasured disturbances. In this subsection, the performance of the neurocontroller is investigated in the presence of one type of unmeasured disturbance: additional heat input to the process. This is done by increasing the actual heating rate used by a fixed percentage during the two heating phases of the cure cycle. Two levels of disturbances are considered, but results for only one level are shown here. In the operation of a real process, the additional heat input can be caused by malfunction of the heating elements or a change in the heat of reaction of the resin used. Ten of the 37 cases of the raw material properties generated in the last subsection are used to evaluate the performance of the neurocontroller in the presence of additional heat input. The performance is evaluated by how well it controls the quality of the product. The goal is again to keep the thickness of the laminate at 1.68 cm, and the maximum void size in the laminate at zero. The results are shown in Figure 15, which shows a comparison of the maximum void size distribution with the three different control systems. The figure shows that the product quality obtained by using the standard cure cycle deviates very significantly from the target values when the process is subject to 20% additional heat input. The off-line neurocontroller by itself reduces the deviation only slightly. However, with the on-line neurocontroller, the deviation is greatly reduced. The reason for the improvement is that the on-line neurocontroller issues new cure cycles throughout each run. A new cycle is issued whenever an intermediate measurement is obtained. The neural network model in the off-line neurocontroller is no longer accurate when 20% more heat is added to the process. Therefore, without the on-line detection and corrective action, this neurocontroller cannot perform well.

A comparison of the standard cure cycle and the cycles determined by the neurocontroller for case 6 with 20% additional heat input shows that, when compared with the standard cycle, the off-line neurocontroller suggests a cure cycle with slightly higher first holding temperature, higher applied pressure, and lower second holding temperature, while the on-line neurocontroller suggests application of an even higher pressure earlier during the process and a lower second holding temperature. It is interesting to note that, unaware of the additional heat input, the off-line neurocontroller suggests a cure cycle with a higher first holding temperature. As a result the performance of the off-line neurocontroller is worse than that of the standard cure cycle in this case. Figure 14 shows a series of the cure cycles issued by the on-line neurocontroller for case 6, when it is subject to 20% additional heat input. It can be seen that at the beginning of the run, the issued cure cycle is not much different from the standard cure cycle. However, the cure cycle issued at the end of phase 1 is quite different from the previous one. This cycle suggests a much higher autoclave pressure being applied earlier during the run. The changes are issued because the neurocontroller detects a higher than expected process temperature from the available intermediate measurement, so it takes actions to cope with this situation. If the laminate temperature at the end of phase 1 is higher than expected, the minimum


[Figure 15. Comparison of maximum void size distribution for batches with an unmeasured disturbance of 20% additional heat input. Panels: Standard Cure Cycle; Using Off-line Neurocontroller; Using On-line Neurocontroller. Horizontal axis: batch number.]

resin viscosity will occur earlier during the run. Therefore, it is reasonable to apply a higher pressure at an earlier time to control the thickness and to keep the void size small. The cure cycles issued at the end of phase 2 and phase 3 suggest a lower second holding temperature.

The results show that for cases with measured disturbances only, the on-line and off-line neurocontrollers both perform very well; however, the off-line neurocontroller by itself performs poorly when unmeasured disturbances exist. The on-line neurocontroller, on the other hand, can handle both types of disturbances very well. This finding can serve as one criterion for deciding whether an on-line neurocontroller is necessary to control the process. For a process with negligible unmeasured disturbances, the off-line neurocontroller by itself is sufficient to control the process. However, if the process is subject to a considerable amount of unmeasured disturbances, the on-line neurocontroller must be implemented to get good control results.

As mentioned earlier, the on-line neurocontroller is activated throughout a run whenever new intermediate measurements are obtained. The intermediate measurements reflect the current state of the process, so the controller can detect any deviation and take corrective actions. These measurements need to be selected carefully so that they correctly represent the process state. In this application, the feature selection conducted by Shieh (1992) was used to select the influential raw material properties. Other techniques that may help preprocess the operational data include principal component analysis and canonical analysis.
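The comparisons in this section rest on the mean and standard deviation of the product quality over a set of batches. A minimal sketch of such a summary follows, assuming the sample standard deviation is meant and using the mean absolute deviation from the set point as a hypothetical selection score. The thickness values in `results` are made-up placeholders for illustration, not data from the study.

```python
from statistics import mean, stdev

TARGET = 1.68  # cm, thickness set point quoted in the paper


def summarize(thicknesses, target=TARGET):
    """Mean, sample standard deviation, and mean absolute deviation from target."""
    return {
        "mean": mean(thicknesses),
        "sd": stdev(thicknesses),
        "mean_abs_dev": mean(abs(x - target) for x in thicknesses),
    }


def best_mode(results, target=TARGET):
    """Pick the control mode with the smallest average deviation from target."""
    return min(results, key=lambda m: summarize(results[m], target)["mean_abs_dev"])


# Placeholder values for illustration only; NOT results from the paper.
results = {
    "standard": [1.75, 1.80, 1.70, 1.85],
    "off_line": [1.70, 1.69, 1.72, 1.68],
    "on_line": [1.68, 1.69, 1.68, 1.70],
}
chosen = best_mode(results)
```

Scoring on the average over all batches, rather than on any single case, matches the earlier remark that an occasional better result from the off-line model should not drive the choice of controller.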

Adaptive Learning One important characteristic of neural networks that makes them a good candidate for process control applications is adaptive learning. In most articles, adaptive learning refers to retraining the neural network with a new set of examples without changing the architecture of the network. For process control, this type of adaptive learning is useful only if there is a detected change in the process, and the new training set is available. However, most of the time, the changes are gradual and hard to detect. For example, catalyst decays, equipment ages, sensors degrade, etc. The gradual changes can be incorporated only into a neural network model through continuous updating of the weights. This type of learning is

called incremental learning. Articles on incremental learning are rather limited in the literature. Related work includes papers by Dickinson (1989), Huang and Huang (1990), and Stevenson et al. (1990). In this section we investigate the capability of the ANN model to adapt to changes in the processing conditions. Two important factors that affect learning will first be discussed. They are the forgetting factor and the updating algorithm for the training set. The forgetting factor determines how fast a network model should forget old data. A neural network model forgets old data through retraining with an updated training set in which some old data is replaced with new data. Once the forgetting factor is set, one needs to determine in what manner the old data should be replaced, that is, to set criteria for selecting the old data appropriate for replacement. After discussing the two factors, the adaptation ability of a multilayer perceptron is evaluated in the context of the autoclave curing process.

The Forgetting Factor. The "forgetting factor" method is widely used in different adaptive methods including adaptive control, adaptive filtering, adaptive signal processing, etc. In this approach, past process measurements are given an exponentially decreasing weight as specified by the equation

$$\beta(t,k) = \lambda(t)^{t-k} \tag{15}$$

where $\lambda$ is the forgetting factor, $\beta$ is the forgetting profile, t is the current sample number, and k = t - 1, t - 2, t - 3, t - 4, t - 5, etc., are previous sample numbers. More details regarding adaptive methods, forgetting profiles, and forgetting factors can be found in the book by Ljung (1987). The smaller $\lambda$ is, the more the old data will be discounted. For a continuous process that changes gradually, the most common choice is to take a constant forgetting factor, that is, $\lambda(t) = \lambda$ (Ljung, 1987). A typical choice of $\lambda$ is in the range between 0.98 and 0.995.
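For a constant forgetting factor, the weights of eq 15 can be tabulated directly. The short sketch below does so, using hypothetical function names. It also reports the commonly quoted memory horizon 1/(1 - λ), which is roughly the number of past samples that still carry appreciable weight.

```python
def forgetting_profile(lam, t, depth):
    """Weights beta(t, k) = lam ** (t - k) for k = t - 1, ..., t - depth."""
    return [lam ** (t - k) for k in range(t - 1, t - depth - 1, -1)]


def memory_horizon(lam):
    """Approximate number of samples remembered for a constant factor lam."""
    return 1.0 / (1.0 - lam)


weights = forgetting_profile(0.5, t=10, depth=3)  # exaggerated lam for clarity
horizons = {lam: memory_horizon(lam) for lam in (0.98, 0.995)}
```

With the typical range quoted above, the horizon runs from about 50 samples (λ = 0.98) to about 200 samples (λ = 0.995). For a batch process a sample corresponds to a whole batch, which is one reason the choice of λ needs separate consideration.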
Experience with continuous processes can serve as a guideline for the control of batch processes, but some work still needs to be done to ensure a good choice of $\lambda$ for the particular batch process being controlled.

Updating the Training Set. After the forgetting factor is set, one needs to determine the manner in which the training set is updated. This is done by choosing a criterion for selecting the old data in the training set to be replaced


with new data. There are a few ways to update the training set, and they are described below:

(1) Randomly replace a part of the training set with new data. This is the easiest way to update a training set. This type of updating does not consider any characteristics of the data including their age, which may be an important factor for incremental learning. Therefore, random updating is not very efficient for most processes.

(2) Replace the oldest data in the training set with new data. When new data become available, simply add the data into the training set and delete the oldest data from the set. This type of updating takes into account the age of the data, and keeps the training set containing the most current data. The disadvantage with this method is that it does not consider the completeness and global nature of the training set.

(3) Replace the oldest data that have similar operating conditions as the new data. The training set is divided into a number of regions, with each region containing data that cover a certain range of the operating conditions. When new data become available, add the data into the region that covers the operating conditions of the data and delete the oldest data in the same region. This updating method keeps the training set current as well as complete and representative.

Selecting a suitable updating strategy is an important step toward successful incremental learning. One of the three methods mentioned above can be used to update a training set, or one may create an updating strategy that is suitable for the process being controlled.

A Case Study. The autoclave simulator described earlier is presented with input prepreg material with properties drastically different from those used to generate data for the ANN model. Because of the significant changes in raw material properties, the ANN model is no longer accurate enough to describe the process, and the on-line neurocontroller fails.
After 10 failed runs, the neurocontroller was turned off and an additional 10 runs were made in open loop to generate data for updating the ANN model. This is very important since otherwise we would have only correlated input data, which is not desirable for identification purposes. These 20 additional batch runs were added to the existing training set and the network was retrained (the previously calculated network weights were used as the initial conditions for the retraining). The oldest 20 batch runs in the training set were deleted. The more recent data were given more weight by presenting them more often during the ANN training session. All the choices of parameters in this case study are rather arbitrary, and one can conceivably fine-tune these parameters. We did not investigate the sensitivity to these factors in this study. After a new ANN model was trained using the 20 additional batch runs, the neurocontroller was turned on again and additional batches were run. Figure 16 shows that the changes in raw material properties have rendered the old neurocontroller incapable of maintaining the desired thickness. Control improves after updating with 20 batch runs but is still not satisfactory. It takes 40 new batch runs (10 with no controller, 30 with the updated neurocontroller in place) before the ANN model becomes sufficiently accurate to control the process with the new raw materials. The use of some closed-loop data for training also has an impact, but its extent is not clear from this study. Nevertheless, the results indicate that adaptive learning is feasible and can bring the process back under control when the processing characteristics change drastically.
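The three updating strategies listed earlier in this section can be sketched as below. This is an illustration only: records are assumed to be stored oldest first, the `region_of` callback that maps a record to an operating region is a hypothetical helper, and none of the function names come from the paper.

```python
import random


def replace_random(training_set, new_records, rng):
    """Strategy 1: overwrite randomly chosen records (age is ignored).

    Assumes len(new_records) <= len(training_set).
    """
    updated = list(training_set)
    slots = rng.sample(range(len(updated)), len(new_records))
    for slot, rec in zip(slots, new_records):
        updated[slot] = rec
    return updated


def replace_oldest(training_set, new_records):
    """Strategy 2: append the new records and drop as many of the oldest."""
    combined = list(training_set) + list(new_records)
    return combined[len(new_records):]


def replace_oldest_in_region(training_set, new_records, region_of):
    """Strategy 3: for each new record, drop the oldest record in its region.

    If no stored record shares the region, nothing is dropped and the
    training set grows by one.
    """
    updated = list(training_set)
    for rec in new_records:
        region = region_of(rec)
        for i, old in enumerate(updated):
            if region_of(old) == region:
                del updated[i]
                break
        updated.append(rec)
    return updated


# Toy records: "x" is a single operating condition; regions are 10 units wide.
old = [{"id": 0, "x": 5}, {"id": 1, "x": 15}, {"id": 2, "x": 7}]
new = [{"id": 3, "x": 12}]
by_age = replace_oldest(old, new)
by_region = replace_oldest_in_region(old, new, lambda r: r["x"] // 10)
by_chance = replace_random(old, new, random.Random(0))
```

In the toy data the age-based rule discards record 0 and leaves only one record in the low-`x` region, whereas the region-based rule discards record 1, so coverage of both regions is preserved.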

[Figure 16. Panels: With No Updates; Updated with 20 Records; Updated with 30 Records; Updated with 40 Records. Horizontal axis: batch number.]

Discussion. Adaptive learning is a fertile area for further study. There are many unanswered questions. We were surprised that only 10 additional batch runs with no control, along with the closed-loop runs, were sufficient to retrain the network. Perhaps the nonlinearity in the process worked to our advantage here. It would be important to establish some criteria for determining the forgetting profile. Also, some guidelines need to be established for the updating algorithm used.

Conclusion Artificial neural networks offer a convenient platform for capturing the nonlinearity and dynamics inherent in batch processing models. In conjunction with nonlinear model predictive control, such ANN models can be effectively used for on-line feedback control of product quality. The combination of ANN models, a shrinking horizon model predictive control algorithm, and incremental learning strategies offers a convenient paradigm for imitating the action of skilled operators who learn to control a batch process from past operational experience. The proposed architecture of an on-line neurocontroller is quite effective in controlling the thickness and void size of product laminates in a simulated autoclave curing process. Through incremental updating of the neural network, the neurocontroller can adapt to wide variations in the processing characteristics. The methodology presented should be of interest to practicing engineers interested in reducing the batch-to-batch variations of product quality. The bottleneck in the deployment of such neurocontrollers is the process identification step. Process identification requires operational data that span the region of interest and are as complete as possible (meaning that sufficient random variations in all key input variables of consequence should be incorporated in the identification procedure). If it is difficult to generate such data using the process, then part of the data may be generated from process simulation models, if available.


Acknowledgment This work was performed with partial support provided by Air Force subcontract F33615-88-C-5455 from Wright Patterson Air Force Base through McDonnell-Douglas and through an NSF Grant DDM-9123861.

Nomenclature
d = measurable external disturbances entering the system
H = abstract representation of the artificial neural network
k h p = fraction of impurity present in prepreg
Pr = pressure applied to the autoclave
Q = product quality measurements
t = time
T1 = first holding temperature
T1end = temperature at top of laminate at end of phase 1
T2 = second holding temperature
T2max = maximum temperature measured at top of laminate during phase 2
tf = end-time of batch
th = laminate product thickness
tPl = time at which pressure is applied
u = manipulated input to the system
V, = maximum void diameter
wt = weight fraction of resin in prepreg
x = state of the system
x0 = initial state of the system
y = secondary measurement on the system

Greek Letters
$\beta$ = forgetting profile
$\lambda$ = forgetting factor

Literature Cited
Anderson, C. Learning to Control an Inverted Pendulum Using Neural Networks. IEEE Control Syst. Mag. 1989, 9 (3), 31-33.
Barto, A. G.; Sutton, R. S.; Anderson, C. W. Neuron-like Adaptive Elements that can Solve Difficult Learning Problems. IEEE Trans. Syst., Man Cyb. 1983, SMC-13 (5), 834-837.
Bhat, N. V.; McAvoy, T. Modeling Chemical Process Systems via Neural Computation. IEEE Control Syst. Mag. 1990, Apr, 24-29.
Campbell, F. C. Computer aided Curing of Composites. Internal Report No. AFWAL-TR-86-4060, Machine Research Laboratory, Wright Patterson AFB, Dayton, OH.
Dickinson, B. W. Group Behavior Models for Learning in Neural Networks. Proc. 28th Conf. Decision Control, 1989, Tampa, FL.
Hoskins, J. C.; Himmelblau, D. M. Artificial Neural Network Models of Knowledge Representation in Chemical Engineering. Comput. Chem. Eng. J. 1989, 35, 881-885.
Huang, S. C.; Huang, Y. F. Learning Algorithms for Perceptrons Using Backpropagation with Selective Updates. IEEE Control Syst. Mag. 1990, 10, 56-61.
Hunt, K. J.; Sbarbaro, D.; Zbikowski, R.; Gawthrop, P. J. Neural Networks for Control Systems: A Survey. Automatica 1993, 28 (6), 1083-1112.
Joseph, B.; Jang, S. S.; Mukai, H. Control of Constrained Multivariable Nonlinear Processes Using a Two-Phase Approach. Ind. Eng. Chem. Process Des. Dev. 1986, 25, 809-815.
Joseph, B.; Jang, S. S.; Mukai, H. On-line Optimization of Constrained Multivariable Non-linear Processes. AIChE J. 1987, 33, 26-30.
Klein, R. A. Achieve Total Quality Management. Chem. Eng. Prog. 1991, 87 (11), 83-86.
LeClair, S.; Lagnese, T.; Abrams, F. Qualitative Process Automation for Autoclave Curing of Composites. Report No. AFWAL-TR-87-4083; Wright Patterson Air Force Base, Dayton, OH, 1987.
Leonard, J. A.; Kramer, M. Improvement of the Backpropagation Algorithm for Training Neural Networks. Comput. Chem. Eng. J. 1990, 14, 337-341.
Levinson, W. A. Make the Most of Control Charts. Chem. Eng. Prog. 1992, 88 (3), 86-91.
Ljung, L. System Identification: Theory for the User; Prentice-Hall: Englewood Cliffs, NJ, 1987.
Loos, A. C.; Springer, G. S. Curing of Epoxy Matrix Composites. J. Compos. Mater. 1983, 17, 135.
Pao, Y. Adaptive Pattern Recognition and Neural Networks; Addison-Wesley: Reading, MA, 1989.
Prett, D. M.; Garcia, C. E. Fundamental Process Control; Butterworth: Boston, 1986.
Psichogios, D.; Unger, L. Direct and Indirect Model Based Control Using Artificial Neural Networks. Ind. Eng. Chem. Res. 1991, 30, 2564-2573.
Rumelhart, D. E.; McClelland, J. L.; the PDP Research Group. Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 1: Foundations; The MIT Press: Cambridge, MA, 1986.
Shieh, D. Knowledge Acquisition From Routine Data. D.Sc. Thesis, Washington University, St. Louis, MO, 1991.
Shieh, D.; Wang, F.; Joseph, B. Exploratory Data Analysis: A Comparison of Statistical Methods with Artificial Neural Networks. Comput. Chem. Eng. J. 1992, 16 (4), 413-423.
Stevenson, M.; Winter, R.; Widrow, B. Sensitivity of Feed forward Neural Networks to Weight Errors. IEEE Trans. Neural Networks 1990, 1, 1.
Venkatasubramanian, V.; Chan, K. A Neural Network Methodology for Process Fault Diagnosis. AIChE J. 1989, 35, 1993-2002.
Werbos, P. J. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Ph.D. Thesis, Harvard University, Cambridge, MA, 1974.
Werbos, P. J. Neurocontrol and Supervised Learning: An Overview and Evaluation. The Handbook of Intelligent Control; White, D. A., Sofge, D. A., Eds.; Van Nostrand Reinhold: New York, 1992.
White, D. A.; Sofge, D. A. The Handbook of Intelligent Control; Van Nostrand Reinhold: New York, 1992.
Whitely, J. R.; Davis, J. Application of Neural Networks to Qualitative Interpretation of Process Sensor Data. AIChE Spring Meeting, 1990, Orlando, FL.
Wu, H. T. Knowledge Based Control of Composite Manufacturing Processes. D.Sc. Thesis, Washington University, St. Louis, MO, 1990.
Wu, H. T.; Joseph, B. Model Based Control of Autoclave Curing of Composites. SAMPE J. 1990, 26 (6), 39-54.
Ydstie, B. E. Forecasting and Control Using Adaptive Connectionist Networks. Comput. Chem. Eng. 1990, 14 (4), 583-599.

Received for review February 11, 1993
Revised manuscript received June 1, 1993
Accepted June 9, 1993