State Estimation for Integrated Moving Average Processes in High-Mix

Article pubs.acs.org/IECR

State Estimation for Integrated Moving Average Processes in High-Mix Semiconductor Manufacturing

Jin Wang,*,† Q. Peter He,‡ and Thomas F. Edgar§

† Department of Chemical Engineering, Auburn University, Alabama 36849, United States
‡ Department of Chemical Engineering, Tuskegee University, Alabama 36088, United States
§ Department of Chemical Engineering, The University of Texas at Austin, Texas 78712, United States

ABSTRACT: High-mix manufacturing in the semiconductor industry has driven the development of several nonthreaded state estimation methods. These methods share information among different manufacturing contexts and avoid the data segregation that threaded methods require. However, existing nonthreaded methods consider either white noise disturbance or integrated white noise disturbance. In this work, we derive the state-space representation of the nonthreaded state estimation problem. The derivation considers an integrated moving average (IMA) disturbance, which is more realistic than white noise or integrated white noise disturbance for most semiconductor processes. In addition, the derivation considers the fact that if a context item is not involved in a process run, then its state does not change. Finally, we propose an improved nonthreaded state estimation method based on the Kalman filter. Simulation examples are given to demonstrate the performance of the proposed method, which is also compared with the existing Kalman filter approach that considers integrated white noise only. A practical modification of the improved Kalman filter is also proposed to significantly simplify the implementation while providing comparable state estimation performance.

Special Issue: David Himmelblau and Gary Powers Memorial
Received: May 15, 2013. Revised: August 30, 2013. Accepted: September 3, 2013
dx.doi.org/10.1021/ie401537d | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX
© XXXX American Chemical Society

1. INTRODUCTION

Currently, the most commonly applied control technique in the semiconductor industry is run-to-run feedback control. The basic structure of a run-to-run controller consists of a process model, a state estimator, and a control law. Because linear process models and the deadbeat control law are applied in most applications, the choice of the state estimation method can have a significant impact on the control performance. As there are many sources of process variations, such as tools, devices, and layers, the direct measurement of each individual disturbance is usually not available. To cope with this difficulty, the traditional state estimation method utilizes a “threaded” approach where the historical data are segregated into different bins based on the specific manufacturing context, such as {tool B, layer I, and product β}; then the data from the runs that have the same manufacturing context as the current run are used to estimate the state for the current run.

Threaded state estimation methods have achieved notable success, as they reduce sources of variation by partitioning historical data into different threads defined by manufacturing context. However, because many state-of-the-art fabs are operating with multiple process tools and increasingly diversified product mixes, the information segregation by threads can fail or severely degrade the performance of the threaded state estimation. This is because a narrowly defined process stream can result in too many threads and insufficient data for some threads, which also requires a large number of “send-ahead” wafers for new threads. The issue becomes significantly more pronounced in high-mix manufacturing, especially for lithography processes, where there are many sources of variations.

To address this challenge, several nonthreaded state estimation methods have been reported in the past few years.1−7 The general theme of these nonthreaded methods is the sharing of information among different contexts. Assuming that the interactions among different contexts are linear, different algorithms, such as linear regression5−7 and the Kalman filter,1,2,6 have been applied to estimate the contributions from different variation sources. Among these methods, process disturbance is modeled either as a white noise disturbance2,5,6 or as an integrated white noise disturbance.6,7 However, it has long been recognized that an integrated moving average (IMA) model provides a much better approximation of the disturbances in industrial processes.8 Many researchers who have studied industrial processes, including in the semiconductor9−16 and other industries,17−22 have used integrated moving average models to characterize process disturbances, which indicates that IMA describes many industrial disturbances well. For semiconductor processes, it is well known that EWMA control is an optimal controller for IMA disturbance. The global success of the threaded EWMA controller in the semiconductor manufacturing industry confirms that IMA disturbance describes most semiconductor process disturbances well. In addition, the authors’ experiences in the semiconductor industry show that IMA describes most semiconductor process data much better than white noise or integrated white noise sequences. Therefore, in this work we will extend our previously developed general framework6 to cover the IMA disturbance.

It is worth noting that the interactions among different contexts could be nonlinear, as indicated by the multiscale models of some semiconductor processes.23−26 However, for control applications, existing research and industrial practice show that linear approximations are often necessary and sufficient to capture the key contributions for manufacturing processes.27−33 This is due to the fact that the processes are often tightly controlled around their operation points and the fact that the manipulated variables available for real-time feedback control operate at the macroscopic length scale (e.g., the power to heat lamps above a wafer).23
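The link between EWMA and IMA disturbances mentioned in the Introduction can be checked numerically. The sketch below is an illustration only, not code from this work: it simulates a scalar IMA(1,1) sequence d[k+1] = d[k] − λw[k] + w[k+1] and applies an EWMA one-step-ahead forecast with weight (1 − λ). The function names, the seed, and the initialization (forecast started at the true initial level) are assumptions made for the example. Under these assumptions the forecast errors reproduce the driving white noise, which is the sense in which EWMA is optimal for this disturbance model.

```python
import numpy as np

def simulate_ima(lam, n, rng, d0=0.0):
    """Simulate IMA(1,1): d[k+1] = d[k] - lam*w[k] + w[k+1], with w white noise."""
    w = rng.normal(size=n)
    d = np.empty(n)
    d[0] = d0 + w[0]
    for k in range(n - 1):
        d[k + 1] = d[k] - lam * w[k] + w[k + 1]
    return d, w

def ewma_forecast_errors(d, lam, f0):
    """One-step-ahead EWMA forecast with weight (1 - lam); returns d[k] - forecast[k]."""
    f = f0
    errors = np.empty(len(d))
    for k, dk in enumerate(d):
        errors[k] = dk - f                # forecast error before seeing d[k]
        f = f + (1.0 - lam) * errors[k]   # EWMA update of the forecast
    return errors

rng = np.random.default_rng(0)   # arbitrary seed for the illustration
lam = 0.7                        # hypothetical moving average parameter
d, w = simulate_ima(lam, 500, rng)
errors = ewma_forecast_errors(d, lam, f0=0.0)

# With a matched lam and a correct initial level, the errors equal the driving noise.
print(np.max(np.abs(errors - w)))
```

With a mismatched weight the errors are no longer white, which is one way to see why the moving average parameter matters for the disturbance dynamics.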

2. BRIEF REVIEW OF THE EXISTING WORK

In this section, we first present the problem formulation for nonthreaded state estimation; then we briefly review the general framework, together with three groups of representative nonthreaded state estimation methods.

2.1. Problem Formulation. In this work, the overall process state is described below by the superposition of all context-based states contributing to the variation for a particular run k plus the mean effect,

$$z_k = \mu + \sum_i s_{i,k} + v_k \tag{1}$$

where $z_k$ represents the overall observed state; $v_k$ is the process disturbance; $\mu$ represents the mean effect; $s_i$ represents the state of an individual context item i, and the summation is over the specific context items for run k. For example, if a thread definition includes three categories, {tool, layer, product}, then the overall state for a specific thread, e.g., {tool A, layer III, product γ}, is modeled as

$$z = \mu + s_{\mathrm{toolA}} + s_{\mathrm{layerIII}} + s_{\mathrm{product}\gamma} + v \tag{2}$$

In this way, the overall state is broken down into the mean effect and the individual components corresponding to the context items. In this work, we consider a process with c categories of context items, and each category has $n_i$ (i = 1, 2, ..., c) elements. N runs (i.e., N measurements) would lead to a set of N linear equations shown below

$$z = \begin{bmatrix} \mathbf{1} & C \end{bmatrix}\begin{bmatrix} \mu \\ s \end{bmatrix} + v = Hx + v \tag{3}$$

where $\mathbf{1} = [1\ 1\ \ldots\ 1]^T$ is an N × 1 vector of ones, $H = [\mathbf{1}\ C]$ is the augmented context matrix with dimension $N \times (\sum_{i=1}^{c} n_i + 1)$, and $x = [\mu\ s^T]^T$ is the augmented state vector with dimension $(\sum_{i=1}^{c} n_i + 1) \times 1$. $s$ is a $\sum_{i=1}^{c} n_i \times 1$ vector of context-based states; z and v are N × 1 vectors of the observed overall states and unmeasured disturbances for the N runs. The matrix C is an $N \times \sum_{i=1}^{c} n_i$ matrix of ones and zeros. Each row of C represents the context information for a single run, where ones correspond to the context items (e.g., a specific tool) used in the run, and zeros are assigned to the context items not appearing in the run.

2.2. General Framework. In our previous work,6 we discussed the following two characteristics that are inherent to the nonthreaded state estimation problem: (1) the regressor matrix H is singular due to rank deficiency, i.e., $\mathrm{rank}(H) = \mathrm{rank}(C) \le \sum_{i=1}^{c} n_i - c + 1$, and (2) the parameter estimation problem is poorly excited due to the fact that the total number of threads is far less than the full combination of different context elements. Both characteristics have to be addressed in order to obtain satisfactory state estimation performance. The rank deficiency problem will result in nonunique estimates of the individual states; however, this does not pose a problem for the controller because the controller uses the overall observed state, which is a linear combination of the individual states, to compute the next control move. We have shown that the estimate of the overall observed state is unique.6

In ref 6 a general framework was developed on the basis of the best linear unbiased estimate (BLUE) of a Gauss−Markov model. Using this general framework, three groups of nonthreaded state estimation methods are unified. In this subsection, we briefly review the relevant results. The state estimation problem formulated in eq 3 can be formulated as a parameter updating problem as the following,

$$\begin{bmatrix} z_k \\ \hat{x}_k \end{bmatrix} = \begin{bmatrix} H_k \\ I \end{bmatrix}\hat{x}_{k+1} + \begin{bmatrix} v_k \\ e_{x,k} \end{bmatrix} \tag{4}$$

with

$$E\left\{\begin{bmatrix} v_k \\ e_{x,k} \end{bmatrix}\right\} = 0 \tag{5}$$

and

$$E\left\{\begin{bmatrix} v_k \\ e_{x,k} \end{bmatrix}\begin{bmatrix} v_k \\ e_{x,k} \end{bmatrix}^T\right\} = \begin{bmatrix} R & 0 \\ 0 & P \end{bmatrix} \tag{6}$$

where $\hat{x}_k$ is the estimate of the process state at the beginning of run k and $\hat{x}_{k+1}$ is the updated estimate of the process state after the measurement of run k (i.e., $z_k$) becomes available. In the case of nonthreaded state estimation, the output disturbance $v_k$ is a scalar; therefore, we use the scalar $R$ instead of the matrix $\mathbf{R}$ to denote its variance. The second block of eq 4, $\hat{x}_k = \hat{x}_{k+1} + e_{x,k}$, which can be viewed as $\hat{x}_{k+1} = \hat{x}_k - e_{x,k}$, provides the information obtained from the previous estimation step. On the basis of eq 4, the BLUE of $x_{k+1}$ is the following,

$$\hat{x}_{k+1} = \hat{x}_k + P_k H_k^T (R + H_k P_k H_k^T)^{-1}(z_k - H_k \hat{x}_k) \tag{7}$$

$$P_{k+1} = P_k - P_k H_k^T (R + H_k P_k H_k^T)^{-1} H_k P_k \tag{8}$$

where $P_k$ and $P_{k+1}$ are the estimate error covariances for $x_k$ and $x_{k+1}$, respectively. It follows explicitly from eq 8 that $P_{k+1} \le P_k$. In other words, the accuracy of $\hat{x}_{k+1}$ is better than that of $\hat{x}_k$ because of the additional information carried by $z_k$. When the number of measurements k goes to infinity, $P_k$ will approach the steady state $P_\infty$, the norm of which is usually small.

2.3. Three Groups of Nonthreaded State Estimation Methods. In this subsection, we briefly review three groups of nonthreaded state estimation methods, which can be unified with the general framework.

2.3.1. Kalman Filter. The Kalman filter has been applied to solve the nonthreaded state estimation problem.1,2,6,7 We reorganize eq 3 into the following state-space representation,

$$x_{k+1} = x_k \tag{9}$$

$$z_k = H_k x_k + v_k \tag{10}$$

which is a Gauss−Markov model and only considers white noise disturbance (i.e., constant states). It is straightforward to verify that the estimate of $x_k$ by the Kalman filter based on the above model is exactly the same as eqs 7 and 8. Equations 9 and 10 can be easily expanded to cover the case with integrated white noise as the state disturbance, i.e.,

$$x_{k+1} = x_k + w_k \tag{11}$$

$$z_k = H_k x_k + v_k \tag{12}$$

and the corresponding update for the P becomes

$$P_{k+1} = P_k - P_k H_k^T (R + H_k P_k H_k^T)^{-1} H_k P_k + Q \tag{13}$$

where $w_k$ is a white noise sequence with the covariance matrix Q. It is worth noting that the Kalman filter based on the integrated white noise model (i.e., eqs 7 and 13) has been used by several researchers to perform nonthreaded state estimation.1,2,6,7

2.3.2. Recursive Least-Squares. In our previous work,6 recursive least-squares (RLS) has been applied to update the estimate of the process state. The RLS estimate based on eq 3 is the following,

$$\hat{x}_{k+1} = \hat{x}_k + P_k H_k^T (H_k P_k H_k^T + 1)^{-1}(z_k - H_k \hat{x}_k) \tag{14}$$

$$P_{k+1} = P_k - P_k H_k^T (H_k P_k H_k^T + 1)^{-1} H_k P_k \tag{15}$$

It is clear that if we set

$$E[vv^T] = R = 1 \tag{16}$$

then eqs 14 and 15 are equivalent to eqs 7 and 8 in the general framework.

2.3.3. Moving Window Least-Squares. In the moving window least-squares approach, the parameter estimation problem, eq 3, is solved after a batch of measurements become available. Just-in-time adaptive disturbance estimation (JADE)5 is the representative method for this group, which resolves the following estimation problem using weighted least-squares.

$$\begin{bmatrix} z \\ \hat{x}_k \end{bmatrix} = \begin{bmatrix} H \\ I \end{bmatrix}\hat{x}_{k+1} + \begin{bmatrix} v \\ e_{x,k} \end{bmatrix} \tag{17}$$

where z represents the m most recent measurements and H is the context matrix corresponding to z. Note that if m > 1, then previous measurements are used more than once because their contributions have already been included in the previous estimate $\hat{x}_k$. In order to assign different preferences to the current measurements versus the previous state estimate, a weighting matrix Q is included in the objective function to be minimized with least-squares,5

$$J(\hat{x}_{k+1}) = \frac{1}{2}\left(\begin{bmatrix} z \\ \hat{x}_k \end{bmatrix} - \begin{bmatrix} H \\ I \end{bmatrix}\hat{x}_{k+1}\right)^T Q \left(\begin{bmatrix} z \\ \hat{x}_k \end{bmatrix} - \begin{bmatrix} H \\ I \end{bmatrix}\hat{x}_{k+1}\right) \tag{18}$$

where

$$Q = \begin{bmatrix} Q_1 & Q_2 \\ Q_3 & Q_4 \end{bmatrix} \tag{19}$$

and Q is composed of the following submatrices:

$$Q_1 = \omega I \tag{20}$$

$$Q_2 = Q_3 = 0 \tag{21}$$

$$Q_4 = \begin{bmatrix} \alpha_1(1-\omega) & 0 & \cdots & 0 \\ 0 & \alpha_2(1-\omega) & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \alpha_n(1-\omega) \end{bmatrix} \tag{22}$$

while $\omega$ and $\alpha_i$ (i = 1, 2, ..., n) are the weightings assigned to the current and previous measurements. In ref 6 it was shown that the JADE algorithm is equivalent to BLUE if and only if the most recent measurement is included (m = 1) and

$$R = Q_1^{-1} \tag{23}$$

$$P = Q_4^{-1} \tag{24}$$

3. IMPROVED STATE ESTIMATION METHOD

From the last section we see that the existing methods consider either white noise disturbance (e.g., eq 3) or integrated white noise state disturbance (e.g., eq 11). However, as discussed earlier, many industrial semiconductor process data show that it would be more realistic to approximate state disturbances with integrated moving average (IMA) models,9−16 i.e.,

$$x_{i,k+1} = x_{i,k} - \lambda w_{i,k} + w_{i,k+1} \tag{25}$$

where $w_{i,k}$ is a white noise sequence and $\lambda$ is the moving average parameter. In addition, we should consider the fact that if a context item is not involved in run k, its state should not change either, i.e.,

$$x_{i,k+1} = x_{i,k} \tag{26}$$

$$w_{i,k+1} = w_{i,k} \tag{27}$$

In this section, we present an improved nonthreaded state estimation method by incorporating the IMA disturbance model and the consideration of unchanged state (i.e., eqs 25−27) into the state estimation problem formulation. In the following, we first derive the state-space representation of the nonthreaded state estimation problem and then apply the Kalman filter to obtain the estimate of the process state vector. In order to derive the state-space representation that reproduces eqs 25−27, we define the extended process state $\theta_k$ as the augmentation of the process state $x_k$ with the white noise $w_k$ that drives the IMA disturbance,

$$\theta_k = \begin{bmatrix} x_k \\ w_k \end{bmatrix} \tag{28}$$

Then the state-space representation of the process can be formulated as the following:

$$\theta_{k+1} = \begin{bmatrix} I & (-\lambda)\cdot\mathrm{diag}(H_k) \\ 0 & I - \mathrm{diag}(H_k) \end{bmatrix}\theta_k + \begin{bmatrix} \mathrm{diag}(H_k) \\ \mathrm{diag}(H_k) \end{bmatrix} e_k \tag{29}$$

$$z_k = \begin{bmatrix} \mathrm{diag}(H_k) & 0 \end{bmatrix}\theta_k + v_k \tag{30}$$

where $H_k$ is the row of the context matrix H that corresponds to run k, and $e_k$ and $v_k$ are white noise sequences with covariances Q and R, respectively. Next we verify that eqs 29 and 30 generate process states with IMA disturbance as shown in eqs 25−27. Plugging eq 28 in eq 29, we have

$$x_{k+1} = x_k - \lambda\cdot\mathrm{diag}(H_k)\, w_k + \mathrm{diag}(H_k)\, e_k \tag{31}$$

$$w_{k+1} = [I - \mathrm{diag}(H_k)]\, w_k + \mathrm{diag}(H_k)\, e_k \tag{32}$$

For a context item i that shows up in run k, the corresponding element in $H_k$ is 1; then eqs 31 and 32 reduce to

$$x_{i,k+1} = x_{i,k} - \lambda w_{i,k} + e_{i,k} \tag{33}$$

$$w_{i,k+1} = e_{i,k} \tag{34}$$

Plugging eq 34 into eq 33, we have

$$x_{i,k+1} = x_{i,k} - \lambda w_{i,k} + w_{i,k+1} \tag{35}$$

which is exactly eq 25. Next, for a context item i that does not show up in run k, the corresponding element in $H_k$ is 0, and eqs 31 and 32 reduce to

$$x_{i,k+1} = x_{i,k} \tag{36}$$

$$w_{i,k+1} = w_{i,k} \tag{37}$$

which are exactly eqs 26 and 27. Finally, based on eqs 29 and 30, it is straightforward to derive the estimate of the process state using the Kalman filter. By denoting

$$A_k = \begin{bmatrix} I & (-\lambda)\cdot\mathrm{diag}(H_k) \\ 0 & I - \mathrm{diag}(H_k) \end{bmatrix} \tag{38}$$

$$C_k = \begin{bmatrix} \mathrm{diag}(H_k) & 0 \end{bmatrix} \tag{39}$$

$$Q_k = \begin{bmatrix} \mathrm{diag}(H_k) \\ \mathrm{diag}(H_k) \end{bmatrix} Q \begin{bmatrix} \mathrm{diag}(H_k) \\ \mathrm{diag}(H_k) \end{bmatrix}^T \tag{40}$$

the state estimation by the Kalman filter is the following:

$$\hat{x}_{k+1} = A_k \hat{x}_k + A_k P_k C_k^T (R + C_k P_k C_k^T)^{-1}(z_k - C_k \hat{x}_k) \tag{41}$$

$$P_{k+1} = A_k\left(P_k - P_k C_k^T (R + C_k P_k C_k^T)^{-1} C_k P_k\right)A_k^T + Q_k \tag{42}$$

[Remark 1] Equations 29 and 30 represent a data-dependent system, which is neither time-invariant nor time-varying. It has been shown that for such a case the Kalman filter provides the optimal estimate of the system state vector if the true output variance R is known.34,35

[Remark 2] Because the context matrix H is rank deficient, the system defined by eqs 29 and 30 is not observable. Therefore, there is no guarantee that unbiased estimates for each individual state can be obtained. However, Werner36 showed that, for certain linear combinations of the individual states, unbiased estimates can be obtained, which can be used to assess the performance of the state estimation methods.

[Remark 3] In order to implement the solution given in eqs 41 and 42, we need the information on the IMA disturbance, i.e., $\lambda$, which is usually unknown. However, as we show in the next section, the state estimation is not sensitive to the estimate of $\lambda$, which allows easy tuning of the Kalman filter. It is worth noting that $\lambda$ is the moving average parameter of the IMA model, which plays a key role in determining the dynamics of the disturbance. In other words, different values of $\lambda$ will result in totally different disturbance time series for the same driving white noise sequence. The state estimation algorithm is not sensitive to $\hat{\lambda}$ (i.e., the estimate of $\lambda$), which means that the estimator is robust, and its performance is not sensitive to the accuracy of $\hat{\lambda}$. This property is highly desirable, as $\lambda$ (or other disturbance model parameters) is usually not known. Of course, the performance of the estimator is optimal when $\hat{\lambda} = \lambda$.
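One recursion of the filter in eqs 38−42 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name and argument layout are hypothetical, a single λ is used for all context items, Q is taken to be diagonal, and the measurement row is taken as [H_k 0] (a row vector, so that the measurement z_k stays scalar) in place of the diag(H_k) block written in eq 39.

```python
import numpy as np

def kalman_ima_step(theta_hat, P, h, z, lam, q, R):
    """One recursion of eqs 38-42 on the extended state theta = [x; w].

    theta_hat : (2n,) current estimate;  P : (2n, 2n) error covariance
    h   : (n,) 0/1 context row H_k for this run (ones mark involved items)
    z   : scalar measurement;  lam : assumed moving average parameter
    q   : (n,) variances of the driving noise e_k (diagonal Q, an assumption)
    R   : scalar output noise variance
    """
    h = np.asarray(h, dtype=float)
    n = h.size
    D = np.diag(h)
    I = np.eye(n)
    A = np.block([[I, -lam * D], [np.zeros((n, n)), I - D]])   # eq 38
    C = np.concatenate([h, np.zeros(n)])[None, :]              # row form of eq 39 (assumption)
    G = np.vstack([D, D])
    Qk = G @ np.diag(q) @ G.T                                  # eq 40
    S = R + (C @ P @ C.T).item()                               # scalar innovation variance
    K = (P @ C.T) / S                                          # gain, shape (2n, 1)
    innovation = z - (C @ theta_hat).item()
    theta_new = A @ (theta_hat + K[:, 0] * innovation)         # eq 41
    P_new = A @ (P - K @ C @ P) @ A.T + Qk                     # eq 42
    return theta_new, P_new

# Hypothetical run: states ordered [mean, T1, T2]; the mean and T1 are involved, T2 is idle.
h = np.array([1, 1, 0])
theta0 = np.array([0.5, 0.2, -0.3, 0.0, 0.0, 0.1])
P0 = np.eye(6)
theta1, P1 = kalman_ima_step(theta0, P0, h, z=1.0, lam=0.7, q=np.full(3, 0.04), R=0.04)
```

In this example the idle item (T2) keeps both its state estimate and its noise estimate, which mirrors eqs 36 and 37, and its error variance is not inflated because the corresponding rows of Q_k are zero.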

Table 1. Simulation Setup

context   frequency   λ      Var(w)        initial value
T1        0.5         0.75   1.6 × 10−3    0.2
T2        0.5         0.85   1.6 × 10−3    0.5
L1        0.9         0.60   2.5 × 10−3    1.1
L2        0.1         0.50   2.5 × 10−3    0.7
P1        0.4         0.70   0.04          2.3
P2        0.5         0.80   0.04          1.2
P3        0.1         0.90   0.04          2.4

4. SIMULATION EXAMPLES

In this section we use simulation examples to demonstrate the performance of the improved state estimation method, which is denoted as Kalman-IMA, and to investigate its properties. In addition, we compare the performance of Kalman-IMA with the existing Kalman filter approach that considers an integrated white noise state disturbance (i.e., eqs 7 and 13), which we denote as Kalman-IW.
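For comparison, the Kalman-IW recursion (eqs 7 and 13) can be sketched as follows. This is an illustrative sketch with hypothetical names and example values, not code from the paper; the context row h is assumed to include the leading 1 for the mean effect, and the measurement is scalar.

```python
import numpy as np

def kalman_iw_step(x_hat, P, h, z, Q, R):
    """One recursion of eqs 7 and 13 (integrated white noise state model).

    x_hat : (n,) current estimate;  P : (n, n) error covariance
    h     : (n,) context row H_k;   z : scalar measurement
    Q     : (n, n) state noise covariance;  R : scalar output noise variance
    """
    h = np.asarray(h, dtype=float)
    S = R + h @ P @ h                         # scalar innovation variance
    K = P @ h / S                             # gain vector
    x_new = x_hat + K * (z - h @ x_hat)       # eq 7
    P_new = P - np.outer(K, h @ P) + Q        # eq 13
    return x_new, P_new

# Hypothetical run: states ordered [mean, T1, T2]; T2 is not involved.
h = np.array([1, 1, 0])
x0 = np.zeros(3)
P0 = np.eye(3)
x1, P1 = kalman_iw_step(x0, P0, h, z=1.0, Q=0.01 * np.eye(3), R=0.04)
```

Note that Q is added to every state in eq 13, so in this example the variance of the idle item T2 grows from 1 to 1.01 even though that item did not take part in the run.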

In this section we consider the following simple example, where the process context consists of three categories {tool, layer, product}, with $n_{\mathrm{tool}} = 2$, $n_{\mathrm{layer}} = 2$, and $n_{\mathrm{product}} = 3$. We use T1, T2 to denote the two tools, L1, L2 to denote the two layers, and P1, P2, P3 to denote the three products. Three case studies are presented below to provide a comprehensive comparison of the two methods, as well as to investigate their properties.

4.1. Case Study I. In this case study, the state of each individual context item is generated following eq 25; that is, each individual state follows an IMA model. In addition, measurement noise (with variance 0.04) is added to the measured overall state. The context realization for each run is randomly selected based on a given probability of occurrence for the available choices in each of the three categories of the context. The running frequency of different context items, as well as their associated model parameters, are listed in Table 1. The simulated overall state and each individual state are plotted in Figure 1.

Figure 1. Data used in case study I.

In this work, in order to evaluate the performance of the two state estimation methods, we consider the following set of the transformed states (denoted by y) whose unbiased estimate could be obtained.

$$y_1 = x_{P1} + x_{T1} + x_{L1} \tag{43}$$

$$y_2 = x_{P2} - x_{P1} \tag{44}$$

$$y_3 = x_{P3} - x_{P1} \tag{45}$$

$$y_4 = x_{T2} - x_{T1} \tag{46}$$

$$y_5 = x_{L2} - x_{L1} \tag{47}$$

where $y_1$ can be viewed as the state of the reference thread, and $y_2$ to $y_5$ are the differences between each element and its reference in the same category.

The estimates of the transformed states and the prediction error of the overall state obtained from Kalman-IW and Kalman-IMA are plotted in Figures 2 and 3, respectively. In addition, the mean squared prediction errors of both methods are listed in Table 2. For Kalman-IW, because the process model does not describe the real state disturbance, the estimation performance is not satisfactory: the estimated state tracks the reference thread (i.e., $y_1$) reasonably well but does poorly in tracking the other difference states (i.e., $y_2$ to $y_5$). For Kalman-IMA, by explicitly considering the IMA disturbance model structure, it performs significantly better in tracking both the reference thread and the difference states, despite the fact that a fixed $\lambda$ (0.7) is used for all context items.

Figure 2. Estimation results from the Kalman-IW for case study I. For $y_1$ to $y_5$, the solid blue lines are actual values while the red dots are estimated values.

Figure 3. Estimation results from the Kalman-IMA for case study I. For $y_1$ to $y_5$, the solid blue lines are actual values while the red dots are estimated values.

Table 2. Mean Squared Prediction Error

method                case study I   case study II   case study III
Kalman-IW             0.088          0.38            1.46 × 10−3
Kalman-IMA            0.065          0.14            1.16 × 10−3
Kalman-IW (P reset)   0.071          0.17            1.22 × 10−3

4.2. Case Study II. In this case study, the state of each individual context item is corrupted by an integrated white noise disturbance. Notice that integrated white noise is a special case of IMA with $\lambda = 0$. Figure 4 plots the simulated process data. Again, we apply both Kalman-IW and Kalman-IMA to the simulated data set, and results are given in Figures 5 and 6, as well as Table 2.

Figure 4. Data used in case study II.

Figure 5. Estimation results from the Kalman-IW for case study II. For $y_1$ to $y_5$, the solid blue lines are actual values while the red dots are estimated values.

Figure 6. Estimation results from the Kalman-IMA for case study II. For $y_1$ to $y_5$, the solid blue lines are actual values while the red dots are estimated values.

Figure 7. Estimation results from the modified Kalman-IW for case study II. For $y_1$ to $y_5$, the solid blue lines are actual values while the red dots are estimated values.
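The data generation used in case studies I and II (states that follow eq 25 when their context item is involved in a run and stay unchanged otherwise, with a randomly drawn context per run) can be sketched as below. The parameter values are transcribed from Table 1; everything else is an assumption made for illustration: the dictionary layout, the function name, a zero mean effect, zero initial driving noise, and the random seed. Passing lam=0 gives the integrated white noise special case of case study II.

```python
import numpy as np

# Values transcribed from Table 1; the data layout is only for this sketch.
CATEGORIES = [["T1", "T2"], ["L1", "L2"], ["P1", "P2", "P3"]]
FREQ = {"T1": 0.5, "T2": 0.5, "L1": 0.9, "L2": 0.1, "P1": 0.4, "P2": 0.5, "P3": 0.1}
LAM = {"T1": 0.75, "T2": 0.85, "L1": 0.60, "L2": 0.50, "P1": 0.70, "P2": 0.80, "P3": 0.90}
VAR_W = {"T1": 1.6e-3, "T2": 1.6e-3, "L1": 2.5e-3, "L2": 2.5e-3,
         "P1": 0.04, "P2": 0.04, "P3": 0.04}
X0 = {"T1": 0.2, "T2": 0.5, "L1": 1.1, "L2": 0.7, "P1": 2.3, "P2": 1.2, "P3": 2.4}
ITEMS = [item for group in CATEGORIES for item in group]

def simulate(n_runs, rng, lam=None, noise_var=0.04):
    """Return context matrix H, true states X (at the start of each run), and measurements z."""
    x = dict(X0)
    w = {item: 0.0 for item in ITEMS}          # assumed initial driving noise
    H = np.zeros((n_runs, len(ITEMS)))
    X = np.zeros((n_runs, len(ITEMS)))
    z = np.zeros(n_runs)
    for k in range(n_runs):
        X[k] = [x[item] for item in ITEMS]
        # Draw one item per category according to its running frequency.
        active = [g[rng.choice(len(g), p=[FREQ[i] for i in g])] for g in CATEGORIES]
        for item in active:
            H[k, ITEMS.index(item)] = 1.0
        z[k] = sum(x[item] for item in active) + rng.normal(scale=np.sqrt(noise_var))
        for item in active:                    # eq 25 for involved items only
            lam_i = LAM[item] if lam is None else lam
            w_next = rng.normal(scale=np.sqrt(VAR_W[item]))
            x[item] = x[item] - lam_i * w[item] + w_next
            w[item] = w_next
        # Items not in `active` keep x and w unchanged (eqs 26 and 27).
    return H, X, z

rng = np.random.default_rng(1)                 # arbitrary seed
H, X, z = simulate(200, rng)
```

Each row of H contains exactly one tool, one layer, and one product, and the state of any item that is idle in run k is identical at the start of run k + 1.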

To our surprise, the Kalman-IW method (even with true R and Q) performs substantially worse than the Kalman-IMA method with λ = 0.7. This result is counterintuitive because in this case the individual state follows an integrated white noise process. Therefore, with the true noise variance information, we expect Kalman-IW to perform better than Kalman-IMA. Further investigation reveals that eqs 11 and 12 do not provide an accurate description of the process. This is because in eq 11 the individual state evolves no matter if it is involved with a run or not. But in real processes and in this simulation, if a context item is not involved in a process run, its state does not change. Therefore, we propose a more accurate description of the process with integrated white noise state disturbance

⎤⎡ x k ⎤ ⎡ diag(Hk)⎤ ⎡ xk+1 ⎤ ⎡ I 0 ⎥e ⎥⎢ ⎥ + ⎢ ⎢w ⎥ = ⎢ ⎣ k + 1⎦ ⎣ 0 I − diag(Hk)⎦⎣ wk ⎦ ⎣⎢ diag(Hk)⎥⎦ k

(48)

⎡ xk ⎤ zk = [diag(Hk) 0 ]⎢ ⎥ + vk ⎣ wk ⎦

(49)

We implemented the modified Kalman-IW based eqs 48 and 49, which provides a significantly improved result, as shown in Figure 7. The MSPE is improved to 0.09, better than Kalman-IMA, as expected. 4.3. Case Study III. In this case study we test the robustness of both Kalman-IMA and Kalman-IW by simulating the individual states in the following way: the product states are G

dx.doi.org/10.1021/ie401537d | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Figure 8. Data used in case study III.

Figure 9. Estimation results from the Kalman-IW for case study III. For y1 ∼ y5, the solid blue lines are actual values while the red dots are estimated values.

and 49 must be considered. That is, if a context item is not involved in a process run, its state does not evolve. However, the price one pays for the improved state estimation by considering such a fact is a much more cumbersome implementation of the Kalman filter, i.e. eqs 38−42. Such a burden becomes severe when new products are frequently introduced and old products frequently retire. In this section, we introduce a simplified approach to address this limitation by periodically resetting the covariance P to P0. The basic idea that motivates this simplification is the following: if we prevent P from converging to its steady-state value, which is usually very small, then the Kalman filter would allow much bigger changes in the estimated states, which can compensate for the modeling error introduced by ignoring the important fact. We implemented such simplification for the modified Kalman-IW algorithm and tested it on both case study II and III, with results given in

kept as constants, the tool states involve both deterministic drift and step changes, and one of the layer states includes a rectangular pulse disturbance. This is the simulation example used in ref 6. Figure 8 plots the simulated process data. Estimation results from Kalman-IW and Kalman-IMA are given in Figures 9 and 10, as well as in Table 2. In this case, the process data fit neither of the models used by Kalman-IW and Kalman-IMA. But Kalman-IMA does a better job in tracking the difference states, which further illustrates the robustness of the Kalman-IMA method.

5. SIMPLIFIED VERSION FOR PRACTICAL APPLICATIONS From the previous two sections, we can see that, in order to obtain an accurate state estimation, the fact captured by eqs 48 H

dx.doi.org/10.1021/ie401537d | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Figure 10. Estimation results from the Kalman-IMA for case study III. For y1 ∼ y5, the solid blue lines are actual values while the red dots are estimated values.

Figure 11. Estimation results from the modified Kalman-IW with simplified implementation for case study II. For y1 ∼ y5, the solid blue lines are actual values while the red dots are estimated values.

As long as P0 is reasonably large, the estimates are not affected noticeably, because the really large P0 would drop significantly with the first few measurements. It is worth noting that the numbers in Table 2 are for the specific realizations shown in the figures. It can be seen from the figures that the improvements on the estimates of some individual transformed states (e.g., y2 to y5 for case study I) are quite significant, while the improvement on the estimated overall state (with prediction errors shown in subplot “Pred Err”) is not that significant. The main reason is that sharing information properly would benefit “low runners” (i.e., the context items that are used less frequently) the most; while the “high runners” would not see significant improvements, as their

Figures 11 and 12, as well as in Table 2. In addition, we plotted a few elements of the covariance matrix to show the effect of the proposed approach, which is given in Figure 13. By comparing Figures 11 and 12 with Figures 5 and 9, we can see that with periodic reset of the covariance matrix P, the state estimation performance of the Kalman-IW is significantly improved, and we still maintain the simple implementation and the flexibility to easily remove a context item for the modified Kalman-IW method. It is worth noting that the reset frequency depends on the process characteristics. If the process is quite volatile, P needs to be reset more frequently. Usually tuning based on historical data should suffice. We also want to note that the initial choice of covariance (i.e., P0) is not critical. I

dx.doi.org/10.1021/ie401537d | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research

Article

Figure 12. Estimation results from the modified Kalman-IW with simplified implementation for case study III. For y1 ∼ y5, the solid blue lines are actual values while the red dots are estimated values.

utilized in a given run; and the Kalman filter is applied to estimate the state for the nonthreaded state estimation problem. Several case studies are used to compare the performance of the proposed Kalman-IMA method with the previously developed Kalman-IW method. The case studies confirm that the proposed Kalman-IMA is robustit is not sensitive to the tuning parameter (i.e., the estimate of the disturbance parameter), and it provides satisfactory performance for nonIMA processes as well. In addition, it was found that by modifying the existing Kalman-IW to capture the fact that a state does not evolve if the corresponding context item is not utilized in a given run, its performance can be significantly improved as well. However, such representations (for both Kalman-IMA and modified Kalman-IW) may present some difficulties in practical applications, as they may not be as straightforward as the Kalman-IW method. To address this, we propose a simplified version of the modified Kalman-IW method. Instead of keeping track of whether a context item is utilized in a run, we simply reset the covariance matrix P in the Kalman filter periodically. Our case studies show that this simplified method can achieve similar performance as the Kalman-IMA while maintaining its simple implementation and flexibility, which is highly desirable in industrial applications.

Figure 13. Effect of periodic reset on the covariance matrix for the modified Kalman-IW with simplified implementation for case study III.

estimates are much better than low runners due to frequent feedback. This point can be seen by comparing subplot “y1” in Figures 2 and 3. Therefore, although the improvements on y2 ∼ y5 are quite significant, due to their low running frequencies, their contributions to the overall state are somewhat lower. But still, if we calculate the percentage improvement, the improvement is in the range of 21%−63%, and this magnitude of improvement is typical for different realizations.



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Phone:(334) 844-2020. Fax: (334)844-2063. Notes

The authors declare no competing financial interest.



6. SUMMARY AND CONCLUSIONS In this work, our previously developed general framework for nonthreaded state estimation is expanded to cover the case of integrated moving average processes. In this work, the derived state-space representation of the process captures the fact that a state does not evolve if the corresponding context item is not

ACKNOWLEDGMENTS

Financial support from the NSF is gratefully acknowledged by J.W. under Grant CBET-0853983, Q.P.H. under Grant CBET-0853748, and T.F.E. under Grant CBET-0854033.

dx.doi.org/10.1021/ie401537d | Ind. Eng. Chem. Res. XXXX, XXX, XXX−XXX

Industrial & Engineering Chemistry Research




