Optimal Control of Rapid Thermal Processing Systems by Empirical

Regularization Embedded Nonlinear Control Designs for Input-Constrained and Ill-Conditioned Thermal System. Journal of Dynamic Systems, Measurement, ...
0 downloads 0 Views 211KB Size
3964

Ind. Eng. Chem. Res. 1999, 38, 3964-3975

Optimal Control of Rapid Thermal Processing Systems by Empirical Reduction of Modes H. M. Park,* T. Y. Yoon, and O. Y. Kim Department of Chemical Engineering, Sogang University, Shinsoo-Dong, Mapo-Gu, Seoul, 121-742, Korea

In rapid thermal processing (RTP) of semiconductor wafers precise control of wafer temperature is required throughout the process cycle to minimize dopant redistribution as well as wafer warpage. A low-dimensional dynamic model of RTP of semiconductor wafers is derived by using the Karhunen-Loe`ve-Galerkin procedure (Park, H. M.; Cho, D. H. Chem. Eng. Sci. 1996, 51, 81) to implement efficiently the optimal control scheme which ensures the transient temperature uniformity in RTP systems. The optimal control scheme employing the low-dimensional dynamic model is compared with the conventional method using the original partial differential equation and is found to be very efficient without losing accuracy. Introduction In the manufacturing of integrated circuits, it is necessary to keep the temperature low to minimize the redistribution of dopants. But some processes, such as implant annealing, are not as effective at low temperature. Certain type of implant damage cannot be annealed out unless high temperatures are achieved. One possible avenue to minimizing diffusion during annealing at high temperature is to reduce the process time at high temperature. Rapid thermal processing (RTP) describes a family of single wafer hot processes that have been developed to minimize dopant redistribution by allowing brief time at high temperature.2 One of the main technological hurdles that RTP must overcome is that of heating the wafers uniformly. Since the desired heating and cooling rates are often as high as 100 K/s, temperature gradients in the wafer due to nonuniformities either in the heat flux or in the various cooling mechanisms cannot be sufficiently smeared by thermal conduction. The resultant thermal stresses can exceed the plastic limit of the wafer and wafer warpage and slip dislocations may be induced. This problem is particularly pronounced along the edges of large wafers. Recent innovations in the RTP design have provided the ability to achieve temperature uniformity. One recent approach employs multiple concentric circular rings of lamps that can be controlled independently to adjust the heat flux over the wafer to maintain a reasonable uniform temperature over a range of operating conditions.3 In order to develop a process control scheme to manipulate the multilamp system to achieve spatial temperature uniformity while prespecified trajectories are tracked, a model describing the transient temperature field of the wafer is necessary. A straightforward approach to modeling an RTP system is to employ the unsteady heat conduction equation to describe the temperature field in the wafer. But control of partial differential equations such as the unsteady heat conduction equation is quite demanding computationally. The implementation of optimal control or model predictive control schemes requires repeated numerical solution of the governing partial differential equations, thus * Corresponding author.

one cannot expect real-time process control before devising a reliable dynamic model which can be solved efficiently. In the present work the Karhunen-Loe`ve-Galerkin procedure1 is used to derive a low dimensional dynamic model which simulates the transient temperature field in the semiconductor wafer. The low-dimensional dynamic model shall be employed in conjunction with a conjugate gradient technique to solve the boundary optimal control problem which ensures the transient temperature uniformity in the rapid thermal processing system. The Karhunen-Loe`ve-Galerkin procedure is a type of Galerkin method, which employs the empirical eigenfunctions of the Karhunen-Loe`ve decomposition as basis functions. Originally, the Karhunen-Loe`ve decomposition had been devised as a rational technique enabling a stochastic field (ensemble of snapshots) to be represented with a minimum degree of freedom.4 But in the present work, we exploit it as an efficient means of obtaining empirical eigenfunctions with which one can represent the heat conduction equation for a range of control variable with a minimum number of ordinary differential equations. If the Karhunen-Loe`ve decomposition is applied to a set of solutions attainable with a given range of control variable, we get a set of empirical eigenfunctions. Employing these empirical eigenfunctions as basis functions of a Galerkin procedure, one can a priori limit the function space considered to the smallest linear subspace that is sufficient to describe the observed solutions and consequently reduce the heat conduction equation to a minimal set of ordinary differential equations. This technique is called the Karhunen-Loe`ve-Galerkin procedure. Contrary to the traditional Galerkin methods, which employ as basis functions trigonometric functions or other special function that are independent of the characteristics of a given system, the present technique employs as basis functions the empirical eigenfunctions of the Karhunen-Loe`ve decomposition which fully take care of the governing equation of the system under consideration and thus reduces the heat conduction equation to a dynamic model of small degree of freedom easily. Then sophisticated control techniques can be applied rigorously without undue mathematical and computational complexities to this set having a small number of ordinary differential equations. The Karhunen-Loe`ve-

10.1021/ie990017u CCC: $18.00 © 1999 American Chemical Society Published on Web 09/03/1999

Ind. Eng. Chem. Res., Vol. 38, No. 10, 1999 3965

Galerkin procedure is not restricted by geometric complexities and can treat irregular geometries as easily as the regular ones. Recently, the Karhunen-Loe`veGalerkin procedure has been successfully applied to the inverse heat transfer problems5 and boundary control of Burgers equation6 which mimics the viscous flow. Also, Theodoropoulou, Adomaitis and Zafiriou7 and Baker and Christofides8 applied similar techniques to the rapid thermal chemical vapor deposition systems to derive reduced order models.

The function that maximizes λ of eq 5 is the eigenfunction of eq 11 with the largest eigenvalue. The SchmidtHilbert technique9 may be employed to solve eq 11 since the kernel of the integral equation is degenerate. The eigenfunction φ(x) is expressed linearly in terms of the N snapshots:

Karhunen-Loe` ve-Galerkin Procedure

Substituting eq 12 into eq 11 yields the following matrix eigenvalue problem

To make this paper self-contained, we introduce the essence of the Karhunen-Loe`ve decomposition. Consider a set of N arbitrary irregularly shaped functions. We call the ensemble of these functions {vn}, n ) 1, 2, ..., N, “snapshots”. The issue at hand is to obtain the most typical or characteristic structure φ(x) among these snapshots {vn}. The following mathematical notations are introduced:

vn(x) is a function, or a snapshot, in L2(Ω)

(1)

{vn} is an ensemble of snapshots

(2)

(f,g) ≡

∫Ωf(x) g(x) dΩ is the inner product in L2

(3)

〈vn〉 ≡ 1

N

∑ vn(x) is the ensemble average of snapshots

Nn)1

(4)

Then our objective is equivalently expressed so as to find a function φ(x) such that

maximize λ )

〈(φ,vn)2〉 (φ,φ)

(5)

The numerator of eq 5 may be rewritten as follows:

〈(φ,vn)2〉 )

∫Ω{∫Ω〈vn(x)vn(x′)〉 φ(x) dx} φ(x′) dx′

(6)

Equation 6 can be represented compactly by introducing the following two-point correlation function K(x,x′) and an integral operator R having K(x,x′) as the kernel:

K(x,x′) ≡ 〈vn(x)vn(x′)〉 ) R• ≡

1

N

∑ vn(x) vTn (x′) Nn)1

∫ΩK(x,x′)‚dx′

(7) (8)

Equation 6 is now expressed as

〈(φ,vn)2〉 )

∫Ω{Rφ}{φ} dx ) (Rφ,φ)

(9)

Upon substituting eq 9 into eq 5 it is found that the maximization problem of eq 5 is equivalent to the following eigenvalue problem:

Rφ ) λφ

(10)

∫ΩK(x,x′) φ(x′) dx′ ) λφ(x)

(11)

Or

N

φ(x) )

∑ Rkvk(x)

(12)

k)1

CnkRk ) λRn

(13)

where

Cnk )

1 N

∫ΩvTn (x′) vk(x′) dx′

(14)

The eigenvector R of the matrix eigenvalue problem, eq 13, is then substituted into eq 12 to generate a set of empirical eigenfunctions φ(x). The rigorous proof of this procedure can be found in the work of Courant and Hilbert.9 Let us express the eigenvalues λ1 > λ2 > ... > λN and the corresponding eigenfunctions φ1, φ2, ..., φN in the order of magnitude of eigenvalues. The eigenfunction φ1 corresponding to the largest eigenvalue λ1 is the most typical structure of the members of the snapshots {vn} and the eigenfunction φ2 with the next largest eigenvalue λ2 is the next typical structure, and so forth. The span of these eigenfunctions is exactly the span of all the realization of snapshots. Thus, any feasible solution can be represented as a linear combination of these eigenfunctions. The set of these empirical eigenfunctions can be made complete in L2 by including all eigenfunctions with zero eigenvalues. But the eigenfunctions with zero eigenvalue denote solution structures which are impossible for the range of control variables under consideration. Therefore, the set of empirical eigenfunctions has no difficulties in spanning the whole realizable solution space. Because the kernel K(x,x′) of the integral equation, eq 11, is symmetric, the empirical eigenfunctions satisfy the following orthogonality relationship:

∫Ωφi(x) φj(x) dx ) 0

(if i * j)

(15)

Thermal Modeling of the RTP System As mentioned previously, one of the principal problems with RTP is thermal uniformity. Various RTP systems have been developed that use various chamber and heating designs to improve the thermal uniformity in RTP.10-12 In the present investigation we consider a hypothetical axisymmetric RTP system with independently controlled concentric circular rings of lamps as depicted in Figure 1. This is in fact a simplification of the system of Moslehi, et al.12 We adopt this simplified system to facilitate a clearer presentation of the Karhunen-Loe`ve-Galerkin procedure as applied to the optimal control of the RTP systems. The upper surface of the wafer is heated by radiation from the lamps, which are partitioned into several circular concentric

3966

Ind. Eng. Chem. Res., Vol. 38, No. 10, 1999

tions are

FCp

(

(17)

t ) 0, T ) Ti - Ta

(18)

r ) 0,

zones. The radiation intensity of the lamps in each zone is controlled independently so that the edge of the wafer, where the heat loss is greater, may receive excess radiation. The wafer and chamber surfaces are modeled as gray and diffuse, so radiative heat transfer can be computed using view factors.13 In addition to the dominant radiative heat transfer from the lamps to the upper surface of the wafer, there are additional radiative heat exchanges between the wafer surfaces (upper, lower and edge) and chamber walls. Thus, the radiative heat transfer from the lamps can indirectly influence the heat flux at the lower surface and the edge of the wafer. These radiative heat exchanges are obtained by solving the enclosure problem to be described below. Our strategy is to separate the radiative heat exchange governed by the enclosure problem from the heat conduction within the wafer. Then the control variables, which are the powers of the lamps, influence the temperature distribution in the wafer through the radiative heat flux at the wafer surfaces. In the present work, only the dominant radiative transfer from the lamps to the upper surface of the wafer is considered explicitly, and the minor radiative exchanges between the wafer surfaces (lower and edge) and the chamber walls are taken care of by the effective heat transfer coefficients. Since the effective heat transfer coefficient is determined by radiation as well as convection, it depends on temperature and its value may be slightly adjusted as the wafer temperature varies. Of course, we can treat the radiative exchanges between the wafer surfaces (lower and edge) and the chamber wall explicitly without introducing the effective heat transfer coefficient. But in that case, the condition number of the matrix in the enclosure problem shall become larger and, as a result, the control scheme shall become less robust. The heat transfer within the wafer is governed by the heat conduction equation. We define the deviation temperature T as the difference between the actual temperature T′ and the ambient temperature in the chamber Ta as:

T ) T′ - Ta

(16)

With the assumption of axisymmetry, the governing equation and relevant initial and boundary condi-

( )

∂T )0 ∂r

(19)

r ) R, k

∂T ) -heT ∂r

(20)

z ) 0, k

∂T ) h′wT ∂z

(21)

∂T ) q(r,t) - h′′wT ∂z

(22)

z ) Z, k

Figure 1. The system.

)

∂T 1 ∂ ∂T ∂2T )k r + 2 ∂t r ∂r ∂r ∂z

where F is the density of the wafer, Cp is the heat capacity, k is the thermal conductivity, R is the radius of the wafer, Z is the thickness of the wafer, Ti is the initial temperature, he is the effective heat transfer coefficient at the edge, h′w is the effective heat transfer coefficient at the lower surface of the wafer, h′′ is the convective heat transfer coefficient at the upper surface of the wafer, and finally q(r,t) is the net radiative heat flux to the upper surface of the wafer. Since the wafer is thin enough that the axial temperature gradient can be neglected, eq 17 may be integrated vertically to yield the following one-dimensional unsteady heat conduction equation governing the vertically averaged temperature field

FCp

q(r,t) 2hwT ∂T k ∂ ∂T ) r + ∂t r ∂r ∂r Z Z

( )

(23)

where hw ) (h′w + h′′w)/2 The relevant initial and boundary conditions for eq 23 is eqs 18-20. The boundary conditions (21 and 22) are incorporated into the second and the third terms in the right hand site of eq 23. The radiative heat flux q(r,t) is determined by solving the following enclosure problem that treats the radiative heat exchanges among the wafer surfaces, the chamber walls, and the lamps. The analysis of radiative heat exchange can be simplified significantly by dividing the wafer surfaces and the chamber wall into a finite number of zones and by assuming that the following conditions are satisfied at the surface of each zone: (1) The radiative properties are uniform and independent of direction; (2) The temperature and heat flux are uniform over the surface of each zone; (3) The surfaces are diffuse emitters and diffuse reflectors; (4) The radiosity, which is the radiant energy leaving the surface, is uniform over the surface of each zone; (5) The surfaces are opaque; and (6) The radiative properties are independent of frequency, i.e., the gray-body approximation is adopted. Denoting the zones in the upper surface of the wafer with the index, i ) 1, 2, ..., M, the remaining zones including the lower surface of the wafer and chamber wall with the index, i ) M + 1, M + 2, ..., N and the lamp zones with i ) N + 1, N + 2, ..., N + L, we have the following set of equations.

Ind. Eng. Chem. Res., Vol. 38, No. 10, 1999 3967

For i ) 1, 2, ..., M (upper surface of the wafer) N

Ri ) iσT4i + Fi

RjFi-j + FiHoi ∑ j)1

(24)

N

q i ) Ri -

RjFi-j - Hoi ∑ j)1

(25)

For i ) M + 1, M + 2, ..., N (the edge of the wafer, the lower surface of the wafer and the chamber wall) N

Ri ) iσT4i + Fi

RjFi-j ∑ j)1

(26)

For each zone i, Ri is the radiosity, i is the emissivity, Fi is the reflectivity, Ti is the temperature, Fi-j is the view factor between the ith zone and the jth zone, qi is the net outward heat flux from the ith zone, σ is the Stefan-Boltzmann constant, and Hoi is the radiative flux from lamps to the ith zone in the upper surface of the wafer. The radiative flux from the lamps to the upper surface of the wafer is represented by L

Hoi )

∑ WikPk k)1

(27)

where Pk is the total radiation from the kth lamp and Wik is the view factor from the kth lamp to the ith zone in the upper surface of the wafer. Once we solve the boundary optimal control problem to determine the net radiative heat flux into the upper surface of the wafer, which ensures the transient temperature uniformity in the wafer, we obtain the heat flux (qi) and the temperature (Ti) in the zones of the surfaces of the wafer. Therefore Ti(i)1,2,...,M) in eq 24 and qi(i)1,2,...,M) in eq 25 are known as well as the temperatures of the chamber wall, the edge, and the lower surface of the wafer Ti(i)M+1,M+2,...,N) in eq 26. Rearranging these equations so that the unknown radiosity Ri(i)1,2,...,N) and Hoi(i)1,2,...,M) appear in the left hand side, we find

where PT ) (P1, P2, ..., PL), W ) [wik], and HTo ) (Ho1, Ho2, ..., HoM). In this way, we can separate the radiative exchange from the governing equation for the heat transfer in the wafer. Neumann Boundary Control of the Transient Temperature Field in the Wafer Since the radiative exchange can be treated separately without significant computational burden, as explained in the previous section, we concentrate on the governing equation of the temperature field in the wafer (eq 23) from now on. Consider the circular domain Ω with thickness Z shown in Figure 1b. The domain of the system is a thin disk of radius R ) 7.5 × 10-2 m and the thickness Z ) 10-3 m with the physical properties of F ) 2330 kg/m3, Cp ) 703 J/kgK, k ) 153 W/mK, and he ) 22.8 W/m2 K. Denoting the mean of the convective heat transfer coefficient at the upper surface of the wafer h′′w and the effective heat transfer coefficient at the lower surface of the wafer h′w as hw, it is assumed to vary radially as follows.

hw ) hi + (ho - hi)

(28)

where J is given by

[δij - Fi-j]Rj - Hoi ) qi(i)1,2,...,M) ∑ j)1

(29)

J(q) ) 1 tf 2 0

[δij - FiFi-j]Rj ) iσT4i (i)M+1,M+2,...,N) ∑ j)1

(30)

N

(33)

∑)

∫ ∫Ω[T(r,t) - Tf(t)]2 dΩ dt + 2 ∫0t ∫Γq(r,t)2 dΓ dt f

(34)

N

The set of eqs 28-30 is composed of N + M linear equations with N + M unknowns, i.e., Ri(i)1,2,...,N) and Hoi(i)1,2,...,M). The inversion of the matrix to obtain Hoi is very cheap, since the coefficients of the matrix are constants; we only have to invert it once at the beginning and store the inverse which is to be used repeatedly later. Once Hoi(i)1,2,...,M) is known, the strengths of the lamps Pk(k)1,2,...,L) required by the optimal control policy are obtained in the least square sense as follows:

PT ) (WTW)-1WTHo

(32)

min J(q) q ∈ L2(

[δij - FiFi-j]Rj - FiHoi ) ∑ j)1

4

where hi ) 28.4 W/m2 K and ho ) 45.6 W/m2 K. As mentioned previously, the effective heat transfer coefficients, he and hw, vary slightly as the wafer temperature changes since they take care of the radiative heat transfer as well as the convective transfer. But this dependence may be safely neglected within the overall accuracy of the model. According to eq 32 the convective and radiative cooling from the wafer is larger near the edge than at the center. Control q(r,t) will act on the upper surface through the Neumann boundary conditions. The object of the control is to heat or cool the wafer tracking prespecified trajectories while spatial uniformity in the temperature field is maintained. We search for the Neumann control q(r,t) that solves the following optimization problem:

N

iσT4i (i)1,2,...,M)

(Rr )

(31)

and T(r,t) satisfies eq 23 with the relevant initial and boundary conditions, (18-20). The initial value of the state variable, Ti(r), is given in L2(Ω), where the space L2(Ω) is defined by

∫Ω|f|2 dΩ < ∞}

L2(Ω) ≡ {f|

(35)

The upper surface of the wafer is denoted by Γ, Tf(t) is the prespecified target temperature, which is timevarying and spatially uniform, and tf is the final process time. The parameter  in eq 34 is a positive weighting factor which represents the expense of applying the control. The magnitude of  is small if the control is cheap and large if it is expensive. In practice, the power for each lamp ring is bounded between zero and an upper limit and, therefore, the heat

3968

Ind. Eng. Chem. Res., Vol. 38, No. 10, 1999

flux q is also bounded. The permissible range of q can be obtained as follows. Equation 27 gives the bounds for Hoi if the bounds for the lamp power Pk are known. From the linear set of equations 28-30, we can represent qi(i)1,2,...,M) as a linear combination of Hoi(i)1,2,...,M) and σiT4i (i)1,2,...,N). Therefore, we can easily find the bounds for qi, which depend on σiT4i as well as the view factors Fi-j. These inequality constraints for qi may be incorporated in the optimization procedure by adding a penalty function to eq 34 and minimizing the resulting objective function using any technique of unconstrained optimization. However, the boundedness of the lamp power is not taken into account a priori in the present investigation to make the analysis simpler, since the optimal heat fluxes q(r,t) obtained are found not to violate the boundedness condition of the lamp power. In case the inequality constraints for q(r,t) imposed by the boundedness of the lamp power become active, it is easy to add the penalty term to the objective function and continue the unconstrained optimization procedure. Equation 23 is solved by a finite difference method employing 500 grid points, which is found to be sufficient to resolve the temperature field considered in the present investigation.

Figure 2. Definition of shape functions: (a) temporal shape functions and (b) spatial shape functions.

Construction of Empirical Eigenfunctions The low-dimensional dynamic model obtained through the Karhunen-Loe`ve-Galerkin procedure must predict the system behavior accurately for various trajectories of the boundary heat flux q(r,t) that include not only the optimal trajectory but also other trajectories appearing during the iterative minimization of the objective function eq 34 by means of the conjugate gradient method. The prerequisite to this requirement is that the empirical eigenfunctions employed in the KarhunenLoe`ve-Galerkin procedure must span the admissible solution space of the heat conduction equation for these various trajectories of the boundary heat flux q(r,t). According to the Schmidt-Hilbert theory9 the empirical eigenfunctions can be expressed linearly in terms of snapshots as shown in eq 12. Therefore, once we obtain an ensemble of snapshots {Tn(r)} that encompasses the admissible solution space of the heat conduction equation for the above trajectories of the boundary heat flux q(r,t), the resulting empirical eigenfunctions obtained through the Karhunen-Loe`ve decomposition shall span the same admissible solution space, and consequently the low-dimensional model shall simulate the system accurately for those various trajectories of the boundary heat flux function q(r,t). Thus, the snapshots of the system have been obtained in the following way. As the first step, we transform the continuous control function q(r,t) into discrete variables as follows: M

q(r,t) )

N

∑ ∑ Rmnψm(t) ψn(r)

(36)

m)1 n)1

where ψm(t) is the mth time shape function employed to discretize the time variable, ψn(r) is the nth space shape function employed to discretize the space variable r, M is the number of time shape functions, and N is the number of space shape functions employed. In the present work, we adopt M ) 11 and N ) 11. Figure 2 depicts these shape functions. Next, we solve the system

Figure 3. Some dominant empirical eigenfunctions with large eigenvalues: (a) the first eigenfunction (λ1 ) 0.721), (b) the second eigenfunction (λ2 ) 0.199), (c) the third eigenfunction (λ3 ) 5.331 × 10-2), and (d) the fourth eigenfunction (λ4 ) 1.593 × 10-2).

(eq 23) with q(r,t) ) ψn(r) and store 100 temperature snapshots at an appropriate time interval until a steady state is reached. The Karhunen-Loe`ve decomposition is then applied to this set of snapshots to yield empirical eigenfunctions {φ(n) k }, where superscript n designates the fact that these empirical eigenfunctions are obtained from the system with q(x,t) ) ψn(x). We repeat the above procedure for n ) 1, 2, ..., N to obtain N sets of empirical (2) (N) eigenfunctions, i.e, {φ(1) k }, {φk }, ..., {φk }. Each set (n) {φk } consists of 100 empirical eigenfunctions. Finally, we choose 10 dominant eigenfunctions from each of these N sets to make an ensemble of (10 × N) snapshots. To this set of (10 × N) snapshots, we apply the Karhunen-Loe`ve decomposition again to obtain the final set of empirical eigenfunctions {φk} to be employed in the construction of the low-dimensional dynamic model. Figure 3a-d shows the first, second, third, and fourth empirical eigenfunctions with the corresponding normalized eigenvalues λ1 ) 0.721, λ2 ) 0.199, λ3 ) 5.331 × 10-2, and λ4 ) 1.593 × 10-2, respectively. Also shown in Figure 4a-d are some typical eigenfunctions with smaller eigenvalues, i.e, the 22nd, the 23rd, the 24th, and the 25th eigenfunctions with the corresponding normalized eigenvalues, λ22 ) 4.260 × 10-10, λ23 ) 3.736 × 10-10, λ24 ) 2.823 × 10-10, and λ25 ) 2.443 ×

Ind. Eng. Chem. Res., Vol. 38, No. 10, 1999 3969

Figure 4. Some typical empirical eigenfunctions with small eigenvalues: (a) the 22nd eigenfunction (λ22 ) 4.260 × 10-10), (b) the 23rd eigenfunction (λ23 ) 3.736 × 10-10), (c) the 24th eigenfunction (λ24 ) 2.823 × 10-10), and (d) the 25th eigenfunction (λ25 ) 2.443 × 10-10).

10-10. Figures 3 and 4 reveal that the dominant empirical eigenfunctions represent the large scale structures of the temperature field, while the eigenfunctions with small eigenvalues represent the small scale structures.

Figure 5. Relative errors of the low-dimensional dynamic model with respect to the finite difference solution for various heat flux functions q(r,t).

The relevant initial conditions for the system of ordinary differential equations (eq 39) are the following.

The Low Dimensional Dynamic Model We represent the temperature field T(r,t) as a linear combination of the empirical eigenfunctions as follows NT

T(r,t) )

ai(t) φi(r) ∑ i)1

(37)

where φi is the ith empirical eigenfunction, ai(t) is the corresponding spectral coefficient, and NT is the total number of the empirical eigenfunctions employed in the Karhunen-Loe`ve-Galerkin procedure. The residual is expressed as

R ≡ FCp

q(r,t) 2hw ∂T k ∂ ∂T r + T ∂t r ∂r ∂r Z Z

( )

(38)

Applying the Galerkin principle which enforces the residual to be orthogonal to each of the trial functions and exploiting the boundary conditions we find that

FCpMj

daj dt

NT

+

∑ i)1

NT

aj(t)0) )

1

2

NT

∑aiNij ) 0

Zi)1

(39)

where

(j ) 1, 2, ..., NT)

(44)

To corroborate the accuracy of the low-dimensional model, eq 39 is solved for various heat flux functions q(r,t), and the resulting temperature fields are compared with those obtained by the finite difference method. Usually the error of the low-dimensional model decreases as the number of empirical eigenfunctions employed increases up to the optimal number. But further increase of number of empirical eigenfunctions beyond the optimal number does not always improve the accuracy because the empirical eigenfunctions with very small eigenvalues are contaminated with roundoff errors. The optimal number of eigenfunctions for the set (39) is found to be 25. Thus, the low-dimensional dynamic model adopted in the sequel is constructed by using 25 empirical eigenfunctions. The following four functions have been considered as the heat flux function q(r,t) to examine the accuracy of the low-dimensional dynamic model.

aiHji - ∫Ωq(r,t)φj dΩ + ∑ Z i)1

aiPij + k

∫ΩφjT(r,t)0) dΩ ∫Ωφ2j dΩ

(a) q(r,t) ) f(r) g(t)

(45)

(b) q(r,t) ) f(r) sin(t/tf)

(46)

(c) q(r,t) ) sin(2πr) g(t)

(47)

(d) q(r,t) ) sin(2πr) sin(t/tf)

(48)

f(t) ) 5r + 1

(49)

(50)

where

Mj ≡

∫φ2j dΩ

Pij ≡ he(2πRZ) φi(R) φj(R)

(41)

and

∫Ω ∂ri ∂rj dΩ

(42)

g(x) ) 50000t + 50000

∫Ωhw(r)φiφj dΩ

(43)

Figure 5 displays the variation of relative errors of the low-dimensional dynamic model for various heat flux

Hji ≡ Nji ≡

∂φ ∂φ

(40)

3970

Ind. Eng. Chem. Res., Vol. 38, No. 10, 1999

functions, (45-48). The relative error is less than 0.02% for all cases.

where

[( ) ] ∑ ∑ [( ) ]

M

φ)

In this section, we describe how to solve the optimal control problem posed in eqs 33 and 34 employing the heat conduction equation. Once the continuous control function q(r,t) is transformed into discrete variables Rmn by eq 36, the problem at hand is to select the vector r ) (R11, R12, ..., R1N, R21, R22, ..., RM1, RM2, ..., RMN) such that the objective function (eq 34) shall be minimized. The conjugate gradient method based on the FletcherReeves algorithm14 is employed in the iterative minimization of the objective function. The variation of the objective function with respect to the vector r is determined by the following gradient vector

[ ] ( ∂J ∂r

T

∂J ∂J ∂J , , ..., ∂R11 ∂R12 ∂RMN

)

)

∂J ) ∂Rmn

∂J

M

(i+1) 2

∂Rmn

m)1n)1

N

∂J

(58)

(i) 2

∂Rmn

m)1n)1

On the other hand, the search direction d at the first step is computed by

∂J ∂r

d0 ) -

(59)

This renewed d(i+1) is used for the search direction at the next iterative state. The parameter vector r(i+1) at the (i + 1)th iterative state is obtained from r(i) through moving in the conjugate direction d(i).

r(i+l) ) r(i) + Fd(i)

(51)

(60)

The optimal step length F is determined by minimizing J(r+Fd) with respect to F. Exploiting the following formula

where

[ ]

N

∑∑

Solutions of the Optimal Control Problem Employing the Original Partial Differential Equation

∫0t ∫Ω[T(r,tf) - Tf(t)]θ(tf;m,n) dΩ dt + f



∫0 ∫Γq(r,t) ψm(t) ψn(r) dΓ dt tf

M

(52)

∂T(r,t) ∂Rmn

(53)

The function θ(t;m,n) represents the variation of the temperature field with respect to the (m,n)th element of the vector r. The governing equation for θ(t;m,n) is obtained by partially differentiating eq 23 with respect to the vector r. For values of m ) 1, 2, ..., M and n ) 1, 2, ..., N we have the following set of sensitivity equations

ψm(t) ψn(r) ∂ 1 ∂ ∂ r θ(t;m,n) + FCp θ(t;m,n) ) k ∂t r ∂r ∂r Z 2hw θ( t;m,n) (54) Z

( (

))

F)-

(55)

+ φd(i) (i G 1)

(63)

A)

M

N

∫0 ∫ ∑ ∑ θ(t;m,n)dmn)2dΩ dt + m)1 n)1 tf

( Ω

M

N

f

(64)

and

B) M

∫0 ∫ tf

(T(r,t) - Tf(t)) ( Ω M

(56)

The conjugate direction or the search direction d is calculated and renewed by the Fletcher-Reeves method (i+1)

B A

where

∂ θ(t;m,n) ) 0 at r ) 0 ∂r ∂ θ(t;m,n) ) heθ(t;m,n) at r ) R ∂r

(61)

we find, after some length algebraic manipulation, the expression for the optimal step length.

∫0t ∫Γ( ∑ ∑ dmnψmψn)2dΓ dt m)1 n)1

and the boundary conditions

[∂r∂J]

Fdmn

dT ) (d11, d12, ..., d1N, d21, ..., d2N, ..., dM1, ..., dMN) (62)



θ(0;m,n) ) 0

d(i+1) ) -

∂T

∑∑ m)1 n)1 ∂R

mn

with the initial condition

-k

N

where dmn is the (m,n)th component of the conjugate direction vector d, i.e,

For the brevity of notation, we denote

θ(t;m,n) ) θ(r,t,m,n) )

T(r + Fd) ) T(r) +

(57)

N

N

∑ ∑ θ(t;m,n)dmn) dΩ dt +

m)1 n)1 M N

∫0t ∫Γ( ∑ ∑ Rmnψmψn) ( ∑ ∑ dmnψmψn) dΓ dt m)1 n)1 m)1 n)1



f

(65)

The iterative process of the conjugate gradient method is summarized as follows: (1) Assume r. Then q(r,t) is determined by eq 36. Obtain T(r,t) and θ(t;m,n) by solving eqs 23 and 54, respectively.

Ind. Eng. Chem. Res., Vol. 38, No. 10, 1999 3971

(2) Calculate d0 ) -(∂J/∂r) by eq 59. (3) Determine the optimal step length by eqs 63-65. (4) Calculate r(i+1) by eq 60. (5) Obtain T(r(i+l) and θ(r(i+l); m,n) by solving eq 23 and 54, respectively. (6) Calculate φ using eq 58. (7) Calculate d(i+l) by eq 57. If |d(i+1)| < , then stop. (8) Set i ) i + 1 and go to the step 3.

The conjugate direction at the (i + 1)th iteration is determined by

J)

NT

∫ ∑(ai(t) 2 0 tf

aTi (t))2Mi



∫∫ q 2 t Γ

dt +

i)1

2

(71)

[( ) ] ∑ ∑ [( ) ]

M

N

∂Rmn

m)1n)1

φ)

M

(i+1) 2

∂J

∑∑

If the same algorithm of optimization is applied to the low-dimensional dynamic model instead of the heat conduction equation, the minimization of the objective function can be done very efficiently. The degree of freedom of the low-dimensional dynamic model (eq 39) is only 25, whereas that of the heat conduction equation, which is equivalent to the grid number in the numerical solution, is above 500. The drastic reduction of the degree of freedom through the Karhunen-Loe`veGalerkin procedure shall correspondingly reduce the computer time in the solution of optimal control problem. The objective function (eq 34) is rewritten in terms of the spectral coefficients, ai, as follows

+ φd(i)

where

Solutions of the Optimal Control Problem Employing the Low Dimensional Dynamic Model

1

(i+1)

[∂r∂J]

d(i+1) ) -

N

m)1n)1

(72)

(i) 2

∂J

∂Rmn

and dT ≡ (d11, d12, ..., d1N, d21, ..., d2N, ..., dM1, ..., dMN). The parameter vector is updated in the conjugate direction according to the following equation.

r(i+l) ) r(i) + Fd(i)

(73)

The optimal step length F is determined by partially differentiating J(r + Fd) with respect to F and setting the resulting equation equal to zero

B A

F)dΓ dt (66)

(74)

where where Mi is given by eq 40 and aTi (t) is determined by the following equation:

A≡

∫ΩφiTf(t) dΩ aTi (t) ) ∫Ωφ2i dΩ

(i ) 1, 2, ..., NT)

(67)

∂J

∫0t ∑ j)1

)

Mj

∂Rmn

(

f

aj

∂aj ∂Rmn

- aTj

)

∂aj

dt +

-

d

∫t∫Γqψm(t) ψn(r) dΓ dt

( ) ( ) ∂aj

dt ∂Rmn

+

∂ai

∑ i)1 ∂R

1

mn

dt +

N

(

M

)

∂ai(t) dmn Mi dt +

N

∫0t ∑(ai(t) - aTi (t)) ∑ ∑ ∂R i)1 m)1 n)1 f

mn

M

(75)

N

∫t∫Γq ∑ ∑ ψmψn dmn dΓ dt m)1 n)1

(76)

∂Rmn

(68)

The sensitivity equations that determine ∂aj/∂Rmn are as follows:

FCpMj

M

2

∫t∫Γ( ∑ ∑ ψmψn dmn)2 dΓ dt m)1 n)1

NT

B≡

)

N





NT

M



As previously, the continuous control variable q ) (r,t) is discretized, according to eq 36, into the parameter vector rt ) (R11, R12, ..., RMN). The gradient of the objective function with respect to r is NT

(∑ ∑

∂ai(t) Mi dmn i)1 m)1 n)1 ∂Rmn NT

∫0 ∑ tf

( ) ∑( ) NT

Pij + k

∂ai

∑ i)1 ∂R

Hij

(69)

mn

2 NT ∂aj

∫ ψm(t) ψn(r)φj dΩ + Z Z Ω

i)1

∂Rmn

Nij ) 0

(j ) 1, 2, ..., NT; m ) 1, 2, ..., M; n ) 1, 2, ..., N) The initial conditions are

∂aj (t ) 0) ) 0 ∂Rmn (j ) 1, 2, ..., NT; m ) 1, 2, ..., M; n ) 1, 2, ..., N) (70)

The following relation has been exploited to derive eqs 74-76. M

ai(r + Fd) ) ai + F

N

∑∑

∂ai

m)1 n)1∂Rmn

dmn

(77)

The algorithm of the optimal control problem employing the low-dimensional dynamic model follows the procedure below: (1) Assume r ) (R11, R12, ..., RMN) and calculate ai(t) and (∂ai/∂Rmn)(t) (i ) 1, 2, ..., NT; m ) 1, 2, ..., M; n ) 1, 2, ..., N) using eqs 39 and 69, respectively. (2) Calculate d0 ) -[∂J/∂r] by eq 68. (3) Determine the optimal step length F by eqs 7476. (4) Update r(i+l) by eq 73. (5) Calculate ai(t) and (∂ai/∂amn)(t) (i ) 1, 2, ..., NT; m ) 1, 2, ..., M; n ) 1, 2, ..., N) using eqs 39 and 69, respectively. (6) Calculate φ using eq 72.

3972

Ind. Eng. Chem. Res., Vol. 38, No. 10, 1999

Figure 6. Convergence rate of the conjugate gradient method. (the FDM-CG and the KLG-CG).

Figure 7. Optimal heat up at 50 K/s. Heat flux profiles obtained either by the FDM-CG or by the KLG-CG are shown at t ) 1.0, 3.0 and 5.0 s. Three-dimensional plot of q(r,t) is also depicted.

(7) Calculate d(i+1) by eq 71. If |d(i+1)| < , then stop. (8) Set i ) i + 1 and go to the step 3. Results The boundary optimal control problem of determining the radiative heat flux at the upper surface of the wafer, q(r,t), which ensures the spatial uniformity in the transient temperature field of the wafer, is solved by the conventional method employing the original partial differential equation and by the Karhunen-Loe`veGalerkin procedure, respectively. For brevity of communication, we call the method employing the original partial differential equation the FDM-CG, while the method employing the low-dimensional dynamic model is called the KLG-CG. The accuracy and efficiency of the FDM-CG and the KLG-CG are assessed for various target trajectories, Tf(t), including ramp up at 50 and 100 K/s. Figure 6 shows the variation of J defined by eq 34 with respect to the iteration number of the conjugate gradient procedure employing various initial approximations of q(r,t), when  ) 1.0 × 10-14 and Tf(t) is ramp up at 50 K/s for 5 s. The initial value of Tf is 50 K higher than the ambient temperature. It is shown that the

convergence rates of the FDM-CG and the KLG-CG are almost the same for each of the initial approximations employed. Figure 6 also shows that the objective function decreases to the minimum value in 100 iterations in all cases. Figure 7 depicts the resulting optimal heat flux function q(r,t) obtained either by the FDM-CG or by the KLG-CG. It shows that both methods yield almost the same optimal profiles of heat flux. Higher heat flux is obtained near the edge to compensate for the larger convective heat loss compared to the center. As the wafer is heated up, the convective heat loss increases; therefore, the difference between heat flux at the edge and that at the center becomes more significant. Figure 8 shows the corresponding temperature field. Because the KLG-CG and the FDM-CG yield almost the same heat flux function, q(r,t), the resulting temperature fields are also indistinguishable. The temperature at the center of the wafer follows the target temperature Tf(t) exactly (Figure 8a). The spatial variation of temperature field during ramp up is represented using the relative error defined as follows.

error ) |

T(r,t) - Tf(t) | Tf(t)

(78)

Ind. Eng. Chem. Res., Vol. 38, No. 10, 1999 3973

Figure 10. Temperature error for the optimal heat up at 100 K/s.

Figure 8. Optimal heat up at 50 K/s: (a) temperature tracking at the center and (b) temperature error defined by eq 78.

Figure 9. Optimal heat-up at 100 K/s. Optimal heat flux profiles obtained either by the KLG-CG and by the FDM-CG are shown at t ) 1.0, 2.0, and 2.5 s.

This error is plotted in Figure 8b, which shows that initially there is a slight deviation of temperature from Tf(t), especially near the center. Afterward the temperature tracking is almost exact with spatial uniformity. Figures 9 and 10 show similar results for the case of ramp up at 100 K/s for 2.5 s. As in the case of ramp up at 50 K/s, Figure 9 shows that the KLG-CG and the

FDM-CG yield heat flux profiles which are indistinguishable. The temperature error given by eq 78 for this case is plotted in Figure 10, which shows that the spatial uniformity of the temperature field is maintained, even for the case of ramp up at 100 K/s. But the temperature error for this case is found to be larger than that for the case of ramp up at 50 K/s. Figures 11 and 12 are optimal heat flux and temperature error, respectively, for the case where the target temperature Tf(t) is maintained at 550 K for 1 s, ramp up at 100 K/s for 3 s and held at 850 K for 2 s. Figure 11 shows that the KLG-CG and the FDM-CG yield optimal heat flux profiles which are almost the same. The resulting temperature field is analyzed in Figure 12. Figure 12a shows the temperature tracking at the center of the wafer. Although spatial uniformity in the temperature field is maintained through the whole process, slight deviation from the target Tf(t) exists especially when the slope of Tf(t) changes at t ) 1 s and 4 s (cf. Figure 12b). It is easily expected that the requirement of CPU time for the case of KLG-CG is much less than that for the case of FDM-CG, since the degree of freedom of the former is only about 1/20 of the latter. In fact, when an Ultrasparc workstation is used, the CPU times requirement for the FDM-CG and the KLG-CG are about 450 and 42 min, respectively. Since the difference in the degree of freedom between the low-dimensional dynamic models and the original partial differential equations shall become much larger as the dimensionality of the problem increases, the reduction of CPU time with the use of KLG-CG instead of FDM-CG shall be much more significant in the case of the multidimensional problems. The CPU time required to construct the low-dimensional model is not significant compared with that needed for the FDM-CG. The preparation of the snapshots requires solving the heat conduction equation 11 times from t ) 0 to t ) tf, which consumes about 3 min. The remaining steps in the KarhunenLoe`ve-Galerkin procedure consume another 11 min. All in all the construction of the low-dimensional dynamic model requires about 14 min, which is much less than the 450 min needed in the iteration of the FDM-CG. Moreover, once the low-dimensional dynamic model is constructed, it can be used repeatedly for similar problems.

3974

Ind. Eng. Chem. Res., Vol. 38, No. 10, 1999

Figure 11. Optimal heat up: a 1 s hold at 550 K, a 100 K/s heat up for 3 s and a 2 s hold at 850 K. Heat flux profiles obtained either by the FDM-CG or by the KLG-CG are shown at t ) 1.0 s, 3.0 s, and 6.0 s. Three-dimensional plot of q(r,t) is also depicted.

hunen-Loe`ve-Galerkin procedure, which is a Galerkin method employing the empirical eigenfunctions of the Karhunen-Loe`ve decomposition as trial functions, limits the function space considered to the smallest linear subspace that is sufficient to describe the observed solutions and consequently reduces the heat conduction equation to a minimal set of ordinary differential equations. The optimal control scheme employing the low-dimensional dynamic model of the KarhunenLoe`ve-Galerkin procedure, called the KLG-CG, is compared with the traditional technique (called the FDMCG) and is found to yield accurate results at a drastically reduced computational cost. Nomenclature

Figure 12. (a) Temperature tracking at the center for the case of Figure 11 and (b) temperature error for the case of Figure 11.

Conclusion In RTP of semiconductor wafer, precise control of wafer temperature is required throughout the process cycle to minimize dopant redistribution as well as wafer warpage. The spatial uniformity of temperature necessary during heat up or cool down of the wafer is ensured by an appropriate scheme of boundary optimal control. The efficient implementation of the boundary optimal control requires a low-dimensional dynamic model which simulates the system accurately. The KarhunenLoe`ve-Galerkin procedure is employed to derive a lowdimensional dynamic model for this purpose. The Kar-

ai ) spectral coefficient premultiplying the ith empirical eigenfunction aTi ) spectral coefficient given by eq 67 Cnk ) matrix defined in eq 14 Cp ) heat capacity d ) conjugate direction vector Fi-j ) view factor between the ith and the jth zones Hij ) matrix defined in eq 42 he ) effective heat transfer coefficient at the edge of the wafer h′w ) effective heat transfer coefficient at the lower surface of the wafer h′′w ) convective heat transfer coefficient at the upper surface of the wafer hw ) defined as (h′w + h′′w)/2 J ) performance function defined in eq 34 K(x,x′) ) two-point correlation function hw ) defined as (h′w + h′′w)/2 k ) thermal conductivity M ) number of temporal shape functions employed in representing the heat flux function Mj ) vector defined in eq 40 N ) number of spatial shape functions employed in representing the heat flux function NT ) total number of empirical eigenfunctions employed in the Karhunen-Loe`ve-Galerkin procedure Pk ) total radiation from the kth lamp

Ind. Eng. Chem. Res., Vol. 38, No. 10, 1999 3975 q(r,t) ) radiative heat flux into the wafer qi ) net outward radiative flux (eq 25) R ) radius of the wafer Ri ) radiosity from the ith zone T ) deviation temperature ()T′ - Ta) T′ ) actual temperature Ta ) ambient temperature Ti ) initial temperature tf ) final process time vn(x) ) a function, or a snapshot, in L2(Ω) {vn} ) ensemble of snapshots 〈vn〉 ) ensemble average of snapshots z ) thickness of the wafer Greek Symbols R ) parameter vector defined in eq 36 Γ ) upper surface of the wafer θ ) sensitivity function defined in eq 53 F ) optimal step length in the conjugate gradient method defined in eq 63 and 74 Fi ) reflectivity at the ith zone φi(r) ) ith empirical eigenfunction ψm(t) ) mth temporal shape function ψn(x) ) nth spatial shape function Ω ) system domain

Literature Cited (1) Park, H. M.; Cho, D. H. The Use of the Karhunen-Loe`ve Decomposition for the Modeling of Distributed Parameter Systems. Chem. Eng. Sci. 1996, 51, 81. (2) Campbell, S. A. The Science and Engineering of Microelectronic Fabrication; Oxford University Press: Oxford, 1996. (3) Norman, S.; Schaper, C.; Boyd, S. Improvement of Temperature Uniformity in Rapid Thermal Processing Systems Using

Multivariable Control. In Material Research Society Proceedings: Rapid Thermal and Integrated Processing; Materials Research Society: 1991; Vol. 224. (4) Loe`ve, M. Probability Theory; Van Nostrand: Princeton, N.J., 1955. (5) Park, H. M.; Lee, J. H. A Method of Solving Inverse Convection Problems by Means of Mode Reduction. Chem. Eng. Sci. 1998, 53, 1731. (6) Park, H. M.; Lee, M. W.; Jang, Y. D. An Efficient Computational Method of Boundary Control Problems for the Burgers Equation. Comput. Methods Appl. Mech. Eng. 1998, 166, 289. (7) Theodoropoulou, A.; Adomaitis, R. A.; Zafiriou, E. Model Reduction for Optimization of Rapid Thermal Chemical Vapor Deposition Systems. IEEE Trans. Sem. Manuf. 1998, 11, 85. (8) Baker, J.; Christofides, P. D. Nonlinear Control of Rapid Thermal Chemical Vapor Deposition Under Uncertainty. Comput. Chem. Eng. 1999, 23, 233. (9) Courant, R.; Hilbert, D. Methods of Mathematical Physics; Interscience Publishers, Inc.: New York, 1953, Vol. 1. (10) Roozeboom, F.; Parekh, N. Rapid Thermal Processing System: A Review with Emphasis on Temperature Control. J. Vac. Sci. Technol. 1990, B8, 1249. (11) Dettart, W. Microelectronic Man. Tchnol. 1991, 14, 44. (12) Moslehi, M. M.; Kuehne, J.; Yeakley, L.; Velo, H.; Najm, B.; Dostalik, B.; Yin, D.; Davis, C. J. In-situ Fabrication and Process Control Techniques in Rapid Thermal Processing. Materials Sci. Symp. Proc. 1991, 224, 143. (13) O ¨ zisik, M. N. Radiative Transfer and Interactions with Conduction and Convection; Wiley: New York, 1972. (14) Fletcher, R.; Reeves, R. M. Function Minimization by Conjugate Gradients. Comput. J. 1964, 7, 149.

Received for review January 5, 1999 Revised manuscript received July 1, 1999 Accepted July 9, 1999 IE990017U