Hybrid Mathematical Programming Discrete ... - ACS Publications

results showing better performance when compared to other published approaches. .... An improvement-based MILP optimization approach to complex AW...
0 downloads 0 Views 5MB Size
ARTICLE pubs.acs.org/IECR

Hybrid Mathematical Programming Discrete-Event Simulation Approach for Large-Scale Scheduling Problems Pedro M. Castro,*,† Adrian M. Aguirre,‡ Luis J. Zeballos,†,§ and Carlos A. Mendez‡ †

Unidade Modelac-~ao e Optimizac-~ao de Sistemas Energeticos, Laboratorio Nacional de Energia e Geologia, 1649-038 Lisboa, Portugal INTEC (Universidad Nacional del Litoral  CONICET), Santa Fe, Argentina § Universidad Nacional del Litoral  Facultad de Ingeniería Química, Santa Fe, Argentina ‡

ABSTRACT: This article presents a new algorithm for industrially sized problems that, because of the large number of tasks to schedule, are either intractable or result in poor solutions when solved with full-space mathematical programming approaches. Focus is set on a special type of multistage batch plant featuring a single unit per stage, zero-wait storage policies, and a single transportation device for moving lots between stages. The algorithm incorporates a mixed-integer linear programming (MILP) continuous-time formulation and a discrete-event simulation model to generate a detailed schedule. More precisely, three stages are involved: (i) finding the best processing sequence, assuming that the transportation device is always available; (ii) generating a feasible schedule, taking into account the shared transportation resource; (iii) improving the schedule through a neighborhood search procedure. Relaxed and constrained versions of the full-space MILP are involved in stages (i) and (iii) with the simulation model taking care of stage (ii). Several examples are solved to illustrate the capabilities of the proposed method with the results showing better performance when compared to other published approaches. The balance between solution quality and total computational effort can easily be shifted by changing the number of lots rescheduled per iteration.

1. INTRODUCTION The industries of today are immersed in highly competitive global markets that continuously demand enhanced performance in key indicators such as customer satisfaction, manufacturing and inventory holding costs, and delivery times. Those facing challenging production environments easily recognize the opportunity for improvement from advanced automated scheduling tools, even if it is sometimes difficult to accurately estimate the impact improved scheduling can provide.1 Yet, the most popular tool is still Microsoft Excel, with experienced individuals deriving the schedules manually. This is probably due to the uniqueness of every scheduling problem that has held back the development of a truly general approach. Despite the significant developments in modeling techniques that have occurred in the last two decades, new academic developments are mostly tested on laboratory-scale examples, with many focusing on the solution of variants of the problems first described by Kondili et al.2 Scheduling problems from multipurpose batch plants are indeed quite complex, because of the arbitrary network configuration of equipment resources, the need to account for multiple materials that will be transformed and combined into the final products, and a variety of storage policies, to name a few. The decisions involved concern batch sizing, as well as assignment of batches to equipment units and their sequencing.3,4 These decisions are highly interconnected, making it difficult to develop efficient decomposition approaches, which must be systematic, featuring tunable parameters that allow one to maintain the number of decisions at a reasonable level for varying problem sizes, in order to be attractive to real-world applications. Real-life problems should also take advantage of state-of-the-art scheduling models that have come quite close to rigorously handle all the features that may be encountered at a process plant. The main goal of r 2011 American Chemical Society

optimization-based large-scale scheduling approaches is to be able to generate good schedules with modest computational effort. In contrast to more academic research with so-called “full-space” models, guaranteeing optimality looses importance, primarily because only a subset of the scheduling goals are actually taken into account and implementing the output schedule as such is often limited by the dynamic nature of industrial environments. Floudas and co-workers57 have effectively tackled the largescale scheduling problem of multipurpose batch plants with shared equipment units, using a decomposition approach based on the well-known unit-specific continuous-time formulation by Ierapetritou and Floudas.8 The industrial case study from ATOFINA Chemicals5 featured 35 different products, resulting from a basic three-stage recipe involving the sharing of 10 pieces of equipment. The larger case study from BASF6,7 involved the recipes of hundreds of products and over 80 pieces of equipment and required an upgrade of the previous approach. In turn, Castro et al.9 have examined a case study from one of ABB’s customers that was concerned with reducing the energy bill under volatile electricity availability and pricing. The proposed decomposition algorithm combined Resource-Task Networkbased discrete and continuous-time10 models. Scheduling applications at Dow Chemical1,11 have been relying exclusively on discrete-time models. In most multistage batch plants, lot identity is maintained throughout the processing stages, making it easier to decouple batching from assignment and sequencing decisions. Rosl€of et al.12 Received: April 19, 2011 Accepted: August 11, 2011 Revised: July 22, 2011 Published: August 11, 2011 10665

dx.doi.org/10.1021/ie200841a | Ind. Eng. Chem. Res. 2011, 50, 10665–10680

Industrial & Engineering Chemistry Research proposed a systematic method for large-scale problems with a single unit to either derive a schedule from scratch, through a constructive procedure, or reschedule toward optimality. Working with a continuous-time model with sequencing variables, they insert new (or release a subset of) lots while maintaining the relative position of the other lots in the schedule, thus successfully keeping complexity at a manageable level. In other words, the user can somewhat balance solution quality with total computational effort through the chosen number of lots to reschedule per iteration. Mendez and Cerda13 extended the work to allow changing the assignment decisions in problems with units in parallel. The selection of lots to (re)schedule followed an increasing slack time rule. Castro et al.14,15 also tested the earliest due date preordering heuristic with algorithms that allow one to choose between time slot and sequencing variables16 models for the iterative search procedure. A few alternative unit-specific, time slot models could have been used,1719 but the minimum event points approach by Castro and Grossmann17 ensured the requirement of a single time slot per lot and per stage. All four exact methods1619 can readily be tested in the CMU-IBM Cyberinfrastructure for MINLP (http://minlp.org). A comparison between alternative models relying on different time representations is also given in Mouret et al.,20 for multistage as well as crude-oil operations scheduling problems. Approaches other than mathematical programming have been reported in the literature. He and Hui21,22 have used genetic algorithms for single-stage batch problems with sequencedependent changeovers in examples up to 100 lots and 8 parallel units. Gravel et al.23 used ant colony optimization to obtain feasible schedules for an aluminum casting plant, while Engell and co-workers24,25 employed timed automata models. Constraint programming (CP) is another valid alternative, particularly in problems dealing with makespan minimization and when coupled with a problem specific search strategy.2628 In turn, Bhushan and Karimi29 used simulated annealing and heuristic algorithms to address the problem considered in this work, involving an automated wet-etching station from semiconductor manufacturing. Combining the strengths of conceptually different methods is unfortunately far less common. Perhaps the best known success case for scheduling problems concerns hybrid MILP/ CP algorithms. For single-stage problems, Jain and Grossmann30 acknowledged the two decision levels involved—lot-unit assignment and lot sequencing on every unit—and proposed a relaxed (without the sequencing constraints) MILP master problem to determine the potential optimal assignments, which was followed by the solution of CP feasibility subproblems to check if the assignments could indeed be sequenced on every unit. Integer cuts were then added to the relaxed problem, to avoid infeasible solutions in subsequent iterations, and orders-ofmagnitude savings in computational time were observed when compared to standalone MILP and CP approaches. Harjunkoski and Grossmann16 extended the algorithm to multistage problems, only to discover that the straightforward-to-derive integer cuts were much weaker and that more-efficient cuts were not rigorous, in the sense that the optimal solution could be removed from the feasible space. It was thus no longer possible to solve sequencing problems separately, since the stages and machines are connected. Later, Maravelias and Grossmann31 had some success for the more-complex multipurpose plant structure, using a slot-based model instead of a sequencing-variables continuous-time model. All the proposed cuts were rigorous,

ARTICLE

Figure 1. Scheduling algorithm combining mathematical programming and discrete-event simulation.

and the results for some classes of problems showed that the hybrid algorithm was orders of magnitude faster than the standalone MILP model. In this paper, we propose combining mathematical programming and discrete-event simulation for the large-scale solution of a class of multistage multiproduct scheduling problems arising in semiconductor manufacturing. Previous works addressing this problem3234 have reported that the complexity lies with the sequencing of the transportation tasks that are performed by a single robot and proposed two-stage algorithms to reduce the number of solutions involved and, hence, allow the solution of larger problems. Unlimited robot availability was assumed in the first step and a relaxed MILP model was solved to determine the optimal processing sequence. This was then fixed in the second step, and a constrained MILP model was solved to determine the optimal sequence of transportation tasks. Overall, only problems featuring an extra few orders could be still be tackled, showing that other approaches were needed for sizes beyond 12 lots in 12 units.32 Of the three conceptually different continuous-time models, the one by Castro et al.32 exhibited the best performance. Therefore, it is the one that is used in this work and is described in section 3, following problem definition in section 2. In the novel algorithm, the concern no longer involves determining the optimal solution but rather very good solutions in a reasonable amount of time (1 h). The decomposition strategy is somewhat related but now involves three stages (see Figure 1). In the first, we find a near-optimal sequence under unlimited robot availability by applying neighborhood search procedures3537 to an initial sequence, using the relaxed MILP model. Such rescheduling procedures use the ideas from our previous work,1315 where a key aspect is the presence of the number of lots per iteration as a tunable parameter to balance 10666

dx.doi.org/10.1021/ie200841a |Ind. Eng. Chem. Res. 2011, 50, 10665–10680

Industrial & Engineering Chemistry Research

Figure 2. Schematic of an automated wet-etching station (AWS).

solution quality and total computational effort. Now, however, lot selection is done randomly, instead of through the use of preordering heuristics. The best found sequence is then the input to the discrete-event simulation model in stage 2, which is able to generate a good feasible solution to the one robot problem. This will be the starting point of the rescheduling procedure applied to the more-complex MILP, featuring the robot sequencing constraints. In stage 3, multiple runs are performed using the same settings to generate statistical data. The actions that can be performed in terms of domain reduction, with respect to the full-space models, are discussed in section 4, while section 5 focuses on the description of the discrete-event simulation model. The algorithm overview is then the subject of section 6, with section 7 briefly describing an integrated constraint programming approach that has solved the problem at hand. This is featured in the computational results (section 8), while the conclusions are presented in section 9.

2. PROBLEM DEFINITION Consider the automated wet-etching station (AWS) in Figure 2, where a set I of wafer lots must go through a series of chemical and water baths from the input to the output buffer. A total of M units must be considered, with the last unit being the output buffer. The transportation system for moving lots from one bath to the next is assumed to consist of a single robot that can carry one lot at a time and cannot be used as a temporary buffer (i.e., the given transportation times to unit m (πm) must be strictly met). Similarly, the holding time in chemical baths must be equal to the given processing time (pi,m), which is known as a zero-wait (ZW) policy. In contrast, overexposure to water beyond the processing time is safe, so the water baths can act as local storage (LS). The objective will be to determine the sequencing of lots at the baths and also at the robot, to minimize the makespan. 3. FULL-SPACE MILP MODELS Castro et al.32 proposed a new MILP model that relies on the concepts of time slots for the processing tasks and sequencing variables for the transportation tasks. Since it exhibited a better performance than the closely related model by Bhushan and Karimi,33 and the pure sequencing variables model by Aguirre et al.,34 it is the underlying model in the mathematical programming part of the proposed algorithm. To keep the complexity at a manageable level and allow the solution of large-scale problems in reasonable time, several iterations will be involved, each

ARTICLE

dealing with a highly constrained version of the full-space model given next, generated by fixing a subset of the binary variables. 3.1. Relaxed One-Robot Model (R-ORM). A relaxed version of the one-robot model is obtained by neglecting the robot sequencing constraints, i.e., assuming that the robot is always available to perform the transfers. This allows for a major reduction in the number of variables and constraints, making the problem much easier to solve. Indeed, the so-called “R-ORM”32 has been the basis of two-stage heuristic procedures3234 that first determine the optimal sequencing of processing tasks and then find the optimal sequence of the transportation tasks for such a processing sequence. The size of the tractable problems can thus be slightly increased at a small penalty in solution quality (from using a decomposition strategy, instead of a full-space approach). Consider |M| time grids with t ∈ T slots each, where |T| = |I|. Given the fact that plant topology and storage constraints prevent lot sequencing from changing from one unit m to the next, the binary variables Ni,t, simply assign lot i to slot t. The starting time of slot t in the grid linked to unit m is given by Tt,m, while Tet,m gives the end of processing in water bath units (mˇZW) and MS defines the makespan. Equation 1 defines the objective function, makespan minimization. ð1Þ

min MS

Equations 2 and 3 enforce that a slot holds exactly one wafer lot.



Ni, t ¼ 1

"i∈I

ð2Þ

∑ Ni, t ¼ 1

"t∈T

ð3Þ

t∈T

i∈I

The duration of slot t in chemical bath unit m must be greater than the processing time of the lot assigned to the slot plus transfer times to unit m and m + 1, since the next lot in the sequence can only be transferred from unit m  1 to unit m after i arrives at m + 1 (eq 4). Ttþ1, m g Tt, m þ

∑ Ni, t pi, m þ πmþ1 þ πm

i∈I

" m ∈ ZW, t ∈ T, t 6¼ jTj

ð4Þ

The same applies to eq 5; however, in this case, the starting time of slot t + 1 in water bath unit m is related to the ending time of the previous slot in the same unit. Ttþ1, m g Tet, m þ πmþ1 þ πm

ð5Þ

" m ∈ LS, t ∈ T, t 6¼ jTj

Equation 6 states that the starting time in water bath unit m + 1 must equal the starting time of the same slot (handling the same lot) in the previous chemical unit plus its processing time in m and transfer time to m + 1. Tt, mþ1 ¼ Tt, m þ

∑ Ni, t pi, m þ πmþ1

i∈I

" m ∈ ZW, t ∈ T

ð6Þ

In contrast, the end of processing in water bath units may go beyond the lot’s processing time (eq 7). Tet, m g Tt, m þ 10667

∑ Ni, t pi, m

i∈I

" m ∈ LS, t ∈ T

ð7Þ

dx.doi.org/10.1021/ie200841a |Ind. Eng. Chem. Res. 2011, 50, 10665–10680

Industrial & Engineering Chemistry Research

ARTICLE

Note that, while the combination of eqs 5 and 7 is equal to 4, we need to separate the ZW and LS cases and define the variable Tet,m for the latter, to ensure that longer tasks occur during processing and not during transportation. This is apparent from eq 8, where the ending time of slot t in unit m plus transfer to unit m + 1, is equal to the starting time of the same slot in unit m + 1. Tt, mþ1 ¼ Tet, m þ πmþ1

" m ∈ LS, t ∈ T

necessarily equal to 1 (YD1); or (iii) equal to 0 (YE0) (through eqs 1517). YE0 ¼ fðt, m, t 0 , m0 Þ : t, t 0 ∈ T, t 0 > t, m, m0 ∈ M, m 6¼ m0 , UBm, t e LBm0 , t0 g

ð8Þ

Equation 9 ensures that the starting time of the first slot linked to the first chemical bath is greater than the transfer time from the input buffer. T1, 1 g π1

ð9Þ

Since there is no processing time in the output buffer (pi,|M| = 0), the starting time of its last slot defines the makespan (eq 10). TjTj, jMj e MS

ð10Þ

Finally, eq 11 makes the formulation more efficient by relating the makespan to the starting, processing, and transfers times of the active lot in/to subsequent units, for every bath unit m and slot t. Notice that model R-ORM involves no big-M constraints. Tt, m þ

∑ Ni, t

i∈I



" m ∈ M, m 6¼ jMj, t ∈ T

ð12Þ

Let us assume that LBm,t and UBm,t give the lowest/highest possible interval for transferring the lot in position t (concerning the bath sequence) to unit m, which can be calculated by eqs 13 and 14, on a hypothetical robot grid.32 LBm, t ¼ t0 0



∈T t et

0



1

" m ∈ M, t ∈ T

ð13Þ

m ∈M m0 e m þ t  t 0

UBm, t ¼ jIj 3 jMj þ 1  0



0



t ∈T

m ∈M

t0 g t

m0 g m þ t  t 0

1 ð14Þ

" m ∈ M, t ∈ T Note that these include nested sums of a constant, 1. We can then determine the subsets that represent the cases where variables Y t,m,t0 ,m0 are (i) not necessarily equal to 0 (YD0), (ii) not

YD1 ¼ fðt, m, t 0 , m0 Þ : t, t 0 ∈ T, t 0 > t, m, m0 ∈ M, m 6¼ m0 , UBm0 , t 0 > LBm, t g

ð17Þ

Tt, m g Tt0 , m0 þ πm  H 3 ð1  Y̅ t, m, t 0 , m0 Þ

ð18Þ

Tt0 , m0 g Tt, m þ πm0  ðH 3 Y̅ t, m, t0 , m0 Þjðt, m, t0 , m0 Þ ∈ YD0

ð19Þ

" ðt, m, t 0 , m0 Þ ∈ YD1

32

" t, t 0 ∈ T, t 0 > t, m, m0 ∈ M, m 6¼ m0

ð16Þ

" ðt, m, t 0 , m0 Þ ∈ YD0

ð11Þ

3.2. One-Robot Model (ORM). The most efficient way to model the transportation tasks is through general precedence sequencing variables3 that relate different unit-slot pairs, such as that in the work of Bhushan and Karimi.33 This allows for a significant reduction in the domain of the binary variables and sequencing constraints right from the start, whereas, in the lot-unit sequencing variables model by Aguirre et al.,34 we first must find the processing sequence before applying conceptually similar reducing space constraints. The exact definition of the binary sequencing variables is given in eq 12, where not all possible combinations must be considered. ( 1 if slot t in unit m starts after slot t 0 in m0 Y̅ t, m, t0 , m0 ¼ 0 otherwise

YD0 ¼ fðt, m, t 0 , m0 Þ : t, t 0 ∈ T, t 0 > t, m, m0 ∈ M, m 6¼ m0 , UBm, t > LBm0 , t 0 g

In particular, if UBm,t e LBm0 ,t0 , then Tt,m < Tt0 ,m0 and we know that Y t,m,t0 ,m0 = 0, so such variables can be removed from the formulation. When compared to previous work,32 we have found that the subset of variables that are equal to 1 is an empty set, and that YD1 = YD0 ∪ YE0. The big-M sequencing constraints that prevent the simultaneous transfer of lots by the robot are given by eqs 18 and 19, where the big-M is the time horizon H, which is an upper bound on the makespan.

ðpi, m0 þ πm0 þ1 Þ e MS

m0 ∈ M m0 g m ∧ m0 6¼ jMj

ð15Þ

3.3. Remarks. The models just shown correspond to the single shared resource special case of the unlimited and multiple robot models given in the work of Castro et al.32 The reason why we have considered the single-robot scenario for the extensive computational studies is that feasible solutions are more difficult to find, as previous results32 have shown. The AWS model can be made to interact with the upstream and downstream parts of the process through the addition of release and due dates constraints, which are relatively straightforward to implement, as can be seen for a closely related problem.15 Other possible upgrade concerns the incorporation of the travel times of the empty robot that, in reality, are a function of initial and final locations. This can be seen as a sequence-dependent changeover task requiring the robot resource, which can be considerably more difficult to implement.

4. EFFECTIVE RESCHEDULING THROUGH NEIGHBORHOOD SEARCH Because of the rapid increase of combinatorial complexity, fullspace models are only useful up to a certain problem size, which is usually well below the size encountered in real-world applications. However, one can develop systematic decomposition strategies that, by solving highly constrained versions of the model, allow the number of decisions to be maintained at a reasonable level, even for large-scale problems. More specifically, we will fix the value of most lot-slot binary assignment variables and binary sequencing variables, starting from an initial schedule. This generic search procedure has been called neighborhood search,35,36 solution polishing,37 and local branching.38 Neighborhood search methods have been quite successful in logistics problems.39,40 They have been classified35 as a new metaheuristic, where the main element, the local search routine, 10668

dx.doi.org/10.1021/ie200841a |Ind. Eng. Chem. Res. 2011, 50, 10665–10680

Industrial & Engineering Chemistry Research

ARTICLE

Figure 3. Illustration of the rescheduling algorithm for model R-ORM and three lots at a time (NOS = 3).

can range from a simple heuristic to a heavyweight technique, such as constraint programming,40 or to a highly constrained version of the full-space MILP model, such as that in this work. Although much fewer moves per second can be evaluated with the latter, they can be substantially more powerful, moving the search further at each step. This aspect becomes particularly important when in the presence of real-world problems with many constraints, where most simple move operations will be illegal, because of the violation of such constraints. Indeed, effectively addressing these side constraints has been identified by Shaw40 as being the challenging part of the algorithm. In our proposed algorithms, this is related to the selection of the binary variables to free for the generation of the constrained MILP model, which is done using knowledge about the problem structure. The following algorithms are heavily based on the work by Castro et al.,14 despite the fact that the multistage problem under consideration is slightly different. Although the zero-wait policies in the chemical bath units and the existence of the shared transportation device complicate the problem, the absence of parallel processing units eliminates the need to postulate more time slots than those required to represent the schedule (it is now equal to the number of wafer lots, |I|, in all iterations). The selection of lots to reschedule on a particular iteration is now made randomly. 4.1. Model R-ORM. If robot availability is neglected, the number of binary variables from model R-ORM (|I|2) remains in the hundreds, even for the largest size under consideration, featuring 25 lots. The associated decisions involve the assignment of every lot to a single time slot in the grid, which implicitly define lot sequencing (the same in every unit). In other words, all binary variables can be fixed, given the lot sequencing, and the corresponding optimal schedule can be found almost instantaneously by solving a linear program. Since this is just the first step in the search for the best schedule for the one robot problem and the initial schedule can significantly be improved within less than a minute of computational time, we start with the lexicographic sequence (see Figure 3). The next decision is the number of lots to reschedule per iteration (parameter NOS). The higher the number, the closer to the full-space model, meaning potentially better solutions and higher computational time. Let J be the set of iterations and Ij the subset of lots to be rescheduled in iteration j. In Figure 3, the chosen lots for iteration 1 are I2, I3, and I5 (NOS = 3), which are

Figure 4. Rescheduling algorithm for model R-ORM.

highlighted by a dashed outline within the boundaries of j = 0. Such lots will be allowed to change positions among themselves while the others remain allocated to the same slot. In this case, I3 moved to slot t = 2, I5 to t = 3, and I2 to t = 5. The algorithm then decides to pick I1, I3, and I6 for iteration j = 2 and the outcome is I1 and I6 changing places, with I3 remaining in slot t = 2. Clearly, a minimum value of NOS = 2 is required for an improvement to occur, whereas, in problems with parallel units,14,15 one can find a better schedule already for NOS = 1. The rescheduling algorithm continues up to the chosen number of iterations |J|, which leads to an I2I3I6I1I5I4 sequence. The detailed algorithm is given in Figure 4. The first aspect worth mentioning is the use of a proper seed for random number generation, which can be done based on the milliseconds of the CPU clock at the starting time. This will ensure different outcomes from the execution of the algorithm. Since we are looking for a positive integer value in the interval [1, |I|], we round up the outcome of a uniform distribution between 0 and |I| (check the “selecting lots to reschedule” box in Figure 4). The lots with the same index of the NOS random numbers linked to iteration j will be the elements of set Ij. Naturally, it becomes more likely to pick the same number more than once for a larger NOS value, leading to more iterations with |Ij| < NOS. 10669

dx.doi.org/10.1021/ie200841a |Ind. Eng. Chem. Res. 2011, 50, 10665–10680

Industrial & Engineering Chemistry Research

ARTICLE

Figure 5. Domain of transportation tasks for a simple example featuring |I| = 3 and |M| = 3.

Figure 6. Illustration of the rescheduling algorithm for model ORM and two lots at a time (NOS = 2).

The next step is to identify the slots that can get a new tenant, subset Tj, which are the ones housing lots i∈Ij (known from the current level of the binary variables, Ni,t.l). Such lots can freely be assigned to any of such slots leading to the definition of set Ifree t,j . On the other hand, lots not selected for rescheduling will be fixed fixed . These latter sets will then be used to their current positions, It,j to set the lower (attribute lo) and upper bounds (attribute up) of the binary variables: (i) no domain reduction is possible for (i, t) pairs with i∈Ij and t∈Tj (e.g., NI1,t=1,2,6.lo = 0, NI1,t=1,2,6.up = 1, for j = 2; see Figure 3); (ii) nonselected lots in their current slots are fixed to one (e.g., NI5,t=3.lo = NI5,t=3.up = 1); (iii) the remaining combinations are eliminated from the model by making the bounds equal to zero (e.g., NI5,t=1,2,4,5,6.lo = NI5,t=1,2,4,5,6.up = 0, NI1,t=35.lo = NI1,t=35.up = 0, for j = 2). The highly constrained model R-ORM can then be solved to optimality to determine if a better solution can be found. Note that the current best solution has not been removed from the feasible space, so the resulting problem is always feasible. It is also possible for the assignments to change even if no improvement is observed in the objective function, provided that there are degenerate solutions. The iterative procedure continues until the last iteration. It is naturally straightforward to implement other termination criteria, such as total computational time or number of iterations without improvement. Furthermore, a better seed selection mechanism can be introduced by amending a tabu list to avoid cycling over repetitive seeds. However, one should remove previous seeds from the list every time a new solution is found, since this outcome will change the MILP problem being solved. In other words, the same seed can lead to further improvements in subsequent iterations, as was observed experimentally. 4.2. Model ORM. With the addition of the sequencing constraints for the robot, the rescheduling algorithm becomes more complex, since it is necessary to release the binary sequencing variables associated to the chosen lots to reschedule, so that

the schedule can be improved. Again, we start from an initial schedule, but now, because of the zero-wait policy at the chemical baths and the strict transfer times by the robot, we can no longer easily generate a feasible schedule, given the processing sequence at the baths. We will have to rely on the discrete-events simulation model to provide the initial schedule from which the sequence of transportation tasks executed by the robot can be extracted. Notice that a total of |I| 3 |M| transportation tasks are involved. Consider a hypothetical robot time grid to keep track of the transportation tasks. This is illustrated in Figure 5 for a simple example consisting of three lots and two baths (plus the output buffer) that also gives the necessary information to generate the domain of the binary sequencing variables.32 Notice that the domain is quite narrow, since the uncertainty just occurs for slots 3, 4, 6, and 7 of the robot grid. Thus, only variables Y 1,3,2,1 and Y 2,3,3,1 cannot be fixed a priori (YD0 = {(1,2,3,1),(2,3,3,1)}); in Figure 5, the choices leading to a value of 1 are shown within dashed rectangles. For the same example, assume that, in the initial schedule, lots go through the processing units according to the lexicographic sequence, while the transportation sequence is given in Figure 6. Note that, contrary to the choice highlighted in Figure 5, the transfer task of the lot processed in slot t = 2 (I2) to unit M3 now occurs before the transfer linked to slot t = 3 (I3) and unit M1. Supposing that the chosen lots for rescheduling in the first iteration are I1 and I2 (NOS = 2), the constrained ORM model will feature 6 binary variables, as opposed to the 11 of the full-space model. Moreover, this reduction is exclusively due to binary variables Ni,t since the two sequencing variables Y t,m,t0 ,m0 mentioned earlier cannot be fixed. More specifically, we allow for I1 and I2 to occupy either slots t = 1 or t = 2 of the processing unit’s grids (not the robot grid), leading to 22 = 4 instead of the 32 = 9 possibilities for the 3 lots in the fullspace model. 10670

dx.doi.org/10.1021/ie200841a |Ind. Eng. Chem. Res. 2011, 50, 10665–10680

Industrial & Engineering Chemistry Research It is important to emphasize that the transportation tasks of fixed lots can still change positions in the robot grid, provided that one of the lots to be rescheduled is immediately before or after in the current processing sequence. This is always the case for this simple example, meaning that none of the sequencing variables belonging to domain YD0 can be fixed. However, for growing |I|/NOS values, it will be possible to fix the values of more and more such variables to their current ones, thus helping to keep combinatorial complexity at a manageable level. Returning to the example, transportation tasks (I3,M2) and (I3,M3) cannot be moved (see the solid outline in the j = 0 row of Figure 6) but (I3,M1) will be allowed to move (dashed outline in Figure 6), so that either task (I2,M3) or (I1,M3) can be assigned to robot slot 7, which may lead to a better schedule. Indeed, the outcome of iteration j = 1 is a change of sequence from I1I2I3 to I2I1I3, with (I3,M1) moving to robot slot 6 (Y 2,3,3,1 switched values from 0 to 1). Notice that all seven transportation tasks that could change places did so (shown in color in Figure 6). Lots I2 and I3 are picked for iteration j = 2, meaning that it is possible to change the robot grid assignments of all transportation tasks but (I1,M2) (recall Figure 5). Again, there is a change in the processing sequence, to I3I1I2, with none of the I1 transportation tasks changing places, with respect to the robot grid (Y 1,3,2,1 = Y 2,3,3,1 = 1). The detailed rescheduling algorithm for the one-robot model (ORM) is given in Figure 7. The actions related to binary variables Ni,t are taken from Figure 4 and therefore has the “choose lots to reschedule” routine. We thus focus the discussion on the domain of the binary sequencing variables. Given are the position of lot i in the processing sequence (posi) and the assignments in terms of the slot of the hypothetical robot grid where the transfer to unit m of the lot being processed in slot t of the bath time grid occurs (slott,m). Binary variables Y t,m,t0 ,m0 that can take values different than zero, (t,m,t0 ,m0 )∈YD0, are fixed to 1 if slott,m > slott0 m0 or 0, otherwise. Then, according to eqs 18 and 19, all binary sequencing variables have been fixed and we can solve the LP resulting from model ORM. At the start of every iteration, all sequencing variables of the full-space model are reset to zero. After selecting the lots to reschedule in iteration j (Ij), the set of processing unit slots that can get a new lot can be generated (Tj). Binary variables Y t,m,t0 ,m0 that can take values different than zero become the members of 0 subset Yfree j , provided that either t or t ∈Tj. Otherwise, and if their current values are equal to 1, indices (t,m,t0 ,m0 ) become members . These two subsets are used in the next block of the subset Yfixed j to make some variables free {0,1}, while the others are fixed to 1, respectively. The highly constrained MILP resulting from model ORM is then solved iteratively until the last iteration, from which the final schedule results.

5. DISCRETE-EVENT SIMULATION MODEL Nowadays, modern discrete-event simulation software has become a very attractive and powerful tool for modeling, analyzing, and evaluating the impact of different solution strategies on industrial-scale scheduling problems. A major advantage of this computer-aided methodology is that it allows complex manufacturing processes to be represented in an abstract and integrated way, visualizing the dynamic behavior of its constitutive elements over time (see the work of Banks et al.41). The proposed simulation model developed here systematically reproduces the complex dynamic behavior of AWS, represented by

ARTICLE

Figure 7. Rescheduling algorithm for model ORM.

a sequence of successive chemical and water baths and a unique resource for automated transfer of jobs in the system. The principal components and tools available in the Arena simulation environment are used to achieve the best representation of this highly constrained manufacturing system. Also, advanced templates and blocks are utilized for modeling specific operation features arising in this process. It is worth highlighting that the Arena environment has been extensively used for developing effective industrial and manufacturing applications recently.4245 The Arena discrete-event simulation software provides a complete set of tools embedded in one of the most flexible, reliable, and well-known modern simulation environments. Arena helps to easily demonstrate, predict, and measure alternative system strategies for effective, efficient, and optimized manufacturing process performance. However, note that the proposed solution strategy does not rely on the use of the Arena package: other alternative commercial simulation environments also may be used to implement the proposed methodology. Some of them include Simio, Witness, Simul8, ProModel, etc. The proposed model provides an easy way to represent the entire AWS production process by utilizing specific submodels. Each submodel includes a set of operative rules and strategic decisions, which are modeled using basic and advanced blocks from the Arena Simulation Tool. In addition, visual monitoring objects and animated screens are created to measure the 10671

dx.doi.org/10.1021/ie200841a |Ind. Eng. Chem. Res. 2011, 50, 10665–10680

Industrial & Engineering Chemistry Research

ARTICLE

Figure 8. In-progress animation of the AWS station.

Figure 9. Information exchange between Excel, Arena, and MILP software.

utilization performance of baths and resources in the system. Figure 8 shows an in-progress animation screenshot of the AWS station generated in the Arena simulation environment. Another essential feature is that the model allows one to work with an user-friendly interface, Microsoft Excel supporting Visual Basic for Applications (VBA), to simultaneously read and write production data. Using the input data of the initial solution, which is provided by the MILP model, the Arena simulation software can simultaneously read the Excel spreadsheet and run the process model to generate many important statistics that are collected by Excel files as output data. Such information, which is composed of the starting time and finishing time of jobs in different baths, is dynamically updated by the model over time, providing the detailed schedule of jobs and the activity program of the robot. A simple scheme illustrating the information exchange between Microsoft Excel, Arena Simulation Software, and the MILP-based methodology is shown in Figure 9. The proposed simulation model is divided into four main modules: input, transfer, process, and output. The first module, here called the input submodel, receives the information describing processing times pi,m and transfer times πm of every job i in every bath m as input data. Also, the initial solution of the job sequence provided by MILP model is automatically read by the discrete-event simulation model in order to generate the set of entities (job entities and transfer entities) to be scheduled in the system. Here, the advanced internal logic for every transfer entity is determined in order to generate a feasible schedule of robot activities. The next module is the transfer submodel, which simulates the time needed to transfer jobs (i = 1, ..., |I|) between successive baths (m = 1, ..., |M|  1) and from/to input/output

(m = |M|) buffers. Here, it should be noted that transfers can only be performed if the robot and the destination bath are both available at the same time. The process module is defined for every bath. The internal logic of each submodel depends on the type of bath that is scheduled, i.e., chemical baths m = 1, 3, 5, ..., |M|  2 and water baths m = 2, 4, 6, ..., |M|  1 are managed according to their particular restrictions. For water baths, the process must ensure only a minimum processing time, since a holding time is permitted if the robot or the next chemical bath are busy. For chemical baths, the residence time must be controlled strictly. In this case, the internal logic must guarantee that the transfer to the following water bath is always performed immediately after the chemical bath ends, which requires that both the robot and the destination bath must be available at this time. Both process modules save the time at which chemical and water processes begin and end in an Excel spreadsheet, updating in a dynamic way the entire schedule of the system (see Figure 10). The last module is the output submodel. This module represents the final stage of the AWS station. The output buffer is the place where all entities created in the input module exit the AWS system. At this point, the simulation model reports the final processing time (Makespan) and the current status of every job in the system. The principal objective of modeling the internal robot logic is to explicitly coordinate and efficiently synchronize the use of shared transportation resources for transferring jobs between consecutive baths. For this, the pair of transfers (i, m) and (i0 , m0 ) related to particular jobs i and i0 , in respective baths m and m0 , can never be overlapped in the robot. Thus, the sequencing problem of transfers in a single robot adopts major relevance to produce a feasible schedule for the entire system. The advanced internal logic for the robot is embedded into the model to compare and update the start time tsi,m and the end time tei,m of every transfer (i, m) in the system. The main goal here is to define the earliest time at which each transfer (i, m) can be executed. This procedure is based on the previous ideas of the JAT (job-at-a-time) algorithm developed by Bhushan and Karimi.29 Our algorithm, similar to the JAT algorithm, selects a job i to be processed and then generates and also evaluates all of its transfers (for all baths m = 1, ..., |M|  1) that will be inserted in the system. The main difference between 10672

dx.doi.org/10.1021/ie200841a |Ind. Eng. Chem. Res. 2011, 50, 10665–10680

Industrial & Engineering Chemistry Research

ARTICLE

Figure 10. Dynamic Gantt Chart schedule generated by a user-friendly Excel interface.

those algorithms arises in the transfer evaluation process. In our model, every selected transfer (i, m) is compared with all transfers (i0 , m0 ) previously inserted in the system to define a detailed feasible schedule of robot operations. For that, it is necessary to define a set of attributes [tsi,m; tei,m] and [tsi,m+1; tei,m+1], for each transfer (i, m) in order to determinate a proper sequence of transfers over time, avoiding infeasible solutions for the future transfers of the same job i, e.g., transfer (i, m + 1). For this, we define w as the position of job i in the job sequence. That means that a job processed in the wth position will be always assigned before a job processed in position w + 1. The simplified logic for the robot allocation is summarized in Figure 11.

6. ALGORITHM OVERVIEW Having presented the elementary blocks of the proposed approach, we now summarize how they interact. The information required from the user concerns the specification of the number of iterations in the neighborhood search procedures and the number of lots to be rescheduled per iteration (shown on the left of Figure 1). The two arrows emphasize that different settings can be applied to the procedures based on models R-ORM and ORM, with the former being less sensitive to data (i.e., for the same NOS value, an iteration from R-ORM requires considerably less time than one from ORM). The neighborhood search algorithms have been implemented in GAMS and essentially call the MILP optimization models several times for different settings regarding the domain of the binary variables. The one based on model R-ORM finds the best processing sequence of lots in bath units, neglecting robot availability constraints and starting from the lexicographic sequence (I1I2I3 3 3 3 ). At the end of the last iteration, the makespan has hopefully been reduced, resulting in a different sequence that is passed on through parameter posi (position of lot i in the processing sequence). This is the only input required by the discrete-event simulation model implemented in Arena. Without changing the processing sequence, the simulation model takes into consideration robot availability and finds a feasible schedule to the problem, featuring a higher makespan. Then, starting from the information concerning the sequence of transfer tasks, included in parameter slott,m (position of the

transfer task to unit m associated to the lot executed in slot t), the ORM-based neighborhood search procedure attempts to significantly improve the makespan. The output of the algorithm is the best found schedule for the one robot model. 6.1. Remarks. The neighborhood search procedure applied to R-ORM is the least important part of the algorithm. On one hand, the solution to this problem will not yield a feasible solution to the original problem, meaning that significant rework may be required in the second step. On the other hand, one can simply use any processing sequence or, better yet, the sequence from the best found solution from the full-space ORM model up to, for example, 60 CPU s of computational time. The latter option is particularly suitable when able to obtain good-quality solutions, information known from the optimality gap or from some other method (e.g., neighborhood search over R-ORM). Neighborhood search has the advantage of being more robust, with respect to an increase in problem size. The use of the discrete-event simulator in stage 2 can be replaced by a constructive scheduling procedure1215 conceptually similar to the rescheduling procedure that is part of the third step. In such case, the first feasible solution to the one-robot problem would be generated by building the schedule from scratch, adding NOS lots one at a time until the generation of the complete schedule in the last iteration, which would be equivalent to any iteration of the ORM neighborhood search procedure.

7. INTEGRATED CONSTRAINT PROGRAMMING APPROACH (CP) Zeballos et al.26 proposed an integrated CP approach that consisted of two parts: a CP model and an efficient domainspecific search strategy in order to handle the different characteristics of the automated wet-etching problem. The model employs some high-level constructs available in the modeling language ILOG OPL Studio, which were specifically developed to address scheduling problems. These constructs are “precedes”, “requires”, and “activityHasSelectedResource”. While the former imposes a sequence of activities without overlapping, the second sets the allocation of renewable resources demanded by activities. Finally, the last one is the most important construct, which assumes a value equal to 1 when an activity has been assigned to a 10673

dx.doi.org/10.1021/ie200841a |Ind. Eng. Chem. Res. 2011, 50, 10665–10680

Industrial & Engineering Chemistry Research

ARTICLE

Figure 11. Algorithm for internal robot logic in discrete-event simulation model.

specific resource belonging to a given set of alternative resources or 0, otherwise. The exploration algorithm, called Guided Variable Domain Reduction (GVDR), is a key aspect given the high combinatorial complexity of the problem. The search strategy goal is to avoid early bad choices during the exploration process and generate high-quality feasible schedules with low computational effort. It is important to note that the GVDR strategy had a major impact on computational performance when compared to the default search strategies, which often failed to find even a feasible solution. The GVDR search strategy improves the computational performance by resorting to a well-adjusted allocation process of the transportation activities of wafer lots to robots and a variable domain reduction procedure associated with processing tasks. For both procedures, which establish the way the search space is explored, activities are arranged considering a decreasing order of baths. Another feature of the arrangement is that it focuses on tasks corresponding to lots that seem to be more demanding in a given time period, in terms of transportation resources.

8. COMPUTATIONAL RESULTS The performance of the proposed algorithm is now evaluated through the solution of 11 test problems built from the data given by Bhushan and Karimi.29 These range from 18 lots in 4 baths to the 25 lots in 12 baths of the original problem (called P11). The problem data are given in Table 1, with the lots and units to consider being the first according to problem specification. The rescheduling algorithms and MILP models were implemented in GAMS 23.5 with the latter being solved by the MILP solver CPLEX 12.2, using the two threads of the Intel Core2 Duo T9300 2.5 GHz laptop (Windows Vista Enterprise, 4 GB of RAM). The relative optimality gap was set to 106, and the maximum computational time per iteration was set to 3600 CPU s. The remaining settings were the defaults. The discrete-event simulation model was implemented in Arena 12.0 and solved on a Pentium 4, 1.8 GHz machine. The integrated CP approach was implemented in ILOG OPL Studio 3.7 and solved on an AMD Athlon 64 X2 Dual Core 2.2 GHz processor up to the same maximum computational limit of 3600 CPU s. 10674

dx.doi.org/10.1021/ie200841a |Ind. Eng. Chem. Res. 2011, 50, 10665–10680

Industrial & Engineering Chemistry Research

ARTICLE

Table 1. Processing and Transfer Times for Problems P1P11 Unit lot

1

2

3

4

5

6

7

8

1

4.3 6.7 11.3 6.3

2.5 6.9

2

5.8 6.7

8.2 6.5

4.9 6.5 12.8 6.8 10.4 6.7 11.8 6.7

3 4

10.6 6.7 2.7 6.9

2.6 6.4 6.9 7.6

2.7 7.3 13.0 6.6 11.4 6.8 3.5 7.4 3.9 6.6 7.2 6.7

8.1 7.5

9

10

4.2 7.1

11

12 13

3.9 6.8 9.2 6.6 3.9 6.8

5

4.1 6.7 11.0 6.8

7.4 6.2

3.1 6.3

3.7 6.2

9.4 6.9

6

3.7 6.9

6.5 6.6

2.5 6.6

2.6 6.5

2.7 6.3

7

10.5 6.7

3.7 6.6 11.9 6.6

2.6 6.2

6.9 6.5

3.9 6.8

8

3.9 6.8

6.6 6.4

3.3 6.9

3.4 6.4 11.3 6.7

5.8 7.5

9

2.5 7.5

1.4 7.6

6.6 6.8 11.0 6.9 12.9 6.5

5.2 7.8

2.5 6.4

10 10.8 6.7 10.1 6.5

2.5 6.6

2.7 7.1

4.6 6.5 11.4 6.3

11 12

8.7 6.2 7.0 6.3

4.2 7.2 7.2 6.6

6.1 6.2 2.7 6.7

5.9 6.5 8.9 7.1

4.6 6.7 2.9 6.7

8.8 6.6 6.4 6.8

13

9.1 6.8

2.8 6.4

5.9 6.4

5.9 6.9 10.4 6.9

8.8 6.5

14

2.7 6.1 11.4 6.9

7.7 6.4

5.1 6.2

4.7 6.9 10.0 6.8

15

2.8 6.8

6.8 6.3

4.2 6.7

8.5 6.6

5.7 6.5

4.3 6.9

16

5.7 6.9

2.8 7.1

4.7 6.1

3.9 6.9

4.4 6.4

2.7 6.3

17

2.5 7.6

6.7 6.5

2.6 6.4

3.4 7.2

2.9 6.7

7.8 6.4

18

3.9 6.8 12.1 6.8

2.7 6.3

9.3 6.2

4.7 6.3

2.6 6.8

19 20

9.7 6.7 2.6 6.7

7.6 6.4 10.9 6.9 2.9 6.5 10.4 6.9

21

4.7 6.6

4.9 6.9

2.6 6.8 12.7 6.2

2.6 6.7

6.9 6.4

22

2.5 6.3

2.6 6.6

7.9 6.8 12.5 6.8

2.6 6.5

7.8 6.4

23 11.4 6.4

8.9 6.6

2.7 6.4 11.4 7.4 11.3 6.8

2.9 6.9

24

6.8 6.5

2.8 7.5

3.9 7.2

25

8.8 6.9

8.8 6.8 11.3 6.8 11.3 6.1

6.7 6.5

2.6 6.4

πm

1.2 0.6

0.8 1.0

0.8 0.4

0.8 1.0 1.2

0.4 0.6

2.6 6.7 4.6 6.6 10.1 6.3 2.6 6.7 11.5 6.6 3.7 6.2

9.8 6.5

1.0 1.0

8.6 6.3 11.8 6.2

8.1. Full-Space Models. We start by analyzing the performance of the full-space models (see the results in Table 2). There are only two problems that can be solved using the relaxed onerobot model (R-ORM) to optimality under the 1-h mark or before the solver runs out of memory (P1 and P9). As the number of lots and units increases, so does the problem size and computational complexity, which is reflected in an increase in the integrality gap at the time of termination. While good solutions can still be found with R-ORM (the optimality gap, at the most, is equal to 13%, for the largest problem, P11), the same is not true when the sequencing constraints for the robot transfers are taken into consideration (ORM). Note that only the makespans for P1P4 are within an acceptable tolerance. Very poor solutions result from P5P9, while not even a feasible solution can be found for P10 and P11. This added complexity, when compared to model R-ORM, can be explained by two factors: (i) the increase, by more than 1 order of magnitude, in the number of constrains and sometimes in the number of binary variables; (ii) the introduction of big-M constraints that are known to be inefficient. It clearly reflects the need for alternative approaches to full-space models. 8.2. Solution Polishing Applied to ORM. In problems where good solutions are difficult to find, the CPLEX MILP solver can perform solution polishing, which is a variety of branch-and-cut algorithm that works after an initial solution is available. Solution

polishing is more time-intensive than other heuristics and focuses solely on finding better solutions. It explores neighborhoods of previously found solutions by solving sub MILPs and, consequently, may not prove optimality, even if the optimal solution has indeed been found. We show the outcome from the solution polishing procedure in the last column of Table 2. We have used option polishaftertime = 60 to start the procedure after 60 CPU s, which is enough time to find a feasible solution for problems P1P8. Clearly, significantly better solutions can be obtained in the same 1-h computational limit, particularly when the number of baths is greater than or equal to 8. 8.3. Neighborhood Search Using R-ORM. While the fullspace version of R-ORM is already difficult to solve, the constrained version resulting from fixing some of the binary variables can be solved considerably faster. Indeed, if one chooses to reschedule NOS = 7 lots per iteration, the |J| = 100 iterations of the neighborhood search algorithm can be solved within a minute for all problems under evaluation (see Table 3). The resulting problems feature the same number of variables and constraints as their full-space counterparts but, at most, 49 discrete variables (less if at least two of the seven picked random numbers are the same). The total computational time is just one of the relevant performance indicators, with the other being solution quality. If we compare the results obtained to those of the full-space model until roughly the same CPU time, we find that the solutions given by the neighborhood search algorithm are better in 6 out of 11 cases, with the most complex problem (P11) being the best performer (2.85% lower makespan) and P6 being the worst performer (1.57% higher makespan). The approach of finding good-quality solutions with few computational resources has thus been validated. However, given the small difference in quality, it is fair to say that solving the full-space model instead (up to 60 CPU s) would have worked equally well. 8.4. Discrete-Event Simulation Model. The discrete-event simulation model in Arena takes into consideration the robot availability constraints to generate a feasible schedule, given the lots processing sequence. The latter information is taken from the rescheduling algorithm using R-ORM, which also provides the lower bound on the makespan that can be obtained using the Arena simulation model. The results in Table 4 show that the solutions and respective lower bounds become farther and farther apart as the number of baths increase, showing a clear degradation in solution quality, which is confirmed after applying the rescheduling procedure. The good news is that