Ind. Eng. Chem. Res. 2004, 43, 2192-2208
Multi-objective Decisions on Capacity Planning and Production-Inventory Control under Uncertainty

Lifei Cheng, Eswaran Subrahmanian, and Arthur W. Westerberg*

Department of Chemical Engineering, Institute for Complex Engineered Systems, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213-3890

* To whom correspondence should be addressed. Tel.: 412-268-2344. Fax: 412-268-7139. E-mail: [email protected].
In the presence of a changing and stochastic environment, firms must appropriately choose the timing and sizes of capacity acquisitions as well as the production decisions that replenish inventory, such that various conflicting goals are satisfied. This paper considers multicriteria decision making on joint capacity planning and inventory control under uncertainty. We formulate this class of problems as a multi-objective Markov decision process, which simultaneously searches for both capacity and inventory policies that optimize multiple objectives. In a previous work, we developed a multi-objective dynamic programming algorithm to solve small problems by propagating Pareto optimal solutions recursively backward in time. The numerical intractability of the rigorous algorithm for large problems necessitates approximation approaches. However, and importantly, on the basis of the optimality equations constructed in the dynamic programming framework, we are able to obtain analytical insights into the structure of optimal solutions for certain simplified problems. For example, through convexity analysis, we show that such structural policies as the target interval capacity policy and the base-stock inventory policy are optimal for a certain class of problems. We propose using these structural results to form a basis for the parametric representation of suboptimal policies for more general problems. This approach greatly simplifies the optimization to that of finding a parameter vector that defines these optimal or suboptimal policies. We propose a simulation-based optimization framework to find the parameters that optimize multiple performance measures. Next we show how to exploit a multi-objective evolutionary algorithm to find a diverse set of Pareto optimal solutions in one single run, and finally we present numerical results from an example problem that we carry from formulation to solution.

1. Introduction

This paper addresses coordinated capacity and production planning in a changing and uncertain environment, in which decision makers simultaneously search for both capacity and production decisions to satisfy multiple conflicting goals. This problem belongs to a general class of problems: multi-objective multistage decision processes under uncertainty. The solution to this class of problems is important from both strategic and operational perspectives. An appropriate formulation and solution of this class of problems must face the following challenges: (1) Decision makers face a life cycle context, in which the external environment is rapidly changing over time and involves a substantial amount of uncertainty. The internal manufacturing systems must be able to respond flexibly to the changes in the environment. (2) The decision makers must be able to determine the proper timing and sizes of capacity changes at the strategic level as well as production to replenish inventory at the operational level. They must coordinate these decisions, made at different times and levels, because of the interconnections among them. (3) Decision makers must be able to account for multiple objectives, especially in a changing and uncertain environment. For instance, decision makers would like to maximize the expected profit while minimizing
risk exposure. The difficulty lies in the fact that these objectives are (at least partly) conflicting or incommensurable.

Our primary goal in this paper is to develop a general framework for formulating this class of problems and for devising efficient solution strategies to solve them. In this paper, we formulate this class of problems as multi-objective Markov decision processes and develop a simulation-based optimization framework that attempts to solve large-scale problems. We intend to make the following contributions in this work: (1) We formulate the problem as a Markov decision process that addresses multiple conflicting objectives, such as expected profit and expected downside risk, and incorporates different types of uncertainty, including product demand and technology development. The problem formulation explicitly incorporates the coordination between capacity planning and inventory control. (2) We exploit a multi-objective dynamic programming technique to convert the multi-objective Markov decision process into a sequence of single-objective optimality equations. On the basis of convexity analysis, we obtain valuable analytical insights into the structure of optimal policies for certain classes of simplified problems, which also provide a basis for proposing candidate solutions for more general problems. (3) We develop a simulation-based optimization approach to evaluate and optimize the proposed parametrized policies and to find numerical values of the parameters that define optimal or suboptimal policies. Our approach employs a multi-objective evolutionary algorithm to find a diverse set of Pareto optimal solutions in one single run.
We have organized the remainder of this paper as follows. The next section briefly reviews previous work in capacity planning and inventory control. We discuss current methodologies for formulating and solving this type of problem. In section 3, we start with an introduction, using a small example, to facilitate understanding of the characteristics of this problem type. We present a formal problem formulation as a multi-objective Markov decision process in section 4. On the basis of convexity analysis, we obtain analytical results on the structure of optimal policies for single-objective problems in section 5. We then extend the structural results to multi-objective problems by converting them into single-objective problems. In section 6, we propose a simulation-based optimization framework to find the optimal or suboptimal parametrized policies. Finally, we present numerical results obtained from the example problem in section 7 and provide concluding remarks in section 8.

2. Review of the Literature

Uncertainty in such factors as product demand and technology evolution always shrouds the environment in which manufacturing firms operate. Production planning and inventory control are operational-level decisions that firms must make on a day-to-day basis in response to changes in the environment. Decisions such as capacity planning and technology adoption are crucial strategic-level decisions in the manufacturing industry. Uncertain demand for capacity and availability of capacity (e.g., technology development) complicate the timing and sizing of capacity decisions. Firms often make these decisions under substantial risk because of the capital intensiveness and the irreversible nature of capacity adjustments.

Capacity planning and inventory control under uncertainty, as two separate topics, have been the foci of extensive research efforts and literature reports. In this section, we briefly review some of the previous work in this vast literature on capacity planning and inventory control, respectively. We also review the few works that are dedicated to joint decisions on capacity planning and inventory control. Finally, we discuss current methodologies reported in the literature for the formulation and solution of this class of problems.

The production and inventory control problem represents one of the earliest areas of application of decision making under uncertainty. Some of the earliest and most noteworthy research on inventory control concerns the form of the optimal policy under various assumptions about the economic parameters. For example, Scarf1 shows that the optimal policy in each period is always of the (s, S) type (i.e., replenish the inventory level up to S if it is below s) for a fixed-charge linear cost structure. Porteus2 extends this result to a general concave increasing ordering cost function and shows that the optimal policy in each period is again of the (s, S) type. DeCroix and Arreola-Risa3 show that a modified base-stock policy, or order-up-to policy, is optimal for multiproduct, infinite-horizon production-inventory systems. They completely characterize the optimal policy for the homogeneous-products case and generalize this policy to construct a heuristic policy for the heterogeneous-products counterpart.

On the basis of these structural results, other research proposes to solve inventory control problems numerically by computing the parameters in the policies
with these forms. Kapuscinski and Tayur4 study a single-product capacitated production-inventory system with stochastic periodic demand and propose a simulation-based optimization procedure using infinitesimal perturbation analysis to aid in computing the optimal parameters. Glasserman and Tayur5 consider capacitated, multi-echelon production-inventory systems operating under base-stock policies and develop simulation-based methods for estimating the sensitivities of the inventory cost with respect to the policy parameters, such as base-stock levels. Fu6 derives sample path derivatives of performance measures with respect to the parameters in (s, S) inventory systems and estimates the derivatives for use in gradient-based optimization techniques. Bashyam et al.7 consider a multiproduct periodic review system with a capacity constraint and no setup cost and propose a linear approximation to the optimal switching curve. They apply perturbation analysis in a stochastic approximation algorithm to estimate the policy parameters.

Research on capacity planning under uncertainty has split into two major methodologies in operations research, namely, stochastic programming8,9 and optimal control.10,11 These two approaches, although addressing the same class of problems, formulate and solve the problems with distinctive perspectives and emphases. For example, optimal control focuses more on analytical solutions, while stochastic programming pays more attention to numerical solutions. Solutions using optimal control are numerically tractable only when the state space is relatively small, while stochastic programming becomes computationally prohibitive for a large number of scenarios.

With recent theoretical and algorithmic developments in stochastic programming, there is a growing literature on capacity planning based on stochastic programming, with the use of scenarios to model uncertainties. Eppen et al.12 describe a model developed for General Motors to aid in decision making on capacity planning. The model maximizes the present value of the expected discounted cash flows subject to a constraint on the expected downside risk. Barahona et al.13 present a stochastic programming approach to capacity planning under demand uncertainty in semiconductor manufacturing. They formulate the problem as a mixed-integer program in which the expected value of the unmet demand is minimized subject to capacity and budget constraints. Ahmed and Sahinidis14 and Ahmed et al.15 present a multistage stochastic integer programming formulation for a multifacility capacity expansion problem under uncertainty. They use a fixed-charge linear function to model the economies of scale in expansion costs. Ahmed and Sahinidis16 propose the use of the upper partial mean (UPM) as a measure of the variability of recourse costs; UPM is the expectation of the positive deviation (from the expected value) of the second-stage profit. The authors propose a bounding heuristic procedure that decomposes the multiscenario structure to find suboptimal solutions. That paper addresses only two-stage problems, which are a very special case of the multistage problems we consider here. Two-stage problems coarsely divide time into "now" and the "future" to approximate the underlying problem and to be more numerically tractable. However, we believe that multiple stages are an inherent and indispensable part of planning problems. From a formulation perspective, Barbaro and Bagajewicz17
argued that, with this definition of UPM, a solution may falsely reduce its variability just by not choosing optimal second-stage decisions. Therefore, they concluded that UPM is not an appropriate measure to assess and manage risk.

On the other hand, approaches based on optimal control place their emphasis more on analytical solutions and structural results. Manne18 considers the optimal degree of excess capacity that a company should build into a new facility and finds closed-form optimal solutions based on principles similar to dynamic programming. Davis et al.19 formulate a model of capacity expansion as a stochastic control problem and develop algorithms for computing the optimal policy by utilizing the special structure of the Bellman equations. Bean et al.20 consider a capacity planning problem with stochastically growing demand over an infinite horizon. They show that, under certain conditions, they can transform the stochastic problem into an equivalent deterministic problem. Eberly and Van Mieghem21 and later Harrison and Van Mieghem22 present a framework to study multiresource investment under uncertainty in a dynamic, nonstationary environment. They show that the optimal investment strategy follows a control limit policy at each point in time. Angelus et al.23 develop and characterize the optimal policy for a dynamic capacity expansion model in semiconductor manufacturing. They show that, under certain assumptions, the optimal expansion policy is a variation of the (s, S) policy from inventory theory. Rajagopalan et al.24 study a model of capacity and technology acquisition and replacement in environments where a company anticipates a sequence of technological breakthroughs of uncertain magnitude and timing. On the basis of the structural properties of the optimal solutions, they develop an efficient regeneration-point-based dynamic programming algorithm to solve moderate-size problems.

Recently, a few research works have attempted to integrate capacity planning and inventory control. Angelus and Porteus25 consider the problem of simultaneous production and capacity management under nonstationary, stochastic demands in a produce-to-stock environment. They show that the optimal capacity policy is a target interval policy in both the perishable-goods and durable-goods cases. Rajagopalan and Swaminathan26 explore the interaction between production planning decisions and capacity acquisition decisions in environments with deterministic demand growth. The paper presents a mixed-integer nonlinear programming model to determine the optimal capacity acquisition, production, and inventory decisions over time. Bradley and Glynn27 develop approximations that yield insight into the joint optimization of capacity and inventory in a single-product, single-station, make-to-stock manufacturing system in which inventory is managed through a base-stock policy.

Our aim in this work is to jointly optimize capacity and inventory decisions with respect to multiple objectives, e.g., expected profit and risk exposure. The primary purpose of this work is twofold: (1) propose parametrized policies for complex problems based on structural results from simplified problems and (2) devise numerical solution strategies to compute the values of these parameters. The methodology we choose to formulate and solve this problem belongs to the research stream of optimal control. Optimal control has
the following advantageous features compared to stochastic programming in solving this particular problem: (1) Optimal control provides the capability to solve an analytically tractable approximation of the problem to obtain structural insights. For example, a majority of the structural results in inventory control and capacity planning are based on optimal control theory.1,18 Stochastic programming, such as scenario-based multistage optimization, on the other hand, focuses more on numerical solutions and is not able to obtain such useful insights. (2) Stochastic programming approaches are suitable only for problems where the number of decision periods is relatively small and one can use a small scenario tree to represent future uncertainty. With the incorporation of inventory control, the problem size can easily grow out of hand for scenario-based stochastic programs. Simulation-based optimization, within the framework of optimal control, is more computationally efficient in handling problems with a large number of periods and scenarios.

We resort to the optimal control methodology in this work based on the specific nature of the problems we are considering. We shall see that these two methods are not competing but complementary, each having different favorable and unfavorable features; the requirements of a specific problem suggest which approach is preferable. For a discussion of the comparison between these two approaches and of combinations of them to solve large-scale problems, we refer readers to our forthcoming work.28

Stochastic optimal control, also referred to as a Markov decision process, is a sequential decision problem in which the decision makers choose an action based on the state at any decision epoch according to a decision rule or policy. Dynamic programming provides a framework for studying such problems, as well as for devising algorithms to compute an optimal control policy. Several comprehensive textbooks exist on this subject (e.g., Bertsekas10 and Puterman11). Based on the principle of optimality,29 dynamic programming decomposes the optimal control problem into a sequence of single-period subproblems that one solves recursively, backward in time. Extensions to multi-objective cases have enabled the development of decomposition methodologies for separable multi-objective dynamic optimization problems. Li and Haimes30 review the major concepts used in multi-objective dynamic programming and examine the progress in the development of the corresponding theory and methodology. Li31 considers a general separable class of multi-objective optimization problems and develops a generating approach using a weighting method to find the set of noninferior solutions. Cheng et al.32 develop a multi-objective stochastic dynamic programming algorithm based on the epsilon-constraint method to solve multi-objective Markov decision problems. However, because one has to carry out an optimization for each state in what is usually a very large state space, dynamic programming suffers numerically from the "curse of dimensionality", making it unsuitable for solving large-scale problems.

Despite its numerical intractability, dynamic programming provides a framework for obtaining analytical results and the structural properties of optimal solutions. For instance, Scarf1 shows that an (s, S) type policy is optimal for a dynamic inventory problem that has an ordering cost composed of a unit cost plus a
Figure 1. Multistage two-level decision process in a typical year.
reorder cost. Eberly and Van Mieghem21 show that the optimal investment strategy in a multiresource investment problem with a convex capacity adjustment function follows a control limit policy at each point in time. These structural results allow one to develop numerical approaches to find these parametrized policies. Glasserman and Tayur5 and Fu6 develop simulation-based methods to estimate sensitivities of the inventory cost with respect to the policy parameters, such as the base-stock levels, which they then use in gradient-based techniques to compute the optimal parameters.

3. A Motivating Example

In this section, we present an illustrative example of joint capacity planning and inventory control under uncertainty. We carry this example from formulation to solution to demonstrate the concepts and to provide a physical feel for the underlying problem.

Assume that a firm is designing a manufacturing system, e.g., a reaction process, which operates over a planning horizon, e.g., 5 years. The primary sources of uncertainty in the operating environment are the future demand and technology development. The firm believes that a new technology breakthrough, e.g., catalyst β, is possible soon. It would make the existing technology, e.g., catalyst α, obsolete. We reflect the impact of the new technology in a downward product pricing pressure and a loss of investment value for the existing equipment. In such a changing and uncertain environment, the firm must make strategic decisions on the timing, types, and sizes of the capacity investments over time. At the same time, at the operational level, the firm must make decisions on production planning and inventory control on a real-time basis to meet the customer demands, which are changing over time and inherently involve substantial uncertainty.

Figure 1 illustrates the decision-making process under uncertainty in a typical year. At the beginning of the year, based on the current information and an assessment of future uncertainty, the firm must make a decision "here and now" on the capacity level of the production process. The firm then implements the decision, possibly installing more capacity. At the beginning of each month during the year, the firm reviews the current inventory level and decides the production for that month to replenish the inventory, constrained by the available capacity. The firm knows the actual demand for the month only as orders arrive, and it satisfies these demands to the extent possible from available inventory. Any remaining inventory (or stockout, in the case where there is insufficient inventory to satisfy demand) carries over to the next month. The firm continues production until it reaches the end of the year, at which time it faces a problem similar to the one it faced at the beginning of the year. Again, on the basis of the information available then and the
assessment of future uncertainty, the firm decides on a possibly different capacity level and continues into the next year.

A typical characteristic of this type of problem is the interaction among the decisions occurring at different times, e.g., now and in the future, and at various levels, e.g., strategic and operational. Another characteristic is that the firm makes these decisions periodically as the uncertainty resolves itself over time. In the face of the uncertainty in the environment and the complexity of the decisions at various times and levels, the firm must properly make these decisions to satisfy multiple objectives. For example, the firm would like to maximize the expected value of the discounted cash flows over the planning horizon. On the other hand, the firm is also concerned with its risk exposure, or the adverse consequence of the variability it could experience in profit. Because of the possible conflicts between these objectives, e.g., decisions resulting in a higher expected return often lead to higher risk, it is not possible to find a single decision that optimizes all objectives simultaneously. In other words, tradeoffs have to be made between these objectives. The decision makers need only consider a set of solutions, which we refer to as the Pareto optimal set, that are not inferior to any other solutions and select a solution from this set based on their preferences.

4. Problem Formulation

We formulate this type of problem as a finite-horizon discrete-time Markov decision process, in which decision makers periodically make capacity and production decisions according to certain decision policies, based on the state information available at the time of the decision. Controlled by these decisions, the process evolves over time and generates a sequence of rewards, or profits. With the definition of multiple criteria that measure the performance of the decision process, the problem becomes a multi-objective Markov decision problem, where one searches for both capacity and inventory policies that optimize these multiple objectives.

4.1. Markov Decision Processes. We consider a firm that employs L different resources or technology types to produce a single product over a time horizon of N periods (years, for instance). The firm has the option to change the capacity level of each resource at the beginning of each year t ∈ {1, ..., N}, which normally corresponds to the planning process within the firm. During each year, the firm periodically reviews the inventory level and makes a production plan at the beginning of each subperiod, e.g., month, τ ∈ {1, ..., M}. Demands in distinct months are statistically independent and follow known probability distributions. The following summarizes the sequence of events during each year:
(a) At the beginning of each year t = 1, ..., N: having observed the information available, the firm chooses the new capacity level Kt ∈ R^L_+.

(b) At the beginning of each month τ during the year: on the basis of the current inventory level Itτ, the firm plans production, constrained by the capacity, to replenish the inventory up to level ytτ. The demand in month τ, Dtτ, arrives, and the firm satisfies it to the extent possible from inventory, carrying excess inventory to the next month or backlogging unsatisfied demand, i.e., It,τ+1 = ytτ - Dtτ, where It,τ+1 is positive for excess and negative for backlog.

(c) At the end of the horizon (year N + 1): the firm salvages the remaining resources and scraps the remaining products.

The capacity and production planning problem becomes a decision process in which decision makers periodically review the state of the system, e.g., the capacity level Kt-1 and inventory level Itτ, and choose capacity and production decisions according to policies µtτ that prescribe a control action for each possible state. For example, Kt = µt0(Kt-1,It0) gives a new capacity level, where µt0 is the capacity policy that maps the information on the current capacity and inventory levels at year t into a capacity decision. Similarly, ytτ = µtτ(Kt,Itτ) gives a production decision, where µtτ is the inventory policy in month τ of year t. We refer to the sequence π = {µ10, µ11, ..., µNM} as a decision strategy that consists of the capacity and inventory policies at each decision epoch.

The decision process we describe above is a Markov decision process. We use the qualifier "Markov" because decisions depend on the past only through the current state of the system. At each stage, after a decision is chosen and implemented, the process evolves as the system transits to another state, depending on the choice of the decision and the realization of uncertainty; for example, the inventory level after production in month τ, ytτ, less the demand in that month, Dtτ, becomes the inventory level at the beginning of the next month τ + 1, i.e., It,τ+1 = ytτ - Dtτ. The decision process under a given decision strategy is then a controlled Markov chain, in which the system transits from state to state according to given transition probabilities. In this particular example,
p_ij(k) = P(Dtτ = ytτ - It,τ+1 | Itτ = i, It,τ+1 = j, Kt = k)   (1)

defines the transition probability associated with the inventory level, where ytτ = µtτ(Kt,Itτ) and p_ij(k) denotes the probability that the inventory level in the next period is j, given that the current inventory level is i and the current capacity level is k.

4.2. Multiple Optimality Criteria. At each stage, the firm's decision causes a transition in the state, i.e., a change in the capacity and/or the inventory level, and at the same time, the firm receives a profit or incurs a cost that depends on the current state, the decision choice, and the demand realization. In other words, as the controlled Markov chain proceeds from state to state, there is a reward sequence (often referred to as a Markov reward process) associated with the trajectory (state sequence) of the Markov chain. Note that we discount all cost parameters to the beginning of the first year according to a monthly discount factor, α, and we therefore drop the discount factor from the cost functions in the rest of the paper.

(a) Capacity-related contributions: cK(Kt - Kt-1) gives the cost associated with adjusting the capacity level. The
firm incurs a cost, associated with maintaining equipment, of cM for each unit of capacity held for a year. Thus,
vt0(Kt,Kt-1) = -cK(Kt - Kt-1) - cM Kt   (2)
gives the reward (minus the capital cost) that the firm receives at the beginning of each year.

(b) Production-related contributions: Each unit the company produces incurs a variable production cost cP. The company sells each unit of product, up to the demand of that month, at a price pS. It backlogs unmet demand, delaying it by at least 1 month, and incurs a shortage penalty cS and a delay cost (1 - α)pS for each unit of unmet demand. Finally, it carries unsold products to the next month as excess inventory, which incurs a unit holding cost cH. Thus,
vtτ(Itτ,ytτ,Dtτ) = γtτ(ytτ,Dtτ) + cP Itτ   (3)
gives the reward (operating profit) during each month, where
γtτ(ytτ,Dtτ) = pS Dtτ - cH(ytτ - Dtτ)^+ - [(1 - α)pS + cS](ytτ - Dtτ)^- - cP ytτ   (4)

where we define x^+ = max(x, 0) and x^- = max(-x, 0).

(c) Terminal value: At the last year, the company salvages all remaining equipment at a value of rD per unit. The firm scraps the leftover products at a price of pF; the cumulative unmet demand becomes lost sales and incurs a charge of pS per unit. Thus,

vN+1,0(KN,IN+1,0) = rD KN + pF I^+_{N+1,0} - pS I^-_{N+1,0}   (5)
gives the reward (terminal value) at the end of the last year.

We can express the total discounted reward, given that the initial state is (K0, I00) and the company applies a decision strategy π, as follows:

v^π(K0,I00) = Σ_{t=1}^{N} (vt0 + Σ_{τ=1}^{M} vtτ) + vN+1,0   (6)
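To make the reward structure concrete, the following Python sketch evaluates the monthly operating profit of eqs 3 and 4; it is a minimal illustration rather than the authors' code, the parameter values are hypothetical, and discounting is omitted because the text pre-discounts all cost parameters.

```python
# Minimal sketch of the reward structure in eqs 3 and 4 (hypothetical values).
p_S, c_H, c_S, c_P, alpha = 10.0, 0.5, 2.0, 4.0, 0.99

def gamma(y, d):
    """Eq 4: operating margin for inventory-after-production y and demand d."""
    excess = max(y - d, 0.0)    # (y - d)^+, inventory carried to the next month
    backlog = max(d - y, 0.0)   # (y - d)^-, backlogged unmet demand
    return p_S * d - c_H * excess - ((1.0 - alpha) * p_S + c_S) * backlog - c_P * y

def monthly_reward(i, y, d):
    """Eq 3: v_{t,tau} = gamma(y, d) + c_P * I_{t,tau}, so the net production
    charge is c_P * (y - I), the quantity actually produced."""
    return gamma(y, d) + c_P * i
```

Summing these monthly rewards together with the capacity terms of eq 2 and the terminal value of eq 5 along one trajectory yields the total reward of eq 6.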
After choosing and implementing the decision strategy π, the decision maker receives the outcomes of the system performance, such as the returned profit, customer service, and process lifetime. It is the decision maker's incentive to choose a decision strategy such that these outcomes are as "good" as possible, in other words, optimal. Thus, the decision maker requires performance measures, or optimality criteria, to compare alternative decisions. In this work, we restrict ourselves to profit-related performance measures, but one can readily generalize the results we obtain here to incorporate other criteria. Even though profit is the only outcome with which we are concerned, we are not dealing with a single-objective problem, because of the stochastic nature of the profit itself. As shown in eq 6, the total discounted profit is a random variable, which requires the definition of deterministic performance measures to compare different outcomes. In this work, we consider two optimality criteria: the expected total discounted profit and the expected downside risk. Again, one can extend the results we obtain in this paper to cases with more than two objectives.
(a) Expected profit: The most common optimality criterion in the literature is the expected total discounted reward, assuming decision makers are risk-neutral.
F_1^π(K0,I00) = E[v^π(K0,I00)]   (7)
gives the expected discounted total profit under decision strategy π, where one takes the expectation E[·] with respect to the probability measure the controlled Markov chain induces, with such transition probabilities as those described in eq 1.

(b) Downside risk: However, decision makers are usually risk-averse; i.e., they are unwilling to take on risk, which one often measures as the variability of the return. We will use the expected downside risk to measure the downside variability of the profit.12 Downside risk is the amount by which the total profit v^π falls below a target profit v̂, i.e., max(v̂ - v^π, 0). We define the expected downside risk associated with profit v^π(K0,I00), for a given target value v̂, as
F_2^π(K0,I00) = E{max[v̂ - v^π(K0,I00), 0]}   (8)
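As a hedged illustration of how eqs 7 and 8 can be estimated in practice (anticipating the simulation-based evaluation of section 6), the sketch below computes both criteria from a sample of simulated total profits; the sample values and the target v̂ are hypothetical.

```python
import numpy as np

def estimate_criteria(profits, v_hat):
    """Sample estimates of eq 7 (expected profit) and eq 8 (expected
    downside risk) from a batch of simulated total profits v^pi."""
    profits = np.asarray(profits, dtype=float)
    f1 = profits.mean()                           # F_1: E[v^pi]
    f2 = np.maximum(v_hat - profits, 0.0).mean()  # F_2: E[max(v_hat - v^pi, 0)]
    return f1, f2

# Example with hypothetical simulated outcomes and target profit v_hat = 100:
f1, f2 = estimate_criteria([120.0, 95.0, 80.0, 150.0], v_hat=100.0)
print(f1, f2)  # 111.25 6.25
```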
The problem then becomes a multi-objective Markov decision problem, which searches for policies that optimize multiple objectives, i.e., maximize expected profit and minimize expected downside risk.
max_{π∈Π} [F_1^π(K0,I00), -F_2^π(K0,I00)]   (9)

Because of the possible contradictions among multiple objectives, it is not possible to find a single solution that would be optimal for all of the objectives simultaneously. The goal of the multi-objective optimization (9) is to find the set of Pareto optimal solutions to the multi-objective decision problem. The definition of Pareto optimality for an optimization with Q objectives, i.e., max_{x∈X} (F1, F2, ..., FQ), is as follows.

Definition 1 (Pareto Optimality). A solution x* ∈ X is Pareto optimal if there does not exist another solution x ∈ X such that Fi(x) ≥ Fi(x*) for all i = 1, 2, ..., Q and Fj(x) > Fj(x*) for at least one index j. In other words, no other solution point is better or equal in all objectives and strictly better in at least one.

4.3. Pareto Optimality Equations. To apply a dynamic programming decomposition technique, we need to recast the problem such that it satisfies the conditions of separability and monotonicity.32 To gain these two properties, we augment the state vector with another state, Wtτ, which denotes the accumulated wealth up to month τ of year t, discounted to the beginning of the first year. On the basis of the principle of optimality in multi-objective dynamic programming, we are able to decompose the multi-objective Markov decision problem (9) into a sequence of single-stage multi-objective optimizations, which we refer to as "Pareto optimality equations" as the counterpart of optimality equations in single-objective dynamic programming. We denote the augmented state vector by

st0 = (Kt-1, It0, Wt0)   t = 1, ..., N + 1   (10)

stτ = (Kt, Itτ, Wtτ)   t = 1, ..., N, τ = 1, ..., M   (11)

We next define two optimal "objective-to-go" functions for our problem: J^1_t(st0) and J^2_t(st0) are the maximal expected profit-to-go function (i.e., the maximal expected value of the profit we expect to obtain for the remaining periods) and the minimal expected risk-to-go function, respectively, starting with state st0 in year t. Similarly, Γ^1_{tτ}(stτ) and Γ^2_{tτ}(stτ) represent the maximal expected profit-to-go function and the minimal expected risk-to-go function, respectively, starting with state stτ in month τ of year t. Then we state the Pareto optimality equations at the beginning of each year t ≤ N as

[J^1_t(st0), -J^2_t(st0)] = max_{Kt, Γ^1_{t1}, Γ^2_{t1}} { [vt0 + Γ^1_{t1}(Kt, It0, Wt0+vt0), -Γ^2_{t1}(Kt, It0, Wt0+vt0)] | Kt ∈ K_t(Kt-1), (Γ^1_{t1}, Γ^2_{t1}) ∈ Ω_{t1} }   (12)

where vt0 = vt0(Kt,Kt-1) is the capacity adjustment cost/revenue given in eq 2 and K_t(Kt-1) denotes the set of possible new capacity levels given the current capacity level. Ω_{t1}(st1) is the Pareto optimal set with respect to the expected profit-to-go and risk-to-go functions in the first month of year t. In other words, the pair of future objectives-to-go (Γ^1_{t1}, Γ^2_{t1}) must correspond to a point on the Pareto optimal frontier for the next state st1, according to the principle of optimality.32

Figure 2. Pareto optimal frontier for a given state.

Figure 2 illustrates an example of the Pareto optimal frontiers, indicated by the dashed lines, associated with two different next states resulting from two different capacity decisions (assuming there are only two alternatives), one with no change and the other with an expansion. The search in the objective space should be carried out only along these Pareto optimal frontiers, added/subtracted by the one-period objective functions (i.e., the expected profit-to-go function at the current period reduced by the applicable expansion cost). The Pareto optimal frontier for state st0 at year t is then the set of nondominated solutions on these modified Pareto optimal frontiers, as indicated by the solid line in Figure 2.

Similarly, the Pareto optimality equations in each month τ = 1, ..., M of year t = 1, ..., N are
[Γ^1_{tτ}(stτ), -Γ^2_{tτ}(stτ)] = max_{ytτ, Γ^1_{t,τ+1}, Γ^2_{t,τ+1}} { E_{Dtτ} [vtτ + Γ^1_{t,τ+1}(Kt, ytτ-Dtτ, Wtτ+vtτ), -Γ^2_{t,τ+1}(Kt, ytτ-Dtτ, Wtτ+vtτ)] | ytτ - Itτ ∈ X(Kt), (Γ^1_{t,τ+1}, Γ^2_{t,τ+1}) ∈ Ω_{t,τ+1} }   (13)

where vtτ = vtτ(Itτ,ytτ,Dtτ) is the operating profit function given in eq 3 and X(Kt) is the feasible set for a
production decision, which we define as
X(Kt) = {x | Ax ≤ Kt, x ≥ 0}   (14)
A is the linear technology matrix (here a vector) in which each element represents the amount of corresponding resources, e.g., production capacity, required to produce one unit of product. We define the connecting equations at the end of each year (τ = M + 1) and the boundary conditions in the last year (t = N + 1), respectively, as

Γ^1_{t,M+1}(Kt, It,M+1, Wt,M+1) = J^1_{t+1}(Kt, It,M+1, Wt,M+1)   t = 1, ..., N   (15)

Γ^2_{t,M+1}(Kt, It,M+1, Wt,M+1) = J^2_{t+1}(Kt, It,M+1, Wt,M+1)   t = 1, ..., N   (16)

J^1_{N+1}(KN, IN+1,0, WN+1,0) = vN+1,0(KN, IN+1,0)   (17)

J^2_{N+1}(KN, IN+1,0, WN+1,0) = max[v̂ - WN+1,0 - vN+1,0(KN, IN+1,0), 0]   (18)

We solve the multi-objective optimizations at each stage presented in eqs 12 and 13 to find the Pareto optimal frontier for each starting state at that stage, given that we have already found the Pareto optimal frontier at the subsequent stage. Therefore, by solving these optimizations, we can calculate and propagate the Pareto optimal frontier and corresponding solutions recursively as we proceed backward in time. When we reach the first stage, i.e., the beginning of the first year, we find the Pareto optimal solution set for the whole problem, i.e.,

[J^1_1(s00), -J^2_1(s00)] = max_{π∈Π} [F_1^π(s00), -F_2^π(s00)]   (19)

In other words, we require only one backward pass to solve the stochastic optimization problem, albeit a large effort even for a small problem.

5. Analytical Results

It is our general guideline in this paper to find analytical results for simple problems and to propose numerical solution approaches, based on these analytical results, to solve general problems. Within the dynamic programming framework, we are able to conduct convexity analysis on certain classes of simple problems to find structural results. These structural results provide a basis for constructing and proposing parametrized policies for general complex problems. We first start with analytical solutions to a single-objective problem where the cost function is convex. We then extend the structural results to multi-objective problems and discuss some extensions to handle a nonconvex cost structure.

5.1. Single-Objective Problems. In this section, we conduct convexity analysis on the optimality equations of a single-objective problem to obtain structural insights into the optimal solutions. We start with a class of problems having a convex cost function of the form

cK(Kt - Kt-1) = cE(Kt - Kt-1)^+ - rD(Kt - Kt-1)^-   (20)

where cE and rD are the marginal expansion cost and salvage revenue, respectively, with cE > rD, cE ≥ 0, and rD ≥ 0. The important property of this kinked piecewise linear function is that it is "jointly convex" in (Kt, Kt-1). Consider a single-objective problem (maximizing the expected profit) with the following optimality equations (t = 1, ..., N, τ = 1, ..., M):

J^1_t(st0) = max_{Kt} {vt0 + Γ^1_{t1}(Kt, It0) | Kt ∈ K_t(Kt-1)}   (21)

Γ^1_{tτ}(stτ) = max_{ytτ} {E_{Dtτ}[vtτ + Γ^1_{t,τ+1}(Kt, ytτ-Dtτ)] | ytτ - Itτ ∈ X(Kt)}   (22)

We shall first introduce two lemmas on concavity (in maximization) preservation. On the basis of these lemmas, we can prove the concavity of the objective-to-go functions through induction from the optimality equations. We can show the first lemma straightforwardly using the definition of concavity.

Lemma 1 (Concavity Preservation). (a) If f1 and f2 are concave, then αf1 + βf2 is also concave for all α > 0 and β > 0. (b) If f is concave and ω is a random variable, then E_ω{f(x-ω)} is also concave, provided E_ω{|f(x-ω)|} < ∞ for all x.

Lemma 2 is similar to the convexity preservation theorem from Heyman and Sobel33 (p 525), with the exception that here we consider the preservation of concavity in maximization.

Lemma 2 (Concavity Preservation in Maximization). Let X ⊆ R^n be a nonempty set and A_x ⊆ R^m a nonempty set for each x ∈ X. Let C = {(x, y): y ∈ A_x, x ∈ X} be a convex set and f(x,y) be a concave function in (x, y) on C, and define

g(x) = max_{y ∈ A_x} f(x,y)   (23)

Then g(x) is a concave function in x on any convex subset of dom g = {x | max_{y∈A_x} f(x,y) > -∞, x ∈ X}.

On the basis of the above concavity preservation lemmas, we are able to show that the concavity of the objective-to-go functions is preserved at each stage, recursively backward in time, through the optimality equations that connect them. The concavity of the optimizations at each stage yields analytical results on the structure of optimal policies. The following theorem characterizes the form of the optimal capacity policy.

Theorem 1 (Structure of Optimal Capacity Policies). (a) J^1_t(st0) is concave in (Kt-1, It0) for all t. (b) There exist two parameters K^L_{t,i} and K^H_{t,i} with K^L_{t,i} ≤ K^H_{t,i} for each element in vector Kt-1, which are functions of (Kt-1,(i), It0), such that the following expansion/contraction policy is optimal in year t:
Kt,i = { K^L_{t,i}(Kt-1,(i), It0)   if Kt-1,i < K^L_{t,i}(Kt-1,(i), It0)
         K^H_{t,i}(Kt-1,(i), It0)   if Kt-1,i > K^H_{t,i}(Kt-1,(i), It0)
         Kt-1,i                     otherwise                           (24)
where Kt-1,(i) are the elements in vector Kt-1, excluding the ith element Kt-1,i. The proof of theorem 1 is in the appendix. This theorem says that the optimal capacity policy for this particular class of problems is a "target interval policy", also known as a control limit policy.
Figure 3. Structure of a target interval policy with one resource.
Figure 4. Structure of a target interval policy with two resources.
Figure 3 shows the structure of a one-dimensional control limit policy, i.e., for a single resource. If the existing capacity level falls below the lower threshold, K^L_t, the company should expand capacity up to K^L_t; if the existing capacity level exceeds the upper threshold, K^H_t, it should contract capacity down to K^H_t; finally, when the existing capacity level is within the interval (K^L_t, K^H_t), it is optimal to stay with the current capacity level.

For the cases with multiple resources, e.g., equipment with different types of technology, the above threshold values K^L_t and K^H_t for each resource are functions of the capacity levels of the other resources. We partition the multidimensional state space into various domains, one of which is called the "continuation region", in which one should make no adjustment in the current capacity levels. From any of the domains outside the continuation region, the optimal capacity action is to adjust the vector of capacity levels to a specified point on the boundary of the continuation region. Figure 4 illustrates an example of the structure of the target interval policy for two resources.

Similarly, we can characterize the structure of the optimal inventory policies using the following theorem.

Theorem 2 (Structure of Optimal Inventory Policies). (a) Γ^1_{tτ}(stτ) is concave in Itτ for all Kt. (b) Given that there is capacity Kt available in year t, a base-stock policy with base-stock level y*_tτ(Kt) gives the optimal inventory level after production in month τ of year t.

The proof of theorem 2 is in the appendix. We can interpret the base-stock inventory policy as follows: if the current inventory is lower than the base-stock level, the decision makers should produce, as much as capacity allows, to replenish the inventory up to the base-stock level; otherwise, they should produce nothing for this period. Figure 5 illustrates the structure of the base-stock policy.

Figure 5. Structure of a base-stock policy with limited capacity.

The multiproduct production and inventory control problem is further complicated because, in addition to deciding target levels for individual products, we also need to decide how to allocate limited capacities among the products. This problem arises when the capacities are not sufficient to raise all products simultaneously to their base-stock levels. This topic is beyond the scope of this work; a few research papers have addressed this issue.3,34

By including previous demand realizations in the state vector, we can readily extend the results obtained here to the cases where future demands are correlated. We can also incorporate a sequence of technology developments with uncertain timing and magnitude, which we model as a Markov chain with a one-step transition probability.32 The structures of the optimal capacity and inventory policies still hold. However, the policy parameters, e.g., capacity thresholds, now depend on the augmented state.
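The following sketch, a minimal illustration rather than the authors' implementation, encodes the two structural policies just described: the target interval (control limit) capacity rule of eq 24 for a single resource and the capacity-limited base-stock production rule. The threshold and base-stock values are hypothetical parameters of the kind the simulation-based optimizer of section 6 would tune.

```python
def capacity_decision(k_prev, k_low, k_high):
    """Target interval (control limit) policy of eq 24 for a single resource:
    expand up to k_low, contract down to k_high, otherwise do nothing."""
    if k_prev < k_low:
        return k_low
    if k_prev > k_high:
        return k_high
    return k_prev

def production_decision(inventory, base_stock, capacity):
    """Capacity-limited base-stock policy: produce toward the base-stock
    level but at most `capacity` units (technology matrix A = 1 assumed).
    Returns y_{t,tau}, the inventory level after production."""
    production = min(max(base_stock - inventory, 0.0), capacity)
    return inventory + production
```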
5.2. Multi-objective Problems. In this section, we consider the problems with multiple objectives, e.g., maximizing the expected net present value and minimizing the expected downside risk. We apply the multi-objective dynamic programming algorithm developed in our previous work32 to convert the multi-objective optimizations (12) and (13) into single-objective optimizations using the ε-constraint method, in which we select one objective function to be optimized and convert the others into constraints by setting a bound on each of them (Chankong and Haimes35). In this paper, we shall maximize the expected profit-to-go function while constraining the expected risk-to-go function by ε_tτ, defined as the upper bound on the expected risk-to-go function in month τ of year t. We convert the Pareto optimality equations for capacity decisions (12) to the following single-objective optimality equations:

J^1_t(st0, ε_t0) = max_{Kt, Γ^1_{t1}, Γ^2_{t1}} {vt0 + Γ^1_{t1}(Kt, It0, Wt0+vt0) | Kt ∈ K_t(Kt-1), (Γ^1_{t1}, Γ^2_{t1}) ∈ Ω_{t1}}   (25)
s.t. Γ^2_{t1}(Kt, It0, Wt0+vt0) ≤ ε_t0
The optimization above is to find a capacity decision, Kt, such that one maximizes the expected profit-to-go function at year t while the expected risk-to-go function associated with the resulting state must not exceed the risk upper bound ε_t0. We can also describe the solution procedure with the following two steps:

(1) For each Kt, find (Γ^1_{t1}, Γ^2_{t1}). For a given initial state st0 and risk level ε_t0, having selected a capacity decision Kt, we can obtain the Pareto optimal frontier with respect to the resulting state st1, as shown in Figure 6. One should search for a point in the Pareto optimal solution set that maximizes the profit-to-go function while bounding the risk-to-go function at or below the level ε_t0.

(2) Select the optimal Kt. Adjust the capacity decision Kt and repeat step 1 to find the Kt ∈ K_t(Kt-1) that yields the maximum profit-to-go function obtained in step 1.

Figure 6. Selection of a Pareto optimal point for a given state.

In this manner, we find one Pareto optimal point and the corresponding solution for the given state st0 and risk level ε_t0. By successively decreasing ε_t0, we find all points on the Pareto optimal frontier and the corresponding solutions for a given state st0. The risk level ε_t0 then becomes an augmented state that parametrically characterizes the set of Pareto optimal solutions at the beginning of year t.

In the presence of uncertainty in technology development, we augment the state vector with the technology level, denoted by Tt-1, at the beginning of year t. We write the optimality equations as

(P1) J^1_t(st0, ε_t0) = max_{Kt, Γ^1_{t1}, Γ^2_{t1}} {vt0 + E_{Tt|Tt-1} Γ^1_{t1}(Kt, Tt, It0, Wt0+vt0) | Kt ∈ K_t(Kt-1, Tt-1), (Γ^1_{t1}, Γ^2_{t1}) ∈ Ω_{t1}}   (26)
s.t. E_{Tt|Tt-1} Γ^2_{t1}(Kt, Tt, It0, Wt0+vt0) ≤ ε_t0

where we compute the expectation E_{Tt|Tt-1}[·] over the conditional probability that defines the Markov chain modeling the development of the technology. The choice of the expected risk-to-go function at the next stage, Γ^2_{t1}, in general depends on the state at the next stage st1, i.e., on both the selection of the capacity decision Kt and the realization of the random variable Tt. A decision on Γ^2_{t1} seems similar to, but is fundamentally different from, the recourse decision in a two-stage stochastic programming problem. Recourse decisions are independent of one another, and one makes them after the realization of the random variable. Here we must choose Γ^2_{t1} for all realizations of Tt before we know the actual realization of Tt. We must make these choices such that we do not violate the constraint on the expected risk-to-go function at the beginning of year t.

The following theoretical development will show that the concavity of the problem still holds after one incorporates other objectives. Therefore, we can use our previous findings to obtain similar structural results.

Property 1. Γ^1_{t1} is monotonically increasing in Wt1 for a given Γ^2_{t1} with (Γ^1_{t1}, Γ^2_{t1}) ∈ Ω_{t1}.

We can show property 1 from the optimality equations through induction starting from the last period. Crudely, an increase in the profit that we have already made leads to a decrease in the profit target and thus a decrease in the future risk. Therefore, we can make more profitable decisions in the future and still satisfy our risk constraint.

Lemma 4. P1 is equivalent to the following problem P2, in which the constraint set is convex.

(P2) J^1_t(st0, ε_t0) = max_{Kt, Γ^1_{t1}, Γ^2_{t1}, Wt1} {vt0 + E_{Tt|Tt-1} Γ^1_{t1}(Kt, Tt, It0, Wt1) | Kt ∈ K_t(Kt-1, Tt), (Γ^1_{t1}, Γ^2_{t1}) ∈ Ω_{t1}}   (27)
s.t. E_{Tt|Tt-1} Γ^2_{t1}(Kt, Tt, It0, Wt1) ≤ ε_t0
     Wt1 ≤ Wt0 + vt0

Property 1 allows us to establish the equivalence between P1 and P2. Because of the monotone property of Γ^1_{t1} in Wt1, the equality Wt1 = Wt0 + vt0 will hold at optimality in P2. We can show the convexity of the constraint set by observing that, if h(x) is a convex function, then the set Λ = {x | h(x) ≤ 0} is convex. In other words, because the two constraint functions are convex, the feasible set defined by these inequality constraints is convex.

Theorem 3 (Structure of Optimal Capacity Policies for Multi-objective Problems). (a) J^1_t(st0, ε_t0) is jointly concave in (st0, ε_t0) for all t. (b) There exist two parameters K^L_{t,i} and K^H_{t,i} with K^L_{t,i} ≤ K^H_{t,i} for each element in vector Kt-1, which are functions of Kt-1,(i), Tt-1, It0, Wt0, and ε_t0, such that the following expansion/contraction policy is optimal in period t:

Kt,i = { K^L_{t,i}(Kt-1,(i), Tt-1, It0, Wt0, ε_t0)   if Kt-1,i < K^L_{t,i}(Kt-1,(i), Tt-1, It0, Wt0, ε_t0)
         K^H_{t,i}(Kt-1,(i), Tt-1, It0, Wt0, ε_t0)   if Kt-1,i > K^H_{t,i}(Kt-1,(i), Tt-1, It0, Wt0, ε_t0)
         Kt-1,i                                      otherwise                                          (28)
The proof of theorem 3 proceeds similarly to the proof of theorem 1. We can show that optimality equations (27), recursively and backward in time, preserve the concavity of J^1_t in (st0, ε_t0). We can obtain similar results for inventory policies, for which a modified base-stock policy is optimal. The base-stock level, y*_tτ, is a function of Kt, Wtτ, and ε_tτ.

5.3. Some Extensions and Discussions. The above analysis is valid only for a restricted problem class with concave/convex properties, for example, when the cost function is convex and the capacity changes are continuous. Under these simplifying conditions, policies with such structures as those described in eq 28 are optimal. However, in many practical cases, problems are too complicated to be analytically tractable. In these general settings, the structural results can still provide valuable insights about the solution to complex problems and can form the basis for suboptimal control schemes.

The simplest nonconvex cost structure is the fixed-charge linear cost function, which many have used to model the economies of scale in capacity expansion costs,14,23
cK(Kt - Kt-1) = { CE + cE(Kt - Kt-1)   if Kt > Kt-1
                  0                    if Kt = Kt-1    (29)
where CE is the fixed charge associated with capacity expansion, regardless of the amount of the capacity expansion.
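As a hedged sketch (not from the paper) of how the fixed charge of eq 29 changes the optimal structure, the code below prices a capacity change under eq 29 and applies the (s, S)-style expansion rule discussed in the following paragraph: expand up to S only when capacity falls below s, so that every expansion has a minimal worthwhile size. The parameter values are hypothetical.

```python
C_E, c_E_marg = 50.0, 3.0   # hypothetical fixed charge and marginal cost

def expansion_cost(k_new, k_prev):
    """Eq 29: fixed-charge linear cost of a capacity expansion; zero if the
    capacity is left unchanged (contraction is not considered here)."""
    return C_E + c_E_marg * (k_new - k_prev) if k_new > k_prev else 0.0

def s_S_expansion(k_prev, s, S):
    """(s, S)-style expansion rule: expand to S only when capacity falls
    below the expansion point s, so every expansion has size at least S - s."""
    return S if k_prev < s else k_prev
```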
The presence of the fixed charge destroys the convexity of the problem. However, the K convexity that Scarf1 introduces provides the possibility of incorporating economies of scale. On the basis of the concepts and theorem on K convexity, Angelus et al.23 prove the optimality of an (s, S) expansion policy for a capacity expansion problem with a fixed charge. An (s, S) policy states that there exist positive real functions, expansion point s and expansion level S with s ≤ S, such that, if the existing capacity level is lower than s, it is optimal to expand the capacity up to S; it is optimal to stay at the current capacity otherwise. The gap between s and S represents the minimal size of an optimal expansion, the existence of which is due to the capacity-independent cost CE. As CE becomes smaller, this gap becomes smaller, and it eventually vanishes when CE = 0.

Following the same arguments, we may incorporate the economies of scale by enforcing a minimal capacity expansion or production volume. For example, the target interval policy defined by a lower target K^L_t and an upper target K^H_t suggests that one should bring the capacity level into the interval with minimal effort. The existence of the fixed charge CE, however, does not justify a very small capacity change and requires a minimal capacity expansion amount instead. We can define an expansion point K^E_t with K^E_t ≤ K^L_t such that, if the current capacity is lower than K^E_t, we will expand the capacity up to the lower target K^L_t. Similarly, a contraction point K^C_t (K^C_t ≥ K^H_t) denotes the threshold above which the capacity should be brought down to the upper target level K^H_t.

6. Solution Strategies

In the previous section, we found structural results for a certain class of analytically tractable problems. These results are very useful in developing parametrized policies with these known structures for more general and complex problems. We believe that, in alignment with previous work,5-7 a useful scheme to solve realistic problems is as follows: solve a tractable approximation of the real system to obtain structural insights and to propose candidate policies; then evaluate and improve the proposed policies with simulation of the real system. In this section, we develop a simulation-based optimization framework to find the parameters that define the parametric policies that are optimal for a certain problem class and are suboptimal, but hopefully near-optimal, for general problems.

6.1. Parametrized Control Policies. We propose the following guideline to tackle complex problems: (1) propose parametrized policies for complex problems based on structural results from simplified problems and (2) devise numerical solution strategies to compute the values of these parameters. One difficulty in developing the solution strategies is that the parameters themselves are functions of elements of the states. For example, the parameter in the base-stock policy, y*_tτ, is a function of Kt, Wtτ, and ε_tτ. Therefore, we have to search for parameters defined over a subset of the state space, which can still be large as a result of the large number of dimensions and the continuous nature of the states.

To overcome this difficulty at least partially, we first argue that we can drop the dependence of the policy parameters on the risk-to-go level, ε_tτ. Multi-objective optimization involves two search spaces, i.e., the decision space and the objective space.
Figure 7. Search in the two-dimensional objective space.
In a dynamic programming recursion, at each period, we parametrically characterize the Pareto optimal frontier, i.e., the search space with respect to the objectives, and the corresponding solutions for the rest of the future using a set of risk-to-go levels, as illustrated in Figure 7.

The approach we will now describe to solve larger problems is to specify parameter values for a policy of specified structure. We will estimate the expected values of both profit and risk for that policy by simulating and averaging over many demand and technology realizations over the planning horizon. We will then use optimization to find policy parameters that correspond to the points on the Pareto front for the starting state of our problem. We used the risk-to-go function to allow us to decompose the problem and propagate the Pareto fronts backward in time, but we do not need it as part of the state for a simulation that estimates risk only as an outcome at the end of the horizon. As Figure 7 shows, while we carry the search in the objective space only along the Pareto fronts in dynamic programming, we shall search in the two-dimensional objective space starting from a point that is not necessarily in the Pareto set. We know that, in principle, the intermediate points we reach at optimality must reside on the Pareto fronts that we describe above at those times.

To deal with the dependences of the parameters on the other states, i.e., Kt and Wtτ, we propose two alternative approximation schemes: (1) discretize the partial state space into a finite number of regions, for each of which we define a set of parameters; (2) approximate the functional form by a parametric function, e.g., a linear combination of the state variables. The design of approximation schemes must rely on an understanding of the underlying specific problem and a validation through numerical experiments.

6.2. Simulation-Based Optimization Framework. Incorporating the structures of the policies into a model-based optimization, i.e., a mathematical program, would introduce disjunctions and discontinuities and increase the complexity of the problem. In contrast, simulation-based approaches are more computationally tractable for computing parameters when we know the structure of the policies. This is especially true when the system simulation and control calculation are relatively computationally inexpensive.

We propose a simulation-based optimization framework to find the values of the parameters at each stage k ∈ {1, ..., H}, e.g., capacity threshold values and base-stock levels, that optimize multiple objectives over a planning horizon of H stages. The simulation-based optimization framework comprises three modules: simulator, controller, and optimizer, as shown in Figure 8.
Figure 8. Framework for simulation-based optimization.
(1) Simulator. The simulator consists of a simulation model and a random number generator. We represent the simulation model by the system equation
s_{k+1} = f_k(s_k, u_k, ω_k),   k = 1, ..., H   (30)
which describes the system state transition at each stage after we make the decision u_k and observe the random variable ω_k. The random number generator draws a finite number of samples, {ω^1, ..., ω^S}, from the probability distributions to approximate the underlying probability space. Each sample ω^s, also called a scenario, is a sequence of realizations of all random variables in the stochastic process, i.e., ω^s = {ω^s_1, ..., ω^s_H}. We can also generate these scenarios from the probability distributions directly, with a specified probability for each scenario, when the probability distributions are discrete or when we apply a discretization scheme to continuous distributions.36,37

(2) Controller. We represent the controller by the parametrized control policies µ_k(s_k, θ_k), for example, eqs 23 and 24, which map states into capacity and production decisions. The controller accesses the state of the system s_k during the course of the simulation, evaluates the policies that a given parameter vector defines to find the control action, i.e., u_k = µ_k(s_k, θ_k), and implements the decision in the simulation model. Together, the simulator and controller provide a tool to evaluate, on a simulation of the real system, the candidate policies that a parameter vector defines. The simulator evaluates the performance measures of the proposed policies, such as the expected profit and the expected downside risk, and possibly the derivatives of these performance measures with respect to the parameters, and returns these to the optimizer.

(3) Optimizer. The optimizer searches for the Pareto optimal solutions that trade off the competing objectives. It can exploit either nongradient direct search or gradient-based techniques. In a nongradient optimization approach, e.g., an evolutionary algorithm, the optimizer treats the simulation model with the controller as a "black box" that returns a vector of performance measures as output for a given vector of parameters as input. In gradient-based methods, the simulation model provides not only values of the objective functions but also estimates of their derivatives. The optimizer then uses this information to adjust the parameter vector and to improve the parametrized policies through a stochastic version of a gradient search technique, e.g., the Robbins-Monro algorithm5 or a stochastic approximation scheme.7
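The interaction of the three modules can be summarized in a short evaluation loop. The sketch below is a minimal rendering, not our production code: the State and Decision fields, the routines initial_state, draw_sample, f (eq 30), and mu (the parametrized policy), and the measurement of downside risk as the expected shortfall below a profit target (cf. section 7.3.5) are all stand-ins.

    typedef struct { double profit; /* plus capacity, inventory, ... */ } State;
    typedef struct { double expand; double produce; } Decision;

    extern State    initial_state(void);
    extern double   draw_sample(int s, int k);     /* omega_k in scenario s */
    extern State    f(int k, State x, Decision u, double omega);  /* eq 30  */
    extern Decision mu(int k, State x, const double *theta);      /* policy */

    /* Estimate expected profit and expected downside risk for the policy
       that theta defines, averaging over S scenarios of H stages each. */
    void evaluate_policy(const double *theta, int H, int S, double target,
                         double *exp_profit, double *exp_risk)
    {
        double sum_profit = 0.0, sum_risk = 0.0;
        for (int s = 0; s < S; s++) {            /* one run per scenario     */
            State x = initial_state();
            for (int k = 0; k < H; k++)          /* march to the horizon     */
                x = f(k, x, mu(k, x, theta), draw_sample(s, k));
            sum_profit += x.profit;
            if (x.profit < target)               /* downside shortfall       */
                sum_risk += target - x.profit;
        }
        *exp_profit = sum_profit / S;            /* sample-average estimates */
        *exp_risk   = sum_risk / S;
    }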
In summary, we can describe the procedure for finding the Pareto optimal values of the parameters that define the proposed policies as follows: (1) A simulator generates a sequence of samples from the probability distributions and simulates the system represented by the system equations over the planning horizon. (2) At each stage at which the simulator requires a decision, it calls the controller, passing it the current system state. The controller evaluates the policies predefined by a parameter vector and returns the corresponding decision to the simulator. (3) The simulator implements the decision and marches forward until it reaches the end of the planning horizon, which ends one simulation run. After a large number of simulation runs, the simulator estimates the expected values of the performance measures. (4) The optimizer adjusts the parameters, based on the objective function values, to improve and optimize the multiple objectives, i.e., to find points on the Pareto front. The optimizer iterates the above steps, adjusting the parameters at each iteration to improve the performance, until the parameters converge to their optimal values or some stopping criteria are satisfied.

6.3. Multi-objective Evolutionary Algorithm. The optimizer must be capable of solving a multi-objective optimization, in other words, of finding the Pareto optimal solutions to the problem. Cohon38 classifies methods for handling multi-objective optimizations into two categories: generating methods and preference-based methods. Preference approaches take the decision makers' prior preferences into consideration and convert a multi-objective optimization into a single-objective optimization. Generating approaches produce a set of good compromises or tradeoffs, known as Pareto optimal solutions, from which the decision makers select their preferred solution based on a postpreference assessment. We focus on generating methods in this paper, assuming that decision makers do not know their preferences until they see the tradeoffs that are possible.

Traditionally, one converts a multi-objective optimization into a series of single-objective optimizations and solves these single-objective optimizations repeatedly with different parameter settings to find the Pareto optimal set. For example, in the ε-constraint method, as illustrated in Figure 9, we select one objective to be optimized, e.g., maximizing the expected profit, while constraining the other objective(s), e.g., the expected risk, by a bound ε. The constrained single-objective optimization finds one point in the Pareto optimal solution set. By changing the value of ε and solving a single-objective problem for each value, we can trace out the whole Pareto optimal set (a schematic of this scan appears below). A more recent development in multi-objective optimization is to use an evolutionary algorithm that evolves a set of solution points until they all lie along the Pareto surface. The pioneering study on evolutionary multi-objective optimization appeared in the mid-1980s.39 There are numerous applications and a rapidly growing literature in this area.40-42 A multi-objective evolutionary algorithm generates and maintains a population of solutions at each iteration (or generation) rather than a single solution as in mathematical programming.
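In code terms, the ε-constraint scan is an outer loop over risk bounds with one single-objective solve per bound. In the schematic below, solve_profit_with_risk_bound stands in for any single-objective optimizer and is not a routine of our implementation; the parameter count is likewise illustrative.

    #include <stdio.h>

    #define N_PARAMS 64  /* length of the policy parameter vector (illustrative) */

    /* Stand-in: maximize expected profit subject to expected risk <= eps;
       writes the best parameter vector into theta and returns the profit. */
    extern double solve_profit_with_risk_bound(double eps, double theta[N_PARAMS]);

    void epsilon_constraint_scan(double eps_lo, double eps_hi, int n_pts)
    {
        for (int i = 0; i < n_pts; i++) {       /* n_pts >= 2 assumed       */
            double eps = eps_lo + (eps_hi - eps_lo) * i / (n_pts - 1);
            double theta[N_PARAMS];
            double profit = solve_profit_with_risk_bound(eps, theta);
            /* each (eps, profit) pair is one candidate Pareto point */
            printf("risk bound %6.1f -> expected profit %6.1f\n", eps, profit);
        }
    }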
Figure 9. Comparison of classical and evolutionary optimizers.
Figure 10. Probability distributions for technology and demand.
The solution algorithm leads these solutions to converge onto the Pareto optimal frontier while keeping the population diverse, as illustrated in Figure 9. In other words, the algorithm is able to search for multiple Pareto optimal solutions concurrently in a single run, without finding each Pareto optimal point one at a time as in classical multi-objective optimization. In this work, we use a multi-objective evolutionary algorithm in our simulation-based optimization framework for the following reasons: (1) Evolutionary algorithms generate and maintain increasingly improved solutions at each generation. This mechanism is consistent with our guideline for solving complex problems: evaluating and improving the candidate policies through simulation. (2) Evolutionary algorithms are especially suitable for multi-objective optimization because one can exploit how they maintain a diverse population to emphasize all nondominated solutions equally (the dominance test they rely on is sketched below). (3) Evolutionary algorithms are able to handle the nondifferentiability and discontinuity that often appear in a simulation-based optimization approach. For example, incorporating structural policies, e.g., (s, S) policies, into the optimization causes discontinuities at the threshold points. However, evolutionary algorithms can be computationally inefficient because of the lack of gradient information and cannot guarantee global convergence except for restricted problems. Several theoretical works have concentrated on the convergence of multi-objective evolutionary algorithms to the Pareto optimal set.43,44 As an alternative, we could also extend this work to gradient-based approaches using sample path derivatives.5,6 However, all of these approaches are still at the development stage and require further theoretical study and empirical experiments. In this sense, this work is a modest beginning, with the purpose of preliminarily investigating the optimization of structural policies for complex systems.
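For reference, the dominance test on which nondominated sorting rests takes only a few lines for our two objectives (profit maximized, risk minimized); the struct and function names below are illustrative.

    typedef struct { double profit; double risk; } Objectives;

    /* a dominates b if a is no worse in both objectives and strictly
       better in at least one */
    int dominates(Objectives a, Objectives b)
    {
        return a.profit >= b.profit && a.risk <= b.risk &&
               (a.profit > b.profit || a.risk < b.risk);
    }

An evolutionary optimizer retains the solutions that no other population member dominates and applies diversity-preserving selection among them.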
7. Numerical Results

We apply the proposed solution strategies to the example problem of coordinated capacity planning and inventory control, which we introduced in section 3 and formulated in section 4. We employ the simulation-based optimization framework to compute the parameters that define the candidate policies, whose structures we propose based on the analytical results in section 5. We present our numerical results and compare several approximation schemes. Finally, we present and discuss results from a sensitivity analysis, which provides more insight into the underlying problem and its solutions.

7.1. Problem Descriptions and Settings. We consider a coordinated capacity planning and inventory control problem over a 5 year horizon. The firm makes capacity decisions at the beginning of each year to determine the number and type of equipment units of fixed capacity to add or remove; it makes production decisions on a monthly basis to determine the production volume and capacity allocation. The objective is to search simultaneously for capacity and inventory policies over the planning horizon that maximize the expected profit and minimize the expected downside risk. The uncertainty we consider includes both the future demand and the arrival time of technology β, which represent two distinct types of uncertainty: stochastic processes and random events. We assume that the arrival time of β follows a discrete probability distribution; i.e., β arrives in each year with a specified probability, as shown in Figure 10. Future demand in each month follows a normal distribution, with a mean value that increases linearly, 1% per month, over time, as shown in Figure 10.
Figure 11. Convergence of solution populations to a Pareto optimal front.
Figure 12. Comparison of different approximation schemes.
We then generate a finite number of scenarios, each of which is a sequence of realizations of all random variables, for both the technology evolution and the demand process. We enumerate five scenarios for technology development, in which β becomes available in year 1, 2, 3, or 4 or after year 5. We generate a number of scenarios, e.g., 100, for the demand process by sampling the underlying probability distributions.

We consider target interval policies and base-stock policies as the candidate policies for capacity and inventory decisions, respectively. As shown in section 5, these structural policies are optimal for a certain class of problems, e.g., those with convex cost functions, and in general suboptimal for complex problems in the sense that the search space is restricted to a certain class of policies. We thus simplify the problem to finding the optimal parameters that define these candidate policies. We could add another outer loop to the simulation-based optimization framework to adjust and improve the structures of the policies, but this is beyond the scope of this work.

We implement the simulation-based optimization solution strategy in a C program. We choose a multi-objective evolutionary algorithm (the nondominated sorting genetic algorithm that Deb et al.45 developed) as the multi-objective optimizer. The simulator consists of a system of difference equations and the random number generation library "randlib", which Brown et al. at the University of Texas developed using the approach of L'Ecuyer and Cote.46

7.2. Solution Results and Comparisons. As a starting point, we first choose a set of parameters that are independent of the state for the candidate policies. For example, we define one base-stock level, regardless of the state, for every month. In this manner, we simplify the problem to one of finding the threshold points for capacity decisions in each year and the base-stock levels for production decisions in each month over the 5 year horizon. Figure 11 shows the population of 50 solutions from generation 300 to generation 1000. The solutions converge to a nondominated frontier in about 1000 generations and, at the same time, remain widely spread along the frontier.
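Inside the controller, this state-independent base-stock rule reduces to a few lines, following the form derived in Appendix B; cap_limit stands for the capacity limitation K_t/A, and the names are illustrative.

    /* Produce up to the base-stock level, truncated by available capacity. */
    double base_stock_production(double y_star,    /* base-stock level        */
                                 double inventory, /* current inventory       */
                                 double cap_limit) /* K_t/A = min_i K_t,i/A_i */
    {
        if (inventory >= y_star)
            return 0.0;                  /* at or above the level: idle */
        double gap = y_star - inventory; /* produce up to the level     */
        return gap < cap_limit ? gap : cap_limit;
    }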
We next incorporate the dependence of the parameters on a partial state space into the candidate policies, as indicated by the analytical results in section 5. We define the capacity targets as functions of the technology available, the capacity levels of the other resources, the inventory level, and the accumulated wealth. Similarly, we define the base-stock levels as functions of the capacity levels of all resources and the accumulated wealth, i.e.,

K_t,i^L = K_t,i^L(K_t-1,(i), T_t, I_t0, W_t0)   (31)

K_t,i^H = K_t,i^H(K_t-1,(i), T_t, I_t0, W_t0)   (32)

y*_tτ = y*_tτ(K_t, W_tτ)   (33)
or, equivalently, in general functional form
θ_k = θ_k(s_k),   k = 1, ..., H   (34)
We propose two alternative schemes to approximate the above functions: (1) we partition the partial state space into a few regions, in each of which the parameters are constant; (2) we use a parametric function to approximate the above functions. One type of parametric function approximation with computational advantages is the linear function
θ_k = ζ_k^0 + ζ_k^1 s_k^1 + ζ_k^2 s_k^2 + ...   (35)
where s_k^n represents the nth state variable in function (34). We can then compute the values of the parameters ζ_k using our simulation-based optimization approach. For the first scheme, we segment the state space coarsely into two regions along each dimension: we define a range for the corresponding state variable and a threshold point that divides the range into "high" and "low" regions. For the second, we define the parameters as linear combinations of the state variables and search for the optimal values of the coefficients in the linear function (35).

We conducted several numerical experiments with these two approximation schemes. Figure 12 compares the Pareto optimal frontiers found by the previous approach, in which the parameters are state independent (indicated by 1), with those of the two approaches in which a piecewise constant function and a linear function approximation are used (indicated by 2 and 3, respectively). Solutions 2 and 3 dominate solution 1. The explanation is straightforward: the search space in approximation scheme 1 is a subset of the search spaces in schemes 2 and 3. However, because of the larger search space and problem complexity, the algorithm usually takes more generations and a longer time to converge, e.g., about 3000 generations.
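Evaluating eq 35 inside the controller is a one-line affine map per policy parameter; a minimal sketch (names illustrative) is

    /* Policy parameter as an affine combination of the n state
       variables; zeta[0] holds the intercept zeta_k^0. */
    double linear_parameter(const double *zeta, const double *s, int n)
    {
        double theta = zeta[0];
        for (int i = 0; i < n; i++)
            theta += zeta[i + 1] * s[i];  /* + zeta_k^i * s_k^i */
        return theta;
    }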
Table 1. Comparison of Different Approximation Schemes

 scheme  function approximation  expected profit  expected risk  no. of params  no. of generations  computation time (min)^a
 1       independent of states   525.5-555.0      83.1-90.0      64             ~1000               ~20
 2       piecewise constant      552.8-562.4      80.4-82.3      512            ~5000               ~240
 3       linear combination      544.8-561.5      78.6-84.3      193            ~3000               ~90

 ^a Computation time is estimated on an Intel III 933 MHz computer.
Figure 13. Effect of increasing the number of scenarios.
Figure 14. Impact of the uncertainty in the arrival time of β.
Table 1 briefly compares the three approaches in terms of the objective values, computational requirements, etc. Table 1 indicates that the selection of the approximation scheme is itself a multi-objective problem: one has to compromise between the quality of the approximate solutions and the computational effort needed to obtain them. Approach 2 achieves results comparable to those of approach 3, with each dominating the other in different regions. However, the computational effort required by approach 2 is overwhelming, even for a two-piece partition. The number of parameters, which determines the computation time, is a polynomial function of the number of partitions and can easily grow out of hand as we segment the state space more finely. Approach 3 finds comparably good solutions in terms of both the objective values and the diversity of the solutions, and it requires only modest computation time. Therefore, for this problem, we find approach 3 to be the best approximation scheme when both the quality of the solutions and the computational requirement are considered. Adding more terms, e.g., second-order terms, to the approximate function (35) would further increase the computation.

7.3. Sensitivity Analysis and Discussions. We conduct several sensitivity analyses in this subsection to gain more insight into the underlying problem and its solutions. We present several numerical experiments that demonstrate how the solutions change when the number of scenarios, the probability distribution of the arrival time of β, the variances of the demand distributions, and other factors in the problem settings change.

7.3.1. Effect of the Number of Scenarios for Demand. We increase the number of scenarios that represent the uncertain demand process from 100 in the base case to 500 and 1000. Figure 13 shows the Pareto optimal frontiers found under these three settings. The Pareto fronts differ only slightly as the number of scenarios changes (note the horizontal scale). The simulation-based optimization approach generates a finite number of scenarios to approximate the probability space and uses the sample average to approximate the expected value.
Table 2. Comparison of the Probability Distributions of Technology Development

 case         year 1  year 2  year 3  year 4  year 5+
 benchmark    0.3     0.2     0.2     0.1     0.2
 variation 1  0.3     0.2     0.1     0.1     0.3
 variation 2  0.2     0.2     0.2     0.1     0.3
By the law of large numbers, for a given decision strategy, the sample average converges to the corresponding expectation with probability 1 as the sample size grows. In other words, the approximate solution will asymptotically converge to the exact optimal solution as the number of scenarios increases to infinity. We must, however, take the computational requirement into consideration. For example, the problems with 500 and 1000 scenarios take about 500 and 1100 min to converge, respectively, compared with only 90 min for the problem with 100 scenarios. In this particular case, increasing the number of scenarios from 100 to 500 incurs a large extra computation time but changes the solution only slightly. Thus, the choice of 100 scenarios seems better than the choice of 500 from both the approximation accuracy and the computational points of view. This assessment is problem dependent and requires extensive numerical experiments. Some research has been dedicated to studying the convergence rate of simulation-based optimization and to developing statistical inference for estimating the approximation error and validating the optimality of a computed solution.47,48

7.3.2. Impact of Uncertainty in the Arrival Time of β. We next change the probability distribution of the arrival time of the new technology β. We consider two variations of the benchmark case (Table 2), both of which assume that β will become available later than in the original case. Figure 14 shows the Pareto optimal frontiers for the three cases. As we shift the probability distribution toward the future, we find both a higher expected profit and a lower expected risk.
Figure 15. Impact of the variance of the demand process.
Figure 16. Effect of the capacity increment size.
Decreasing the first-year probability from 0.3 to 0.2 has a very large impact on the performance, indicating that we should devote effort to accurately assessing the timing and magnitude of technology advances.

7.3.3. Impact of the Variances of the Demand Processes. Figure 15 shows the sensitivity to reducing the variance of the probability distributions for the demand process from 30% to 25% and then to 20%. Reducing the variance leads to both a higher expected profit and a lower expected risk, as we should expect. At the same time, the spread of the Pareto optimal front decreases. In the extreme, when the future demand is known with certainty, the solution that maximizes profit would also minimize risk, and the Pareto front would collapse to a single point.

7.3.4. Effect of the Increment Size of Capacity. We assume that the firm can add or remove capacity as a multiple of a given increment size, and we examine here the sensitivity of the solution to this increment size. Figure 16 shows the Pareto fronts when the increment size is 2, 3, or 4 units. Interestingly, when the increment size is 2, the Pareto front consists of two isolated pieces, one of which is close to the Pareto front found in the other cases, while the other has a significantly lower expected profit and lower expected risk. The lower piece of the Pareto front corresponds to a capacity expansion decision of 10 units, which is not feasible in the other two cases, where the increment size is 3 or 4.
Figure 17. Effect of the profit target level.
Therefore, a change in the capacity increment size not only can affect the values of the Pareto optimal solutions but also can change the shape of the Pareto front dramatically.

7.3.5. Effect of the Profit Target Level. Decision makers specify a target profit based on their understanding of downside risk. They may, for example, consider any profit below zero, i.e., a loss, to be risky. Figure 17 shows the effect of choosing three different target levels of 0, 50, and 100. Because it is more difficult to reach higher targets, the Pareto optimal front shifts toward higher expected risk as the profit target level increases. Because the profit is not uniformly distributed, the expected risk need not increase linearly with the profit target: here, the increase in the expected risk when we raise the profit target from 0 to 100 is more than twice the increase when we raise it from 0 to 50.

8. Conclusions

We consider multi-objective decision making for coordinated capacity planning and inventory control problems under uncertainty. We formulate this type of problem as a Markov decision process that incorporates uncertainties in technology evolution and the demand process. The decision process integrates strategic decisions, e.g., capacity expansion, made each year and operational decisions, e.g., production, made each month. Through a convexity analysis within the dynamic programming framework, we prove that target interval and base-stock policies lead to optimal tradeoff curves between the expected profit and the downside risk for a class of simple problems. We use these structures to develop suboptimal, but hopefully good, control structures for more general problems, reducing the general problem to one of finding policy parameters. Using a simulation-based optimization framework, we illustrate our proposed solution strategy on an example problem, for which we also carry out a sensitivity analysis. This work is a modest beginning in the optimization of structural policies for complex problems. Important future extensions include investigating the quality of the suboptimal policies relative to the exact optimal solutions. The idea of adjusting and improving the structures of the candidate policies in an outer loop is also worth further investigation.
Appendix A: Proof of Theorem 1

Proof. Assume that J_{t+1}^1 is concave in K_t and I_{t+1,0}, which holds at the last period because J_{N+1}^1 is concave in K_N and I_{N+1,0} from eqs 17 and 5. Thus, Γ_{t,M+1}^1 is concave in K_t and I_{t,M+1} from eq 15. Because the operating profit function v_tτ given in eq 3 is concave in K_t, I_tτ, and y_tτ from lemma 1, we can then prove by induction from eq 22, using lemmas 1 and 2, that Γ_tτ^1 is concave in K_t and I_tτ for all τ. In other words, Γ_t1^1 is jointly concave in K_t and I_t1. Because the capacity adjustment revenue/cost function c_C(K_t - K_t-1) given in eq 2 is jointly concave in (K_t, K_t-1), from eq 21 we can prove by lemma 2 that J_t^1 is jointly concave in (K_t, I_t1). Therefore, the concavity of the objective-to-go function is preserved from t + 1 to t. The result in part a then follows from a backward induction from period N + 1 to period 1. The proof of part b is similar to the proof given by Eberly and Van Mieghem21 in their work on investment under uncertainty without inventory, except that the thresholds K_t^L and K_t^H are now also functions of the inventory level I_t1.

Appendix B: Proof of Theorem 2

Proof. As shown in the proof of theorem 1, Γ_tτ^1 is jointly concave in K_t and I_tτ for all τ and t. By lemma 1, the objective function for the production decision at month τ of year t
E_Dtτ[v_tτ + Γ_{t,τ+1}^1(K_t, y_tτ - D_tτ)]
is concave in y_tτ for each K_t. Let y*_tτ(K_t) be the unique solution to eq 22 over [0, ∞). The optimal production decision is then given by the base-stock policy
y_tτ - I_tτ = min[y*_tτ(K_t) - I_tτ, K_t/A]   if I_tτ < y*_tτ(K_t)
            = 0                               if I_tτ ≥ y*_tτ(K_t)
where K_t/A denotes min_i {K_t,i/A_i}, which represents the capacity limitation.

Literature Cited

(1) Scarf, H. The Optimality of (s, S) Policies in the Dynamic Inventory Problem. Mathematical Methods in the Social Sciences; Stanford University Press: Stanford, CA, 1960; pp 196-202.
(2) Porteus, E. L. On the Optimality of Generalized (s, S) Policies. Manage. Sci. 1971, 17 (7), 411-426.
(3) DeCroix, G. A.; Arreola-Risa, A. Optimal Production and Inventory Policy for Multiple Products under Resource Constraints. Manage. Sci. 1998, 44 (7), 950-961.
(4) Kapuscinski, R.; Tayur, S. A Capacitated Production-Inventory Model with Periodic Demand. Oper. Res. 1998, 46 (6), 899-911.
(5) Glasserman, P.; Tayur, S. Sensitivity Analysis for Base-stock Levels in Multiechelon Production-Inventory Systems. Manage. Sci. 1995, 41 (2), 263-281.
(6) Fu, M. C. Sample Path Derivatives for (s, S) Inventory Systems. Oper. Res. 1994, 42 (2), 351-364.
(7) Bashyam, S.; Fu, M. C.; Kaku, B. K. Application of Perturbation Analysis to Multiproduct Capacitated Production-Inventory Control. Int. J. Oper. Quant. Manage. 1998, submitted for publication.
(8) Kall, P.; Wallace, S. W. Stochastic Programming; John Wiley & Sons: New York, 1994.
(9) Birge, J. R.; Louveaux, F. Introduction to Stochastic Programming; Springer: New York, 1997.
(10) Bertsekas, D. P. Dynamic Programming and Optimal Control; Athena Scientific: Belmont, MA, 2000; Vols. I and II.
(11) Puterman, M. L. Markov Decision Processes: Discrete Stochastic Dynamic Programming; John Wiley & Sons: New York, 1994.
(12) Eppen, G. D.; Martin, R. K.; Schrage, L. A Scenario Approach to Capacity Planning. Oper. Res. 1989, 37 (4), 517-527.
(13) Barahona, F.; Bermon, S.; Gunluk, O.; Hood, S. Robust Capacity Planning in Semiconductor Manufacturing; IBM Report RC22196; IBM: New York, 2001.
(14) Ahmed, S.; Sahinidis, N. V. An Approximation Scheme for Stochastic Integer Programs Arising in Capacity Expansion. Oper. Res. 2002, in press.
(15) Ahmed, S.; King, A. J.; Parija, G. A Multi-Stage Stochastic Integer Programming Approach for Capacity Expansion under Uncertainty. J. Global Optim. 2002, in press.
(16) Ahmed, S.; Sahinidis, N. V. Robust Process Planning under Uncertainty. Ind. Eng. Chem. Res. 1998, 37 (5), 1883-1892.
(17) Barbaro, A.; Bagajewicz, M. J. Managing Financial Risk in Planning under Uncertainty, Parts I and II. Working paper, 2004.
(18) Manne, A. S. Capacity Expansion and Probabilistic Growth. Econometrica 1961, 29 (4), 632-649.
(19) Davis, M. H. A.; Dempster, M. A. H.; Sethi, S. P.; Vermes, D. Optimal Capacity Expansion under Uncertainty. Adv. Appl. Prob. 1987, 19, 156-176.
(20) Bean, J. C.; Higle, J. L.; Smith, R. L. Capacity Expansion under Stochastic Demands. Oper. Res. 1992, 40 (2), S210-S216.
(21) Eberly, J. C.; Van Mieghem, J. A. Multi-factor Dynamic Investment under Uncertainty. J. Econ. Theory 1997, 75, 345-387.
(22) Harrison, J. M.; Van Mieghem, J. A. Multi-resource Investment Strategies: Operational Hedging under Demand Uncertainty. Eur. J. Oper. Res. 1999, 113, 17-29.
(23) Angelus, A.; Porteus, E. L.; Wood, S. C. Optimal Sizing and Timing of Capacity Expansions with Implications for Modular Semiconductor Wafer Fabs. Research Paper No. 1479R, Graduate School of Business, Stanford University, Stanford, CA, 1999.
(24) Rajagopalan, S.; Singh, M. R.; Morton, T. E. Capacity Expansion and Replacement in Growing Markets with Uncertain Technological Breakthroughs. Manage. Sci. 1998, 44 (1), 12-30.
(25) Angelus, A.; Porteus, E. L. Simultaneous Production and Capacity Management under Stochastic Demand for Produced to Stock Goods. Research Paper No. 1419R, Graduate School of Business, Stanford University, Stanford, CA, 2000.
(26) Rajagopalan, S.; Swaminathan, J. M. A Coordinated Production Planning Model with Capacity Expansion and Inventory Management. Manage. Sci. 2001, 47 (11), 1562-1580.
(27) Bradley, J. R.; Glynn, P. W. Managing Capacity and Inventory Jointly in Manufacturing Systems. Manage. Sci. 2002, 48 (2), 273-288.
(28) Cheng, L.; Subrahmanian, E.; Westerberg, A. W. Multi-Objective Decision Processes: Applications, Problem Formulations, and Solution Strategies. Manage. Sci. 2003, submitted for publication.
(29) Bellman, R. E. Dynamic Programming; Princeton University Press: Princeton, NJ, 1957.
(30) Li, D.; Haimes, Y. Y. Multi-objective Dynamic Programming: The State of the Art. Control Theory Adv. Technol. 1989, 5 (4), 471-483.
(31) Li, D. Multiple Objectives and Non-Separability in Stochastic Dynamic Programming. Int. J. Syst. Sci. 1990, 21 (5), 933-950.
(32) Cheng, L.; Subrahmanian, E.; Westerberg, A. W. Design and Planning under Uncertainty: Issues on Problem Formulation and Solution. Comput. Chem. Eng. 2003, 27 (6), 781-801.
(33) Heyman, D.; Sobel, M. Stochastic Models in Operations Research; McGraw-Hill: New York, 1984; Vol. II.
(34) Evans, R. Inventory Control of a Multiproduct System with a Limited Production Resource. Nav. Res. Logistics Q. 1967, 14, 173-184.
(35) Chankong, V.; Haimes, Y. Y. Multi-objective Decision Making: Theory and Methodology; North-Holland Series in System Science and Engineering; North-Holland: Amsterdam, The Netherlands, 1983.
(36) Hoyland, K.; Wallace, S. W. Generating Scenario Trees for Multistage Decision Problems. Manage. Sci. 2001, 47 (2), 295-307.
(37) Dupacova, J.; Consigli, G.; Wallace, S. W. Generating Scenarios for Multistage Stochastic Programs. Ann. Oper. Res. 2000, 100, 25-53.
(38) Cohon, J. L. Multicriteria Programming: Brief Review and Application. Design Optimization; Academic Press: New York, 1985; pp 163-191.
(39) Schaffer, J. D. Some Experiments in Machine Learning Using Vector Evaluated Genetic Algorithms. Ph.D. Thesis, Vanderbilt University, Nashville, TN, 1984.
(40) Srinivas, N.; Deb, K. Multi-objective Function Optimization Using Nondominated Sorting Genetic Algorithms. Evol. Comput. J. 1994, 2 (3), 221-248.
(41) Fonseca, C. M.; Fleming, P. J. An Overview of Evolutionary Algorithms in Multi-objective Optimization. Evol. Comput. J. 1995, 3 (1), 1-16.
(42) Horn, J.; Nafploitis, N.; Goldberg, D. A Niched Pareto Genetic Algorithm for Multi-Objective Optimization. Proceedings of the First IEEE Conference on Evolutionary Computation; IEEE: New York, 1994; pp 82-87.
(43) Rudolph, G.; Agapie, A. Convergence Properties of Some Multi-Objective Evolutionary Algorithms. Proceedings of the 2000 Conference on Evolutionary Computation; IEEE Press: Piscataway, NJ, 2000; pp 1010-1016.
(44) Hanne, T. On the Convergence of Multi-objective Evolutionary Algorithms. Eur. J. Oper. Res. 2000, 117 (3), 553-564.
(45) Deb, K.; Agrawal, S.; Pratap, A.; Meyarivan, T. A Fast Elitist Non-Dominated Sorting Genetic Algorithm for Multi-Objective Optimization: NSGA-II. Proceedings of Parallel Problem Solving from Nature VI; Springer-Verlag: New York, 2000; pp 849-858.
(46) L'Ecuyer, P.; Cote, S. Implementing a Random Number Package with Splitting Facilities. ACM Trans. Math. Software 1991, 17, 98-111.
(47) Shapiro, A.; Homem-de-Mello, T. A Simulation-based Approach to Two-stage Stochastic Programming with Recourse. Math. Program. 1998, 81, 301-325.
(48) Shapiro, A.; Homem-de-Mello, T. On the Rate of Convergence of Optimal Solutions of Monte Carlo Approximations of Stochastic Programs. SIAM J. Optim. 2000, 11, 70-86.
Received for review April 21, 2003
Revised manuscript received February 6, 2004
Accepted February 23, 2004

IE0303404