APPLICATIONS OF SUBOPTIMALITY IN DYNAMIC PROGRAMMING

Note: In lieu of an abstract, this is the article's first page. Click to increase image size Free first page. View: PDF. Related Content ...
0 downloads 0 Views 7MB Size
OPERATIONS RESEARCH SYMPOSIUM

he discovery of the principle of optimality, as well as T its first application, resulted from studiesof the operation of hydroelectric reservoirs (6). I t led to the recurrence method of optimization, generalized, and considerably developed by Bellman, who gave it the name dynamic programming. A r i s ( I ) , Roberts (8), Nemhauser (7), and Wilde ( I I ) , in particular, have studied its applications to the problems of optimization in chemical engineering. A chemical plant often consists of equipment connected in series. For each item "a" of the plant it may be possible to derive the cost functionjr(xu q,), whae x, is the decision variable, the establishment of which defines the size of component i for the production capacity, q,. The total optimized cost of the installation will be of the form

F(x1. . . x , .

. .X",

61. . .q,. ..q") =

c j,(x,,

. .XI,

4,.

1

Of

Suboptimality in Dynamic Programming L. THlRlET A. DELEDICQ

i-I

q,)

I-1

I t can be determined by various methods, in particular the dynamic programming method. The object of economic optimization in chemical engineering is to determine this function

F(x1. . .XI.

Applications

..q, . . .q")

for definite values of qn, the planned production levels. The cost functions j,(xu q,) must not only cover the capital expenditure needed for the construction and commissioning of the plant, but also the operating costs required by actual operation. These operating cmts are inescapably bound up with the development of the technical and economic situation up to the horizon (variation in costs of materials, labor, productivity. .); the time factor is therefore an essential part of the optimization of a single item of chemical engineering plant, as well as of that of a complete works, or a program of works, and the influence of time is essentially the same. Once assumptions have been made as to the life of each piece of equipment and the development of the cost situation up to the horizon in question, it is possible to optimize the processes. This optimization, although well defined, is limited to the determination of the optimum configuration of the parameters characterizing the production plant with absolute precision.

.

The concept of suboptimality, which considers a series of less acceptable alternatives to optimum conditions, combined

with information on variables not used in formulating the model leads to optimum location and construction of a nuclear fuel reprocessing plant

V O L 6 0 NO. 2 F E B R U A R Y 1 9 6 8

23

However, such oversimplification is rarely acceptame in real decision problems. It is regrettable and often dangerous to ignore not only political or other aspects of the problems, but even certain indirect economic effects linked with the choice of the size of the items of plant. These additional decision elements are usually taken into account intuitively and seldom quantitatively at the moment of choice. Therefore quantification and inclusion of them into the optimization model, at least as regards certain indirect economic dects, represent progress in the sense of the determination of a more total optimum. This paper shows how to take account, in the optimization, of such indirect economic effects. The proposed method is based on the concept of suboptimality and its application to processes which can be represented by sequential graphs. First we recall how to represent a dynamic program by a sequential graph. We then describe a new algorithm for determining suboptimal policies in a sequential graph. Finally we apply the concept of suboptimality to take into account indirect effectslinked with alternate policies to the problem of decision making under uncertainty and to problems of optimum plant location.

DEFINITION AND DETERMINATION O F T H E SUBOPTIMUM POLICJES I N A SEQUENTIAL GRAPH Suboptimum Policy in a Sequential Omph

An optimization problem can be represented by a sequential graph and consequently is amenable to dynamic programming if, and only if, the function to be optimized can be put in the form ofa finite sum of terms, each term being a function of a decision variable which can assume a finite number of discrete values. Figure 1 shows such a sequential graph. Let 0, 1, 2, 3, . . .n. N, be the moments when the state of the process is being considered and when a decision has to be taken. At the time n, the state of the process is marked by a vertex

.

x.'[j

= 1, 2,

. . .,PWI

+

from which it is possible at time n 1 to be in one of the states of a set of new vertices rx.' reachable directly from x,,'. The decision taken at time n consists of choosing the state x.+l'el'x.'. The cost of this decision is written V(x.', x,,+.+13. A policy is a set of vertices XO, ~1'1, . . .x,.,'~ which mark the successive movements of the proces from the time 0 to the time N . The objective is to determine a policy allowing one to go from the initial state xo to a final state x d , while minimizing the total cost C:

+

+ .. . +

xN'T

c

~ ( X O Xi'1) ,

24

INDUSTRIAL AND ENGINEERING CHEMISTRY

V(Xi'1,

xs")

v(XN-l'N-l,

4' \

--I

N

0

F i p e 1. Rcprescnfah saqudntial graph

, dimer

_I q,timization,

I

ma

We assume a single initial state XO; if it were not so, it would be sufficient to introduce a state x-1 connected to all the states XO' by zero cost arcs. We write

For the value of suboptimum policy of rank k (there are k - 1 better policies than this suboptimum policy). Similarly fn.N(k)(xJ) is the value of the kth best policy from x,,'. Algorithms for Determination of SubopIimum Policies

Bellman and Kahba's Algorithm consists of a systematic exploration of all possible policies from all the vertices of the graph. The i-optimum policy from x.' is the ith of the policies which consist in choosing a vertex x,,+.lWx,,' and from this vertex following an m-optimum policy such that m i. We then calculate by recurrence from the vertices x.' to the vertex xo the 1, 2, . . .k-optimum policies from all the vertices of the graph. Kaufmann and Cruon's Algorithm (5) is as follows: Assume the 1-optimum policy from x.' to be found and assume that it consists in first choosing x.+1-" and following the 1-optimum policy from the vertex. Then the 2-optimum policy from x.' is the best of the following policies: either choose x n + P and follow the 2-optimum policy from this vertex, or choose xI+le differing from X"+I-" and follow the 1-optimum policy from this vertex. The number of policies to be compared at each stage of calculation is much lower than that of the previous