Optimization of Nonlinear Functions Subject to Equality Constraints

Optimization of Nonlinear Functions Subject to Equality Constraints. Judicious Use of Elementary Calculus and Random Numbers. Rein Luus, and Taina H. ...
0 downloads 0 Views 407KB Size
Optimization of Nonlinear Functions Subject to Equality Constraints. Judicious Use of Elementary Calculus and Random Numbers Rein Luus* and Taina H. I. Jaakola Department of Chemical Engineering, University of Toronto, Toronto, Canada

The numerical solution procedure of a basic problem of mathematical programming frequently encountered by chemical engineers i s considered. The problem consists of minimizing (or maximizing) a function f(x1, x 2 , x,) subject to a set of equality constraints & ( x l , x2, x,) = 0,j = 1 , m with m 6 n. By using elementary calculus, the problem can be reduced to solving a set of nonlinear algebraic equations. Since the convergence of the standard solution procedure i s dependent on the choice of “good” starting values much uncertainty and indecision can be removed simply by using random numbers over a range. The effectiveness of using random starting conditions to solve algebraic equations i s tested with various examples,

...,

...,

I n the recent literature numerous papers have appeared on the numerical solution of constrained minimization (or maximization) problems. AIiele, et a2. (1972), proposed the introduction of penalty functions and a modified quasilinearization algorithm to solve the resulting problem. Loonkar and Robinson (1970) suggested the use of second-degree Lagrangian polynomial to fit the last three function iterations and to calculate the root of the polynomial which is then used as a next guess. The illustration of the procedure by Loonkar and Robinson with the problem of minimizing the capital investment for batch processes yielded an answer close to the optimum. Their answer mas improved by Hillinckx and Rijckaert (1971) by using a logarithmic transformation and then by solving the problem by geomet.ric programming. There are numerous other examples where procedures requiring sophist’icated computer programming are presented to solve t’he type of optimization problems which are frequently encountered by chemical engineers. *kt this stage one is ready to ask what’ has happened to the straightforward elementary calculus approach which is taught to engineering students in their sophomore years. Is that approach obsolete or is it simply overlooked by some computer programming enthusiasts? The purpose of this paper is to show that the basic approach using calculus can be applied directly t,o many problems. Also, the use of Lagrange multipliers is still an efficient way of solving problems where the equality constraints are of a complicated nature, provided that the starting conditions are sufficiently good. The latter can be ensured by taking a set of starting conditions (let us say 100) a t random but within the range of interest. The aim of this paper is to outline and test the procedure which we have found very useful, so that chemical engineers will have access to a n easily understandable and effective approach to a class of optimization problems.

...,

It is assumed thatf and @r are in class C1 (i.e., possess first and second derivatives and the derivatives are continuous), and that the constrained minimum exists. Solution Procedure

Case 1. m = n. When the number of constraint equations is equal t o the number of variables and if all the constraint equations are independent of each other, no minimization can be carried out. The problem simply reduces to solving the constraint equations. We illustrate this with the optimization of a continuous through-circulation dryer in example 1. Case 2. m < n. (a) Simple Constraint Equations. If the constraints are sufficiently simple, one variable can be removed for each constraint equabion and the problem reduces to minimizing a function of (n - m) variables. This case is illustrated with the problem of minimizing of capital investment in example 2. (b) Nonlinear Constraint Equations. If the solution of constraint equations is not immediately apparent and if m < n, introduce the augmented performance index I = f(x1,

,

.

,,2,)

(1)

, ,

380

0 j = 1, 2, . . ., m

Ind. Eng. Chem. Process Des. Develop., Vol. 1 2 , No. 3, 1973

(2)

Xj&j(2l, 2 2 ,

, , ,

, 2,)

(3)

j=1

1,2, . . , , n

=

(4)

+

+

yi

and the (m

=

xi i j

I, 2, . . ., n

= =

1, 2, . ., m ,

+ n) dimensional vector z such that zi

=

m

+

Equations 4 and 2 are m + n algebraic equations in the m n unknowns 21, 2 2 , . , . , x,, XI, X2, . . . , .A , To solve eq 4 and 2, introduce the (m n) dimensional vector y such that

yn+j = X,

subject bo the constraints

@,(zl, r2, . ,2),

., 2,)

dI/dxi = 0 i

Consider the problem of niinimizing 22,

, ,

where X j are the Lagrange multipliers. Then the necessary conditions for the minimum of f are that

Problem Definition

c = f(Xl,

22,

=

z,+,

dI/dxi i = @i

=

1, 2, . . ., n

j = 1,2, . . ., m

(5)

The solution to the problem lies in choosing the appropriate value for y such that

Z(Y)

=

0

(7)

In both case 1 and case 2 we confront the problem of solving a set of algebraic equations. Let us illustrate the procedure which may be used to solve eq 7. The reader can adapt the procedure directly to ciLse 1 and case 2a where no Lagrange mutipliers are used. To obtain a value for y such that eq 7 is satisfied usually requires the use of a numerical procedure. Given a value for y for which eq 7 is not satisfied, it is then required to calculate a change 6y such that z(y 6y) will be as close to zero as possible. To the first approximation

+

Z(Y

+ 6Y)

=

Z(Y)

=:

P

=

0.0064z1[1 - e ~ p ( - O . l 8 4 x ~ ~ ~ ~ z ~ )(12) ]

subject to the power constraint (3000

+

Z I ) X I ~ ~= Z

1.2 X l O l a

(13)

'

and the moisture content distribution constraint exp(0.184;21~~~z~) = 4.1

+ (bZT/3Y)6Y

(8)

where T denotes the transpose, and a natural choice is

6y

sideration of the physical problem shows that for maximum production rate the motors for the circulating blowers will be operating a t the maximum allowed level and that the moisture content constraint is tight. Hence the inequality constraints may be replaced by equality constraints and the problem is to maximize P given by

- (dzT/dy)-'z(y)

(9)

(14)

This problem falls into the case where the number of equalities equals the number of variables. There is thus no degrees of freedom left for optimization. Solving the two equality constraints

I n eq 9 it is necessary t o invert the matrix

(3000

+

Z I ) X I ~ Z Z=

1.2 X l O l a

(15)

In 4.1

(16)

and This can be readily done if the matrix is nonsingular, as is the case with many engineering systems. Note that the sequence in which the variables have been defined in eq 5 and 6 ensure that the (n m) dimensional matrix given by eq 10 is symmetric. By numerical procedure eq 7 can only be satisfied approximately. A convenient criterion to end iterations is to specify that

0.18421~

=

presents no problem. By dividing, we get

+

(3000

+

21)21= ~ , 0.184 ~

x

1.2

x

1013/ln 4.1

(17)

By using eq 9 in scalar form, eq 17 can be solved easily by the iterative procedure %'(1+1)

=

21(j)

-

i=l

where E is taken to be a small number, for example, 10-6 or 1.0 depending on the problem. Also, it should be noted that eq 4 is only a necessary condition to be satisfied a t a stationary point such as a minimum or a maximum. Therefore it is essential that various starting conditions be used to ensure that a proper answer is obtained. The numerical examples, illustrate this point. To start the iteratiw procedure we suggest taking a large number of initial conditions for yi, i = 1, 2, . . . , m n, a t random over a given range, for example, 100 different sets of initial conditions for a typical problem. Then the resulting solutions can be readily examined for the proper answer. The digital computers which are now available are so fast that a typical problem may be solved in this manner in less than 1 min of computer time. I t is only when this approach does not yield success, that we suggest using other approaches. It is not difficult to construct problems where difficulties can arise in inverting the matrix, so we do not claim that all moblems can be solved by this approach. However, we do suggest that this simple procedure be attempted before more complicated means are adopted. We provide various examples to show the effectiveness of this approach in giving the answer with no appreciable effort.

+

Thus, having ZI the maximum production rate given by eq 12 can be calculated from

P

=

0.006421(3.1/4.1)

(19)

Starting with x1 = 25,000, in three iterations by using eq 18 we obtained convergence to 21 = 31,766, P = 153.71, and z2 = 0.342. These results are identical with those given by Thygeson and Grossmann (1970). This example shows the ease with which results can be obtained to a physical problem because of insight one has to the process of drying, and also illustrates the ease of handling the problem when the number of constraints equals the number of variables. Example 2. Minimization of Capital Investment

We consider next the problem of minimizing the capital investment for batch processes as considered by Loonkar and Robinson (1970) and Hellinckx and Rijckaert (1971). The problem is to choose xl, xz,and z3to minimize

c = 592V0 + 582VO 65

25O(v/%z)O40

'*+ 3 7 o ( v / 2 1 ) ~ f

f 1200v0

22

+ 2 1 0 ( V / ~ z ) ~+ 25O(v/~g)O + 62

40

200(v/~a)0~85 (20) subject to the simple constraint

Example 1. Optimization of a Continuous Through-Circulation Dryer

Consider the continuous through-circulation dryer which was modeled and optimized by Thygeson and Grossmann (1970). The effective diameter of the particles is fixed and it is necessary to estimate the fluid velocity 2 1 and the bed depth xz which will maximize the production rate, P. X little con-

V

=

50(10

+ ZI + xz +

23)

(21)

To solve this problem eq 21 can be used directly as the definition of v and the stationary conditions are

bC acdv ax, d V d x ,

+ -ac =o dx,

i=l,2,3

Ind. Eng. Chem. Process Des. Develop., Vol. 12, No. 3, 1973

(22) 381

Table 1. Solutions Satisfying the Convergence Criteria from 100 Initial Conditions Taken at Random in the Range 0.001 6 x d 6 1.000 Initial values

Final values

x1

x2

X.3

x1

0,081 0.038 0.073 0.002 0,005 0,008 0.054 0,081 0.023 0.025 0.034 0.050 0.070

0.457 0.744 0.541 0.010 0.989 0,982 0.646 0.464 0.884 0.830 0.814 0.695 0.520

0.846 0,809 0.251 0.947 0.129 0.171 0.530 0.783 0.836 0.207 0.340 0.159 0.412

0.11114 0.11114 0.11114 0.11114 0.11114 0.11114 0.11114 0.11114 0.11114 0.11114 0.11114 0.11114 0.11114

X2

1.46175 1.46175 1.46175 1.46175 1.46174 1.46174 1.46175 1.46174 1.46175 1.46174 1.46175 1.46174 1.46175

Now that thc above procedure is used to solve eq 22, the differentiation can be carried out very easily to give eq 2 2 . Then the (3 x 3) dimensional matrix to be inverted is

X.3

C

Number of iterations

3.42477 3.42476 3.42476 3.42476 3.42476 3.42476 3.42476 3.42476 3.42476 3.42476 3.42476 3.42476 3.42476

126302.9 126303.0 126302.9 126302.9 126303.0 126303.0 126302.9 126302.9 126303.0 126303.0 126302.9 126302.9 126302.9

166 152 163 226 151 149 157 165 146 152 151 158 163

The augmented performance index is

so that eq 5 becomes

and can be obtained quite easily, and there is no difficulty in inverting the matrix. To carry out the numerical solution, we chose the range of xt to be from 0.001 to 1.000 and assigned 100 sets of random numbers in this range. If any of the xibecame negative during the calculational procedure, divergence was indicated and the next set was read in. The maximum number of iterations that was allowed was taken as 250 and the convergence criterion e was taken as 1.0. Out of the 100 sets of xi,i = 1, 2 , 3, it was discovered that 13 of them converged to the same value of C. The results are given in Table I. The total computation time on IBM 370/ 165 was 9 sec. When the range of the random numbers was reduced to lie in the range 0.001-0.200, convergence was obtained 24 times (to the same values as in Table I) in a computation time of 19 sec. This computation time is slightly more than twice that of the previous case. On the average, about 190 iterations mere required to reach the optimum as compared to about 157 iterations for the previous case. Example 3. Geometrical Problem

Consider the geometrical problem of determining the maximum distance from the origin to the intersection of the plane XI

+ 2x2 + 3x3 = 1

(24)

and the ellipsoid 222

212+

-

2

+4

232 -

=

4

TOavoid unnecessary complexity, consider the square of the distance. The problem is thus to find the maximum value of P as defined by P

=

+ + 222

212

(26)

23'

subject to

382

+1(x1,

22, 23)

= x1

$2(21,

2 2 , 23)

=

+ 2x2 + 3x3 - 1 = 0 + 2 +-4 0 4

212

x22 -

232

=

Ind. Eng. Chem. Process Des. Develop., Vol. 12, No. 3, 1973

(27)

Yf]

r

and eq 6 may be written as

z =

+ + + + + +

2Yl y4 2YIY5 2Y2 2Y4 y2ys 2y3 f 3y4 f 0.5y3ys y1 2Y2 3Y3 - 1 yi2 o.5yz2 f 0.25~3' - 4

+

The iteration scheme given by eq 9 is

where bzT -=[o

dy

2 -k 2Ys 0 O 20 + Y s

0 0

1 2Yl

3

2 Y2

0 0 0 0

which is a symmetric matrix, as expected. The stopping condition for iterations %as taken as e = lo+, or when 100 iterations had been used. The results with 100 initial conditions for yz, i = 1 , 2 , , . . , 5 chosen from random numbers in the range from -5.00 to 4.99, are given in Table 11. It is noted that the true maximum is obtained 11 times, and the minimum 31 times. Also, 57 times we obtained either a local maximum or a local minimum, and only once divergence resulted. The existence of four stationary conditions is easy to visualize when the intersection of the plane with the ellipsoid is pictured as a n elliptical hoop in space. Example 4. Test cases of Miele, et a/.(1972)

The seven examples chosen by lliele, et al. (1972), to test their modified quasilinearization algorithm were written into a single program and run with 100 different initial conditions chosen a t random in the range from -5.00 to 4.99. For these seven examples the number of times the proper mmi-

Table II. Solutions Obtained to Example 3 P

x1

x2

x3

A1

A2

Frequency

4.049 4.430 9.051 9.995 Divergence

1.993 -1.933 0.157 -0.183

-0.201 0.604 2.622 -2.430

-0.197 0.575 -1.467 2.014

0.098 -0.280 0.412 -0.526

-1.025 -1.073 -2.314 -2.432

31

mum was obtained wa'j 100, 48, 9, 34, 82, 65, and 54 times, respectively. Total computation time for the 700 runs was 1 min. The problems used in this example are given in the Appendix. Conclusions

Considerable computational experience, some of which we have conveyed in thio paper, indicates that the standard met.hod of solving nonlinear programming problems where equality constraints are encountered can be made effective if the initial conditions are chosen a t random in some range. We have not found much difference whether random numbers be assigned to all variables or only to the real variables and zeros given initially to Lagrange multipliers. hbove all, we find that the difficulty of solving a set of nonlinear algebraic equations has been greatly overestimated by many. Due to the ease of carrying out the calculations, we recommend the use of numerous starting conditions which are chosen a t random. Then the answer can be obtained by selecting the proper numerical solution.

subject to the constraints

+ sin xz +

21224

The minimum of the function isf = 0.24149 at 21 = 1.16616, 2 2 = 1.18209, 2 3 = 1.38026, 5 4 = 1.50603, and 2 5 = 0.61093. Problem 4. Minimize

f=

(XI

- 112

Sample Problems Used in Example 4. T h e seven sample problems from the paper of hliele, et al. (1972)) and their solutions are as follows. Problem 1. Minimiice (21

- 22)2

+

(52

-t 2 3 - 212

+

(24

-

112

+

(25

23

+

2 4

- 2x5

xz - 2 5

=

=

- 112

+

(21

0

+

222)

(22

- 2314

= (21

-

112

+

(21

- x4)4

+

(24

-4

4

XI 22

+ + -2 - 342 = 0 - + +2 -2 4 = 0 22'

23'

24

232

x1x5

-2

0

=

The minimum of the function isf = 0.07877 at 2, = 1.19112, 1.36259, x3 = 1.47281, x4 = 1.63498, and x5 = 1.67908. Problem 5. Minimize

x2 =

1)2

+

- 212)2

(52

subject to the constraint x1

+ +1= 0 232

The minimum of the function is f = 0.04 at 21 2 2 = 1.00000, and 2 3 = 0.00000. Problem 6. Minimize f = -

=

- 1.00000,

$1

subject to the constraints x2 - x13 -

x82 =

- 22 - 2 4 2

=

0 0 =

- 22

subject to the constraint (1

+

212y

+

222

-4 =0

The minimum of the function isf = and x2 = 1,73205.

- 1.73205 a t 21 = 0.00000

literature Cited

+

x34

-4-3 4

Hillinckx, L. J., Rijckaert, M. J., Ind. Eng. Chem., Proc. Des. De=

velop., 10, 422 (1971).

0

Loonkar, Y. R., Robinson, J. D.. Ind. Ena. Chem.. Proc. Des. De-

The minimum of the function isf = 0.032567 a t X I = 1.10486, xz = 1.19667, and x3 = 1.53526. Problem 3. Minimiz,e

s

+

subject to the constraints

f = log (1 + 212)

subject to the constraint

+

- 2312 (23

0

- x2)2

(2,

The minimum of the function is f = -1.00000 at x1 1.00000, 2 2 = 1.00000, 2 3 = 0.00000, and 2 4 = 0.00000. Problem 7. Minimize

The minimum of the function is f = 4.09298 a t x1 = -0.76744, 2 2 = 0.25581, 2 3 = 0.62791, 2 4 = -0.11628, and 2 5 = 0.25581. Problem 2. Minimize

21(1

+

- 22)2

212

+ 322 = 0

Xi

(21

(21

- 1)2

subject to the constraints

f=

+

f = O.Ol(x1 -

Appendix

f=

- x5) - 2 4 = o -8-42 =0

(24

~3%4'

Acknowledgment

This work was performed with the assistance of a grant from the National Research Council of Canada, No. A-3515. Computations were carried out with the facilities of the University of Toronto Computer Centre.

30 27 11 1

- x2)2

+

(23

- 11% + (2, - 114

velop.,'Q, 625'(1970).

'

Miele, A., Moseley, P. E., Levy, A. V., Coggins, G. M., J . Optim. Theory Appl., 10, 1 (1972).

Thygeson, J. R., Jr., Grossmann, E. D., AIChE J., 16, 749 (1970).

+ (z5-

116

RECEIVED for review January 2, 1973 ACCEPTEDMarch 29, 1973 Ind. Eng. Chem. Process Des. Develop., Vol. 12, No. 3, 1 9 7 3

383