Lecture 5: Inequality Constrained Optimization
4.2 The Effect of Nonnegativity Restrictions
• First consider a problem with nonnegativity restrictions on the choice
variable x1 but with no other constraints.
max π = f(x1) s.t. x1 ≥ 0
where f is a differentiable function.
• In view of the restriction x1 ≥ 0, three possible solutions may arise.
[Figures 1–3: graphs of f(x1) against x1 for x1 ≥ 0. Figure 1 shows an interior maximum at a point A with f′(x1) = 0 and x1 > 0, together with a point K where f′(x1) > 0. Figure 2 shows a boundary maximum at a point B with x1 = 0 and f′(x1) = 0. Figure 3 shows boundary maxima at points C and D with x1 = 0 and f′(x1) < 0.]
• Interior solution: dπ/dx1 = f ′(x1) = 0 and x1 > 0 (point A in Figure
1).
• Boundary solution: x1 = 0 and f ′(x1) = 0 (point B in Figure 2)
• Boundary solution: x1 = 0 and f ′(x1) < 0 (a point like C or D in Figure
3).
• Therefore, to qualify as a local maximum, the candidate point merely has
to be higher than the neighboring points within the feasible set.
• Consequently, for a value of x1 to give a local maximum of π, it must
satisfy the following three conditions:
(1) f′(x1) ≤ 0;
(2) x1 ≥ 0;
(3) x1 · f′(x1) = 0.
• Note that the conditions automatically exclude a point like K in Figure
1, which is not a maximum because f′(x1) > 0 there.
• Condition (3) is referred to as the complementary slackness condition,
i.e. at least one of the two quantities x1 and f′(x1) must be zero, so
that their product is zero.
• Taken together, conditions (1) to (3) constitute the first-order necessary
conditions for a maximum.
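Conditions (1) to (3) can be checked mechanically at any candidate point. A minimal sketch in Python; the function f, its derivative, and the test points are my own illustration, not from the lecture:

```python
def is_candidate_max(x1, f_prime, tol=1e-9):
    """Check the three first-order conditions for max f(x1) s.t. x1 >= 0:
    (1) f'(x1) <= 0, (2) x1 >= 0, (3) x1 * f'(x1) = 0."""
    fp = f_prime(x1)
    return fp <= tol and x1 >= -tol and abs(x1 * fp) <= tol

# Illustrative example: f(x1) = -(x1 + 1)**2, so f'(x1) = -2*(x1 + 1).
# The unconstrained max is at x1 = -1, so the constrained solution is the
# boundary point x1 = 0, where f'(0) = -2 < 0 (as at points C/D in Figure 3).
f_prime = lambda x1: -2.0 * (x1 + 1.0)

print(is_candidate_max(0.0, f_prime))   # True: boundary solution
print(is_candidate_max(1.0, f_prime))   # False: x1 * f'(x1) = -4 != 0
```

The helper simply encodes the three conditions; in practice one would also verify that the point beats its feasible neighbors.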
4.3 The Effect of Inequality Constraints
• Now consider a problem that includes inequality constraints as well.
• Suppose we have a maximization problem with three choice variables
x1, x2, x3 and two constraints:
max π = f(x1, x2, x3)
s.t. g1(x1, x2, x3) ≤ r1
g2(x1, x2, x3) ≤ r2
and x1, x2, x3 ≥ 0
• We can convert the inequality constraints into equalities by defining slack
variables s1, s2:
s1 = r1 − g1(x1, x2, x3)
s2 = r2 − g2(x1, x2, x3)
• Therefore the original problem can be transformed into the equivalent
form:
max π = f(x1, x2, x3)
s.t. g1(x1, x2, x3) + s1 = r1
g2(x1, x2, x3) + s2 = r2
and x1, x2, x3, s1, s2 ≥ 0
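The slack-variable transformation is easy to verify numerically. A small sketch, assuming hypothetical constraint functions g1, g2 and a made-up test point:

```python
def slacks(x, g_list, r_list):
    """Return s_i = r_i - g_i(x); the point is feasible iff every slack is >= 0."""
    return [r - g(x) for g, r in zip(g_list, r_list)]

# Hypothetical constraints: g1(x) = x1 + x2 + x3 <= 6, g2(x) = x1*x2 <= 4.
g1 = lambda x: x[0] + x[1] + x[2]
g2 = lambda x: x[0] * x[1]

s = slacks([1.0, 2.0, 3.0], [g1, g2], [6.0, 4.0])
print(s)                         # [0.0, 2.0]: g1 binds (s1 = 0), g2 is slack
print(all(si >= 0 for si in s))  # True: the point is feasible
```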
• If the nonnegativity restrictions are absent we can form the Lagrange
function Z:
Z = f(x1, x2, x3)+λ1[r1−g1(x1, x2, x3)−s1]+λ2[r2−g2(x1, x2, x3)−s2]
and write the first-order conditions as:
∂Z/∂x1 = ∂Z/∂x2 = ∂Z/∂x3 = ∂Z/∂s1 = ∂Z/∂s2 = ∂Z/∂λ1 = ∂Z/∂λ2 = 0
• But since the x and s variables must be nonnegative, the F.O.C.s for
these variables need to be modified:
∂Z/∂xj ≤ 0; xj ≥ 0; xj · ∂Z/∂xj = 0
∂Z/∂si ≤ 0; si ≥ 0; si · ∂Z/∂si = 0
∂Z/∂λi = 0
where i = 1, 2 and j = 1, 2, 3.
• Inasmuch as ∂Z/∂si = −λi, we can eliminate the slack variables from
the F.O.C.s to obtain:
si ≥ 0; λi ≥ 0; siλi = 0
• Since si = ri − gi(x1, x2, x3) we get:
ri − gi(x1, x2, x3) ≥ 0; λi ≥ 0; λi[ri − gi(x1, x2, x3)] = 0
• Therefore we can express the F.O.C.s in an equivalent form without the
slack variables:
∂Z/∂xj = fj − (λ1 · ∂g1/∂xj + λ2 · ∂g2/∂xj) ≤ 0; xj ≥ 0; xj · ∂Z/∂xj = 0 (1)
ri − gi(x1, x2, x3) ≥ 0; λi ≥ 0; λi[ri − gi(x1, x2, x3)] = 0
where fj denotes ∂f/∂xj.
• A strictly positive Lagrange multiplier implies that its constraint binds
(holds with equality); conversely, a non-binding constraint implies that
its Lagrange multiplier equals zero.
• This is one version of the Kuhn-Tucker conditions for this problem.
4.4 Kuhn-Tucker Conditions
• We can obtain the same conditions, set out in equation (1), more
directly by setting up the Lagrange function without using slack variables.
Definition 1 (Kuhn-Tucker conditions) To derive the Kuhn-Tucker con-
ditions for solution of the problem
max π = f(x1, x2, x3)
s.t. r1 − g1(x1, x2, x3) ≥ 0
r2 − g2(x1, x2, x3) ≥ 0
and x1, x2, x3 ≥ 0
where all functions are concave and differentiable, we first form the
Lagrange function
L(x1, x2, x3, λ1, λ2) = f(x1, x2, x3) + λ1[r1 − g1(x1, x2, x3)]
+λ2[r2 − g2(x1, x2, x3)]
and then maximize with respect to the variables x1, x2 and x3 subject
to the nonnegativity restrictions x1, x2, x3 ≥ 0:
∂L/∂xj = fj − (λ1 · ∂g1/∂xj + λ2 · ∂g2/∂xj) ≤ 0; xj ≥ 0; xj · ∂L/∂xj = 0
and minimize with respect to the variables λ1 and λ2 subject to the
nonnegativity restrictions λ1, λ2 ≥ 0:
∂L/∂λi = ri − gi(x1, x2, x3) ≥ 0; λi ≥ 0; λi · ∂L/∂λi = 0
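The conditions in Definition 1 can be verified mechanically at any candidate point once the partial derivatives are evaluated. A sketch, assuming we pass in already-evaluated derivatives; the function name and the toy problem below are my own, not from the lecture:

```python
def kkt_satisfied(x, lam, dL_dx, dL_dlam, tol=1e-9):
    """Check the Kuhn-Tucker conditions:
    for each x_j:   dL/dx_j <= 0,   x_j >= 0,   x_j * dL/dx_j = 0
    for each lam_i: dL/dlam_i >= 0, lam_i >= 0, lam_i * dL/dlam_i = 0
    dL_dx and dL_dlam are lists of already-evaluated partial derivatives."""
    for xj, d in zip(x, dL_dx):
        if d > tol or xj < -tol or abs(xj * d) > tol:
            return False
    for li, d in zip(lam, dL_dlam):
        if d < -tol or li < -tol or abs(li * d) > tol:
            return False
    return True

# Toy check: max x1 s.t. x1 <= 1, x1 >= 0, with L = x1 + lam*(1 - x1).
# At the solution x1 = 1, lam = 1: dL/dx1 = 1 - lam = 0, dL/dlam = 1 - x1 = 0.
print(kkt_satisfied([1.0], [1.0], [0.0], [0.0]))   # True
print(kkt_satisfied([0.0], [0.0], [1.0], [1.0]))   # False: dL/dx1 = 1 > 0
```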
• Notes:
1. To derive the Kuhn-Tucker conditions, the constraints are always expressed
as greater than or equal to zero. Unlike classical programming,
the order of subtraction matters in concave programming.
e.g. for less-than-or-equal-to constraints in maximization problems,
max f(x, y) s.t. g(x, y) ≤ B
subtract the constraint function from the constant of the constraint:
max f(x, y) s.t. B − g(x, y) ≥ 0
and therefore the Lagrange function is written as:
L = f(x, y) + λ[B − g(x, y)]
2. For minimization problems, convert the problem into a maximization
problem by multiplying the objective function by −1. For the corresponding
greater-than-or-equal-to constraints in minimization problems,
subtract the constant of the constraint from the constraint function.
e.g.
min f(x, y) s.t. g(x, y) ≥ B
max−f(x, y) s.t. g(x, y) − B ≥ 0
and therefore the Lagrange function is written as:
L = −f(x, y) + λ[g(x, y) − B]
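The conversion rule in Note 2 can be sketched as a small helper; the function name and the example problem are hypothetical:

```python
def to_max_form(f, g, B):
    """Convert min f(x) s.t. g(x) >= B into the equivalent concave-programming
    form max -f(x) s.t. g(x) - B >= 0; return the new objective and constraint."""
    neg_f = lambda x: -f(x)
    h = lambda x: g(x) - B          # must be >= 0 at feasible points
    return neg_f, h

# Hypothetical example: min (x - 3)**2 s.t. x >= 1  (g(x) = x, B = 1).
f = lambda x: (x - 3.0) ** 2
neg_f, h = to_max_form(f, lambda x: x, 1.0)

print(neg_f(2.0))    # -1.0: maximizing -f is the same as minimizing f
print(h(1.0))        # 0.0: the constraint binds at x = 1
print(h(3.0) >= 0)   # True: x = 3 is feasible
```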
Theorem 1 (Kuhn-Tucker Theorem) Given the problem:
max f(x1, ..., xn)
subject to:
g1(x1, ..., xn) ≥ 0, ..., gm(x1, ..., xn) ≥ 0
and
x1, ..., xn ≥ 0
if all the functions f and gj, j = 1, ..., m, are concave and differentiable,
and if Slater's condition holds, that is, if there exists a point
(x⁰1, ..., x⁰n) such that gj(x⁰1, ..., x⁰n) > 0 for all j = 1, ..., m, then
there exist m Lagrange multipliers λ∗j such that the Kuhn-Tucker conditions
are both necessary and sufficient for the point (x∗1, ..., x∗n) to be
a solution to the problem.
4.5 Example
• Consider the following linear-programming problem:
max U = 4S + D
s.t. S + D ≤ 10
S + 2D ≤ 12
S, D ≥ 0
• Since a linear function is both concave and convex, though neither strictly
concave nor strictly convex, a problem consisting solely of linear functions
is a valid concave-programming problem, and the Kuhn-Tucker conditions
are both necessary and sufficient for a maximum.
• To solve a concave-programming problem, first derive the Kuhn-Tucker
conditions and then, through trial and error, check whether a guess leads
to a solution or to a contradiction that tells us to try something else.
• Either start by assuming one of the constraints to be non-binding since
the related Lagrange multiplier will be zero by complementary slackness,
thereby eliminating a variable.
• Or start by trying a zero value for a choice variable since this simplifies
the marginal conditions.
• In this example the Lagrange function is:
L = 4S + D + λ[10 − S − D] + µ[12 − S − 2D]
• The feasible set is depicted below.
[Figure: the feasible set in the (S, D) plane, bounded by the nonnegativity constraints and the lines S + D = 10 and S + 2D = 12. Along the face where only S + D = 10 binds, λ ≥ 0 and µ = 0; at the corner point A, where both constraints bind, λ, µ > 0; along the face where only S + 2D = 12 binds, λ = 0 and µ ≥ 0. Point B lies where S + D = 10 meets the S axis.]
and the Kuhn-Tucker conditions are:
S∗ · ∂L/∂S = 0; S∗ ≥ 0; ∂L/∂S ≤ 0;
D∗ · ∂L/∂D = 0; D∗ ≥ 0; ∂L/∂D ≤ 0;
λ∗ · ∂L/∂λ = 0; λ∗ ≥ 0; ∂L/∂λ ≥ 0;
µ∗ · ∂L/∂µ = 0; µ∗ ≥ 0; ∂L/∂µ ≥ 0;
Since:
∂L/∂S = 4 − λ − µ
∂L/∂D = 1 − λ − 2µ
∂L/∂λ = 10 − S − D
∂L/∂µ = 12 − S − 2D
we obtain:
S∗[4 − λ − µ] = 0; S∗ ≥ 0; 4 − λ − µ ≤ 0;
D∗[1 − λ − 2µ] = 0; D∗ ≥ 0; 1 − λ − 2µ ≤ 0;
λ∗[10 − S − D] = 0; λ∗ ≥ 0; 10 − S − D ≥ 0;
µ∗[12 − S − 2D] = 0; µ∗ ≥ 0; 12 − S − 2D ≥ 0;
• Let’s guess that µ > 0 and λ > 0 i.e. both constraints are binding
(point A in the above diagram). This implies that:
10 = S + D
12 = S + 2D
• Solving for S and D we obtain that S∗ = 8 > 0 and D∗ = 2 > 0. This
implies that:
4 − λ − µ = 0
1 − λ − 2µ = 0
in order to satisfy the complementary slackness conditions.
• Solving for λ and µ we get that λ∗ = 7 > 0 and µ∗ = −3 < 0 which
contradicts our initial guess that µ > 0!
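The arithmetic behind this contradiction can be reproduced step by step (a small sketch of the two substitutions):

```python
# Guess 1: both constraints bind, so S + D = 10 and S + 2D = 12.
# Subtracting the first equation from the second gives D = 2.
D = 12.0 - 10.0
S = 10.0 - D
print(S, D)          # 8.0 2.0: both choice variables are strictly positive

# With S, D > 0, complementary slackness forces the marginal conditions
# to hold with equality: 4 - lam - mu = 0 and 1 - lam - 2*mu = 0.
# Substituting lam = 4 - mu into the second gives 1 - (4 - mu) - 2*mu = 0,
# i.e. -3 - mu = 0, so mu = -3.
mu = -3.0
lam = 4.0 - mu
print(lam, mu)       # 7.0 -3.0: mu < 0 contradicts the guess that mu > 0
```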
• Next guess: Let’s now guess that µ = 0 and λ > 0. This implies that
10 = S + D (2)
12 > S + 2D (3)
• Try S > 0 and D = 0 for the choice variables (point B in the above
diagram). From equation (2) this implies that S∗ = 10. It is easy to see
that equation (3) is also satisfied when S = 10 and D = 0. This also
implies that
4 − λ − µ = 0 (4)
1 − λ − 2µ < 0 (5)
• Since we guessed µ = 0 this implies from equation (4) that λ∗ = 4 > 0
and this implies that equation (5) is satisfied since −3 < 0.
Therefore we have satisfied all the Kuhn-Tucker conditions for this max-
imization problem and the solution D = 0, S = 10, λ = 4, µ = 0 is a
maximum.
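As a final check, the full set of Kuhn-Tucker conditions for this example can be verified at the candidate point in a few lines (the function name is my own):

```python
def check_solution(S, D, lam, mu, tol=1e-9):
    """Verify every Kuhn-Tucker condition of the example at a candidate point."""
    dL_dS = 4 - lam - mu
    dL_dD = 1 - lam - 2 * mu
    dL_dlam = 10 - S - D
    dL_dmu = 12 - S - 2 * D
    conds = [
        dL_dS <= tol, S >= -tol, abs(S * dL_dS) <= tol,
        dL_dD <= tol, D >= -tol, abs(D * dL_dD) <= tol,
        dL_dlam >= -tol, lam >= -tol, abs(lam * dL_dlam) <= tol,
        dL_dmu >= -tol, mu >= -tol, abs(mu * dL_dmu) <= tol,
    ]
    return all(conds)

print(check_solution(10, 0, 4, 0))  # True: all Kuhn-Tucker conditions hold
print(check_solution(8, 2, 7, -3))  # False: the first guess fails (mu < 0)
```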