
### Transcript of "Optimization" (2009-10-06)

• Optimization

Nuno Vasconcelos, ECE Department, UCSD

• Optimization many engineering problems boil down to optimization. goal: find the maximum or minimum of a function. Definition: given functions f, gi, i=1,...,k and hi, i=1,...,m defined on some domain Ω ⊆ Rn

  min f(w) subject to gi(w) ≤ 0, ∀i
  w∈Ω                 hi(w) = 0, ∀i

f(w): cost; hi (equality), gi (inequality): constraints

for compactness we write g(w) ≤ 0 instead of gi(w) ≤ 0, ∀i. Similarly h(w) = 0

note that g(w) ≥ 0 ⇔ −g(w) ≤ 0 (so there is no need for a separate ≥ 0 form)
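This general formulation can be written out directly in code. A minimal sketch in Python, where the particular cost f and the constraint functions g and h are hypothetical choices made only for illustration:

```python
import numpy as np

# Hypothetical instance of the general problem: minimize f(w)
# subject to g(w) <= 0 (inequality) and h(w) = 0 (equality).
def f(w):                      # cost
    return (w[0] - 1.0) ** 2 + (w[1] - 2.0) ** 2

def g(w):                      # inequality constraints, stacked: g(w) <= 0
    return np.array([w[0] + w[1] - 5.0])

def h(w):                      # equality constraints, stacked: h(w) = 0
    return np.array([w[0] - w[1]])

def is_feasible(w, tol=1e-9):
    """A point is feasible when all constraints hold."""
    return bool(np.all(g(w) <= tol) and np.all(np.abs(h(w)) <= tol))

print(is_feasible(np.array([1.0, 1.0])))   # True: satisfies both constraints
print(is_feasible(np.array([4.0, 4.0])))   # False: violates g(w) <= 0
```

Stacking the gi and hi into vector-valued g and h mirrors the compact notation g(w) ≤ 0, h(w) = 0 used above.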

• Optimization note: maximizing f(x) is the same as minimizing −f(x), so this definition also works for maximization

the feasible region is the region where f(.) is defined and all constraints hold

  ℜ = { w ∈ Ω | g(w) ≤ 0, h(w) = 0 }

w* is a global minimum of f(w) if

  f(w) ≥ f(w*), ∀w ∈ Ω

w* is a local minimum of f(w) if

  ∃ε > 0 s.t. ||w − w*|| ≤ ε ⇒ f(w) ≥ f(w*)

• The gradient the gradient of a function f(w) at z is

  ∇f(z) = ( ∂f/∂w0(z), ⋯ , ∂f/∂wn-1(z) )ᵀ

Theorem: the gradient ∇f points in the direction of maximum growth

proof:
• from the Taylor series expansion

  f(w + αd) = f(w) + α ∇f(w)ᵀd + O(α²)

• the derivative along d is

  lim(α→0) [ f(w + αd) − f(w) ] / α = ∇f(w)ᵀd = ||∇f(w)||.||d||.cos(d, ∇f(w))   (*)

• (*) is maximum when d is in the direction of the gradient
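The theorem can be spot-checked numerically: sweep unit directions d, estimate the derivative along each, and confirm the best d aligns with the gradient. The function below is a hypothetical smooth f chosen only for illustration:

```python
import numpy as np

# Numerical check: among unit directions d, the directional derivative
# (f(w + alpha*d) - f(w)) / alpha is largest when d points along grad f.
def f(w):
    return w[0] ** 2 + 3.0 * w[1] ** 2

def grad_f(w):
    return np.array([2.0 * w[0], 6.0 * w[1]])

w = np.array([1.0, 1.0])
alpha = 1e-6

best_d, best_slope = None, -np.inf
for theta in np.linspace(0.0, 2.0 * np.pi, 3600, endpoint=False):
    d = np.array([np.cos(theta), np.sin(theta)])      # unit direction
    slope = (f(w + alpha * d) - f(w)) / alpha         # derivative along d
    if slope > best_slope:
        best_d, best_slope = d, slope

g_dir = grad_f(w) / np.linalg.norm(grad_f(w))         # gradient direction
print(np.allclose(best_d, g_dir, atol=1e-2))          # True
```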

• The gradient note that if ∇f = 0
• there is no direction of growth
• also −∇f = 0, so there is no direction of decrease
• we are either at a local minimum, a local maximum, or a "saddle" point

conversely, at a local min, local max, or saddle point
• there is no direction of growth or decrease
• ∇f = 0

[figure: surfaces with a maximum, a minimum, and a saddle point]

this shows that we have a critical point if and only if ∇f = 0; to determine which type it is, we need second-order conditions
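The second-order classification amounts to inspecting the Hessian's eigenvalues at a critical point. A minimal sketch, using illustrative quadratics whose critical point is the origin:

```python
import numpy as np

# Classify a critical point (where the gradient vanishes) from the
# eigenvalues of the Hessian: all positive -> minimum, all negative
# -> maximum, mixed signs -> saddle.
def classify(hessian):
    eig = np.linalg.eigvalsh(hessian)   # eigenvalues of a symmetric matrix
    if np.all(eig > 0):
        return "minimum"
    if np.all(eig < 0):
        return "maximum"
    return "saddle"

# Hessians at the origin of f = x^2 + y^2, f = -(x^2 + y^2), f = x^2 - y^2
print(classify(np.array([[2.0, 0.0], [0.0, 2.0]])))    # minimum
print(classify(np.array([[-2.0, 0.0], [0.0, -2.0]])))  # maximum
print(classify(np.array([[2.0, 0.0], [0.0, -2.0]])))   # saddle
```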

• The Hessian if ∇f = 0, by Taylor series

  f(w + αd) = f(w) + α ∇f(w)ᵀd + (α²/2) dᵀ∇²f(w)d + O(α³)
            = f(w) + (α²/2) dᵀ∇²f(w)d + O(α³)        (since ∇f(w) = 0)

and

  f(w + αd) − f(w) = (α²/2) dᵀ∇²f(w)d + O(α³)

picking α small enough that the O(α³) term is negligible, the sign of dᵀ∇²f(w)d determines whether f increases or decreases along d
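The dominance of the quadratic term at a critical point is easy to verify. For the illustrative quadratic below the Taylor expansion is exact, so the two sides match to machine precision:

```python
import numpy as np

# At a critical point (grad f = 0) the change f(w + alpha*d) - f(w) is
# governed by the quadratic term (alpha^2/2) d^T H d; checked on a quadratic.
def f(w):
    return w[0] ** 2 + 2.0 * w[1] ** 2     # critical point at the origin

H = np.array([[2.0, 0.0], [0.0, 4.0]])     # Hessian of f (constant)
w = np.zeros(2)                            # grad f(w) = 0 here
d = np.array([1.0, 1.0])
alpha = 1e-3

lhs = f(w + alpha * d) - f(w)
rhs = 0.5 * alpha ** 2 * d @ H @ d
print(np.isclose(lhs, rhs))                # True: no higher-order terms here
```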

• Minima conditions (unconstrained) let f(w) be twice continuously differentiable. w* is a local minimum of f(w) if
• f has zero gradient at w*

  ∇f(w*) = 0

• and the Hessian of f at w* is positive definite

  dᵀ∇²f(w*)d > 0, ∀d ∈ Rn, d ≠ 0

(these conditions are sufficient; at any local minimum the gradient is zero and the Hessian is at least positive semi-definite, dᵀ∇²f(w*)d ≥ 0)
• where

  ∇²f(x) = ⎡ ∂²f/∂x0²(x)      ⋯  ∂²f/∂x0∂xn-1(x) ⎤
           ⎢       ⋮                   ⋮          ⎥
           ⎣ ∂²f/∂xn-1∂x0(x)  ⋯  ∂²f/∂xn-1²(x)   ⎦
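Both minimum conditions can be checked numerically at a candidate point, estimating the gradient and Hessian by central finite differences. The function below is an illustrative choice with its minimum at (1, −2):

```python
import numpy as np

# Check the two minimum conditions at a candidate w*: zero gradient and a
# positive definite Hessian, both estimated by finite differences.
def f(w):
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2

def num_grad(fun, w, eps=1e-6):
    g = np.zeros_like(w)
    for i in range(w.size):
        e = np.zeros_like(w); e[i] = eps
        g[i] = (fun(w + e) - fun(w - e)) / (2.0 * eps)
    return g

def num_hess(fun, w, eps=1e-4):
    n = w.size
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros_like(w); e[i] = eps
        H[:, i] = (num_grad(fun, w + e) - num_grad(fun, w - e)) / (2.0 * eps)
    return H

w_star = np.array([1.0, -2.0])
grad_ok = np.allclose(num_grad(f, w_star), 0.0, atol=1e-5)
hess_pd = bool(np.all(np.linalg.eigvalsh(num_hess(f, w_star)) > 0))
print(grad_ok and hess_pd)                 # True: w* is a local minimum
```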

• Maxima conditions (unconstrained) let f(w) be twice continuously differentiable. w* is a local maximum of f(w) if
• f has zero gradient at w*

  ∇f(w*) = 0

• and the Hessian of f at w* is negative definite

  dᵀ∇²f(w*)d < 0, ∀d ∈ Rn, d ≠ 0

(again these conditions are sufficient; at any local maximum the gradient is zero and the Hessian is at least negative semi-definite, dᵀ∇²f(w*)d ≤ 0)
• where, as before,

  ∇²f(x) = ⎡ ∂²f/∂x0²(x)      ⋯  ∂²f/∂x0∂xn-1(x) ⎤
           ⎢       ⋮                   ⋮          ⎥
           ⎣ ∂²f/∂xn-1∂x0(x)  ⋯  ∂²f/∂xn-1²(x)   ⎦

• Example consider the functions

  f(x) = x1 + x2        g(x) = x1² + x2²

the gradients are

  ∇f(x) = ⎡1⎤        ∇g(x) = ⎡2x1⎤
          ⎣1⎦                ⎣2x2⎦

f has no minima or maxima; g has a critical point at the origin x = (0,0). Since the Hessian

  ∇²g(x) = ⎡2 0⎤
           ⎣0 2⎦

is positive definite, this is a minimum
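The example's gradients can be reproduced with finite differences, confirming the analytic expressions and the critical point of g at the origin:

```python
import numpy as np

# Finite-difference gradients of f(x) = x1 + x2 and g(x) = x1^2 + x2^2
# match the analytic gradients from the example.
def f(x): return x[0] + x[1]
def g(x): return x[0] ** 2 + x[1] ** 2

def num_grad(fun, x, eps=1e-6):
    out = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = eps
        out[i] = (fun(x + e) - fun(x - e)) / (2.0 * eps)
    return out

x = np.array([0.5, -1.5])
print(np.allclose(num_grad(f, x), [1.0, 1.0]))        # grad f = (1, 1)^T
print(np.allclose(num_grad(g, x), 2.0 * x))           # grad g = (2x1, 2x2)^T
print(np.allclose(num_grad(g, np.zeros(2)), 0.0))     # critical point at 0
```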

• Example makes sense because

  f(x) = x1 + x2

is a plane, so the gradient

  ∇f(x) = ⎡1⎤
          ⎣1⎦

is constant

[figure: iso-contours of f(x): the lines x1 + x2 = −1, x1 + x2 = 0, and x1 + x2 = 1]

• Example makes sense because

  g(x) = x1² + x2²

is a quadratic, positive everywhere but the origin. Note how the gradient

  ∇g(x) = ⎡2x1⎤
          ⎣2x2⎦

points toward the direction of largest increase

[figure: iso-contours g(x) = 1 and g(x) = 2, with gradient vectors such as ∇g = (2, 0)ᵀ, (0, 2)ᵀ, and (2, 2)ᵀ drawn at the corresponding points]

• Convex functions Definition: f(w) is convex if ∀w,u ∈ Ω and λ ∈ [0,1]

  f(λw + (1−λ)u) ≤ λf(w) + (1−λ)f(u)

Theorem: f(w) is convex if and only if its Hessian is positive semi-definite for all w

  dᵀ∇²f(w)d ≥ 0, ∀w ∈ Ω, ∀d

proof:
• requires some intermediate results that we will not cover
• we will skip it

[figure: a convex f(w); the chord λf(w) + (1−λ)f(u) lies above f(λw + (1−λ)u)]
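The definition is easy to spot-check: draw random w, u, and λ and verify the chord inequality. The squared-norm function below is an illustrative convex choice:

```python
import numpy as np

# Spot-check the convexity definition
#   f(lam*w + (1-lam)*u) <= lam*f(w) + (1-lam)*f(u)
# on random points for an illustrative convex function.
def f(w):
    return float(np.sum(w ** 2))           # convex: Hessian 2I is PSD

rng = np.random.default_rng(0)
ok = True
for _ in range(1000):
    w, u = rng.normal(size=2), rng.normal(size=2)
    lam = rng.uniform()
    chord = lam * f(w) + (1.0 - lam) * f(u)
    ok = ok and f(lam * w + (1.0 - lam) * u) <= chord + 1e-12
print(ok)                                  # True
```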

• Concave functions Definition: f(w) is concave if ∀w,u ∈ Ω and λ ∈ [0,1]

  f(λw + (1−λ)u) ≥ λf(w) + (1−λ)f(u)

Theorem: f(w) is concave if and only if its Hessian is negative semi-definite for all w

  dᵀ∇²f(w)d ≤ 0, ∀w ∈ Ω, ∀d

proof:
• −f(w) is convex
• by the previous theorem, the Hessian of −f(w) is positive semi-definite
• so the Hessian of f(w) is negative semi-definite

• Convex functions Theorem: if f(w) is convex, any local minimum w* is also a global minimum
Proof:
• we need to show that, for any u, f(w*) ≤ f(u)
• for any u: ||w* − [λw* + (1−λ)u]|| = (1−λ)||w* − u||
• and, making λ arbitrarily close to 1, we can make ||w* − [λw* + (1−λ)u]|| ≤ ε, for any ε > 0
• since w* is a local minimum, it follows that f(w*) ≤ f(λw* + (1−λ)u) and, by convexity, that f(w*) ≤ λf(w*) + (1−λ)f(u)
• or f(w*)(1−λ) ≤ f(u)(1−λ)
• and f(w*) ≤ f(u)
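A practical consequence of this theorem: on a convex function, plain gradient descent reaches the same minimizer from any starting point. A minimal sketch on an illustrative convex quadratic with minimum at (3, −1):

```python
import numpy as np

# On a convex function every local minimum is global, so gradient descent
# converges to the same point from different starts.
def grad(w):
    # gradient of f(w) = (w1 - 3)^2 + (w2 + 1)^2, convex with min at (3, -1)
    return np.array([2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)])

def descend(w, step=0.1, iters=500):
    for _ in range(iters):
        w = w - step * grad(w)
    return w

a = descend(np.array([10.0, 10.0]))
b = descend(np.array([-8.0, 5.0]))
print(np.allclose(a, b) and np.allclose(a, [3.0, -1.0]))   # True
```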

• Constrained optimization in summary:
• we know the conditions for an unconstrained max and min
• we like convex functions (find a local minimum, and it will be the global minimum)

what about optimization with constraints? a few definitions to start with. the inequality gi(w) ≤ 0:
• is active if gi(w) = 0, otherwise inactive

inequalities can be expressed as equalities by the introduction of slack variables:

  gi(w) ≤ 0 ⇔ gi(w) + ξi = 0, and ξi ≥ 0
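The slack-variable rewrite can be demonstrated directly: whenever gi(w) ≤ 0 holds, ξi = −gi(w) is a valid nonnegative slack. The constraint functions below are illustrative:

```python
import numpy as np

# g_i(w) <= 0 holds exactly when there is a slack xi_i >= 0 with
# g_i(w) + xi_i = 0, namely xi_i = -g_i(w).
def g(w):
    return np.array([w[0] + w[1] - 2.0, -w[0]])   # two inequality constraints

w = np.array([0.5, 1.0])
xi = -g(w)                                        # candidate slacks

print(bool(np.all(g(w) <= 0)))                            # True: w feasible
print(bool(np.all(xi >= 0)) and np.allclose(g(w) + xi, 0.0))  # True: slack form
```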

• Convex optimization Definition: a set Ω is convex if ∀w,u ∈ Ω and λ ∈ [0,1], then λw + (1−λ)u ∈ Ω
"a line between any two points in Ω is also in Ω"

[figure: a convex set vs. a non-convex set]

Definition: an optimization problem where the set Ω, the cost f, and all constraints g and h are convex is said to be convex
note: linear (affine) constraints g(x) = Ax + b are always convex (zero Hessian)

• Constrained optimization we will consider general (not only convex) constrained optimization problems, and start with the case of equality constraints only
Theorem: consider the problem

  x* = argmin f(x) subject to h(x) = 0

where the constraint gradients ∇hi(x*) are linearly independent. Then x* is a solution if and only if there exists a unique vector λ such that

  i)  ∇f(x*) + Σi=1..m λi ∇hi(x*) = 0

  ii) yᵀ [ ∇²f(x*) + Σi=1..m λi ∇²hi(x*) ] y ≥ 0, ∀y s.t. ∇h(x*)ᵀy = 0
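Condition i) can be worked through on a concrete instance. As an assumed example (not from the slides), take f(x) = x1 + x2 on the unit circle h(x) = x1² + x2² − 1 = 0: then ∇f + λ∇h = 0 gives (1, 1) + λ(2x1, 2x2) = 0, so x1 = x2 = −1/(2λ), and the constraint forces λ = 1/√2 (the positive root picks the minimum):

```python
import numpy as np

# Lagrange condition i) on an assumed example: minimize f(x) = x1 + x2
# subject to h(x) = x1^2 + x2^2 - 1 = 0.
lam = 1.0 / np.sqrt(2.0)
x_star = np.array([-1.0, -1.0]) / (2.0 * lam)     # = (-1/sqrt(2), -1/sqrt(2))

grad_f = np.array([1.0, 1.0])
grad_h = 2.0 * x_star                             # gradient of the constraint

print(np.allclose(grad_f + lam * grad_h, 0.0))    # condition i) holds
print(np.isclose(x_star @ x_star, 1.0))           # constraint h(x*) = 0 holds
print(np.isclose(x_star.sum(), -np.sqrt(2.0)))    # minimum value f(x*) = -sqrt(2)
```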