Duality - Duality gap the following theorems are relevant (note: proofs are hard, not...
Date posted: 28-Jun-2020
Duality
Nuno Vasconcelos, ECE Department, UCSD
Optimization
goal: find the maximum or minimum of a function
Definition: given functions $f$, $g_i$, $i = 1, \ldots, k$, and $h_i$, $i = 1, \ldots, m$, defined on some domain $\Omega \subseteq \mathbb{R}^n$
$$\min_{w \in \Omega} f(w) \quad \text{subject to} \quad g_i(w) \le 0, \ \forall i, \quad h_i(w) = 0, \ \forall i$$
for compactness we write $g(w) \le 0$ instead of $g_i(w) \le 0, \forall i$. Similarly $h(w) = 0$
we derived necessary and sufficient conditions for (local) optimality in
• the absence of constraints
• equality constraints only
• equality and inequality
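Minima conditions (unconstrained)
let $f(w)$ be twice continuously differentiable
$w^*$ is a local minimum of $f(w)$ if
• $f$ has zero gradient at $w^*$
$$\nabla f(w^*) = 0$$
• and the Hessian of $f$ at $w^*$ is positive definite
$$d^T \nabla^2 f(w^*) \, d > 0, \quad \forall d \in \mathbb{R}^n, \ d \ne 0$$
(zero gradient together with the weaker condition $d^T \nabla^2 f(w^*) \, d \ge 0, \forall d \in \mathbb{R}^n$, is necessary for a local minimum)
• where
$$\nabla^2 f(x) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_0^2}(x) & \cdots & \dfrac{\partial^2 f}{\partial x_0 \partial x_{n-1}}(x) \\ \vdots & & \vdots \\ \dfrac{\partial^2 f}{\partial x_{n-1} \partial x_0}(x) & \cdots & \dfrac{\partial^2 f}{\partial x_{n-1}^2}(x) \end{bmatrix}$$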
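The two conditions above (zero gradient, positive definite Hessian) can be checked numerically on a toy function. The sketch below is only an illustration: the function `f`, the point `w_star`, the finite-difference step sizes, and all helper names are assumptions chosen for this example, not part of the slides.

```python
# Toy check of the unconstrained minimum conditions.
# f, w_star and the step sizes are illustrative assumptions.

def f(w):
    # Quadratic bowl with its minimum at (1, -0.5).
    return (w[0] - 1.0) ** 2 + 2.0 * (w[1] + 0.5) ** 2

def gradient(func, w, eps=1e-5):
    """Central-difference approximation of the gradient of func at w."""
    grad = []
    for i in range(len(w)):
        up, dn = list(w), list(w)
        up[i] += eps
        dn[i] -= eps
        grad.append((func(up) - func(dn)) / (2 * eps))
    return grad

def hessian(func, w, eps=1e-4):
    """Central-difference approximation of the Hessian of func at w."""
    n = len(w)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for si, sj in ((1, 1), (1, -1), (-1, 1), (-1, -1)):
                p = list(w)
                p[i] += si * eps
                p[j] += sj * eps
                total += si * sj * func(p)
            H[i][j] = total / (4 * eps ** 2)
    return H

def is_positive_definite_2x2(H):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return H[0][0] > 0 and det > 0

w_star = [1.0, -0.5]
grad_at_w_star = gradient(f, w_star)
hess_at_w_star = hessian(f, w_star)

print("gradient:", grad_at_w_star)
print("Hessian:", hess_at_w_star)
print("positive definite:", is_positive_definite_2x2(hess_at_w_star))
```

For a maximum, the same check applies to $-f$.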
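Maxima conditions (unconstrained)
let $f(w)$ be twice continuously differentiable
$w^*$ is a local maximum of $f(w)$ if
• $f$ has zero gradient at $w^*$
$$\nabla f(w^*) = 0$$
• and the Hessian of $f$ at $w^*$ is negative definite
$$d^T \nabla^2 f(w^*) \, d < 0, \quad \forall d \in \mathbb{R}^n, \ d \ne 0$$
(zero gradient together with the weaker condition $d^T \nabla^2 f(w^*) \, d \le 0, \forall d \in \mathbb{R}^n$, is necessary for a local maximum)
• where
$$\nabla^2 f(x) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_0^2}(x) & \cdots & \dfrac{\partial^2 f}{\partial x_0 \partial x_{n-1}}(x) \\ \vdots & & \vdots \\ \dfrac{\partial^2 f}{\partial x_{n-1} \partial x_0}(x) & \cdots & \dfrac{\partial^2 f}{\partial x_{n-1}^2}(x) \end{bmatrix}$$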
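Constrained optimization with equality constraints only
Theorem: consider the problem
$$x^* = \arg\min_x f(x) \quad \text{subject to} \quad h(x) = 0$$
where the constraint gradients $\nabla h_i(x^*)$ are linearly independent. If $x^*$ is a (local) solution, then there exists a unique vector $\lambda$ such that
$$\text{i)} \quad \nabla f(x^*) + \sum_{i=1}^m \lambda_i \nabla h_i(x^*) = 0$$
$$\text{ii)} \quad y^T \left[ \nabla^2 f(x^*) + \sum_{i=1}^m \lambda_i \nabla^2 h_i(x^*) \right] y \ge 0, \quad \forall y \ \text{s.t.} \ \nabla h(x^*)^T y = 0$$
conversely, a feasible $x^*$ that satisfies i), and ii) with strict inequality for all such $y \ne 0$, is a local solution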
Alternative formulation
state the conditions through the Lagrangian
$$L(x, \lambda) = f(x) + \sum_{i=1}^m \lambda_i h_i(x)$$
the theorem can be compactly written as
$$\text{i)} \quad \nabla L(x^*, \lambda^*) = \begin{bmatrix} \nabla_x L(x^*, \lambda^*) \\ \nabla_\lambda L(x^*, \lambda^*) \end{bmatrix} = 0$$
$$\text{ii)} \quad y^T \nabla^2_{xx} L(x^*, \lambda^*) \, y \ge 0, \quad \forall y \ \text{s.t.} \ \nabla h(x^*)^T y = 0$$
the entries of $\lambda$ are referred to as Lagrange multipliers
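As a small worked instance of the Lagrangian conditions, consider minimizing $f(x) = x_1^2 + x_2^2$ subject to $h(x) = x_1 + x_2 - 1 = 0$. This problem and the helper `solve_linear` are assumptions picked for illustration. Here $\nabla L = 0$ is a linear system in $(x_1, x_2, \lambda)$, so a sketch can solve it directly and then test the second-order condition along the constraint.

```python
# Stationarity of L(x, lam) = x1^2 + x2^2 + lam * (x1 + x2 - 1).
# The problem and helper names are illustrative assumptions.

def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

# grad_x L = 0:   2*x1 + lam = 0,  2*x2 + lam = 0
# grad_lam L = 0: x1 + x2 = 1   (this is just the constraint h(x) = 0)
A = [[2.0, 0.0, 1.0],
     [0.0, 2.0, 1.0],
     [1.0, 1.0, 0.0]]
b = [0.0, 0.0, 1.0]
x1, x2, lam = solve_linear(A, b)

# Second-order condition: grad h = (1, 1), so y = (1, -1) spans the
# directions with grad h^T y = 0; the Hessian of L in x is 2 * I.
y = [1.0, -1.0]
hess_L = [[2.0, 0.0], [0.0, 2.0]]
curvature = sum(y[i] * hess_L[i][j] * y[j] for i in range(2) for j in range(2))

print("x* =", (x1, x2), "lambda =", lam, "y^T H y =", curvature)
```

The multiplier comes out negative here, which is allowed: for equality constraints the sign of $\lambda$ is unrestricted.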
Geometric view
consider the tangent space to the iso-contour $h(x) = 0$
since, to first order, $h$ changes along any direction that has a nonzero component along $\nabla h(x)$, $\nabla h(x)$ is $\perp$ to the iso-contour
hence, the subspace of first order feasible variations is
$$V(x^*) = \left\{ \Delta x \mid \nabla h_i(x^*)^T \Delta x = 0, \ \forall i \right\}$$
the space of $\Delta x$ for which $x^* + \Delta x$ satisfies the constraint up to first order approximation
[Figure: iso-contour $h(x) = 0$, with the gradient $\nabla h(x^*)$ at $x^*$ and the feasible variations $V(x^*)$]
Feasible variations
multiplying our first Lagrangian condition by $\Delta x$
$$\nabla f(x^*)^T \Delta x + \sum_{i=1}^m \lambda_i \nabla h_i(x^*)^T \Delta x = 0$$
it follows that $\nabla f(x^*)$ must satisfy
$$\nabla f(x^*)^T \Delta x = 0, \quad \forall \Delta x \in V(x^*)$$
i.e. $\nabla f(x^*) \perp V(x^*)$: gradient orthogonal to all feasible steps
no growth is possible, to first order, along the constraint
this is a generalization of $\nabla f(x^*) = 0$ in the unconstrained case
note:
• Hessian constraint only defined for $y$ in $V(x^*)$
• makes sense: we cannot move anywhere else, so it does not really matter what the Hessian is outside $V(x^*)$
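The orthogonality $\nabla f(x^*) \perp V(x^*)$ can be seen on a small example. The sketch below assumes the toy problem of minimizing $f(x) = x_1 + x_2$ on the circle $h(x) = x_1^2 + x_2^2 - 2 = 0$, whose minimum is at $x^* = (-1, -1)$ with $\lambda = 1/2$. The problem and all names are illustrative choices, not taken from the slides.

```python
# Feasible variations for: min x1 + x2  s.t.  x1^2 + x2^2 - 2 = 0.
# The problem and helper names are illustrative assumptions.

def grad_f(x):
    return [1.0, 1.0]

def h(x):
    return x[0] ** 2 + x[1] ** 2 - 2.0

def grad_h(x):
    return [2.0 * x[0], 2.0 * x[1]]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x_star = [-1.0, -1.0]
lam = 0.5

# First Lagrangian condition: grad f + lam * grad h = 0 at x*.
stationarity = [gf + lam * gh for gf, gh in zip(grad_f(x_star), grad_h(x_star))]

# V(x*) is the set of dx with grad_h(x*)^T dx = 0; here it is spanned by (1, -1).
direction = [1.0, -1.0]

results = []
for t in (1e-1, 1e-2, 1e-3):
    dx = [t * d for d in direction]
    moved = [xi + di for xi, di in zip(x_star, dx)]
    first_order_change = dot(grad_f(x_star), dx)  # change of f to first order
    constraint_error = h(moved)                   # violation of h = 0, second order in t
    results.append((t, first_order_change, constraint_error))
    print(t, first_order_change, constraint_error)
```

The constraint violation shrinks like $t^2$ while the first-order change of $f$ stays at zero, which is what "feasible up to first order" and "no growth along the constraint" mean in this example.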
Inequality constraints
with inequalities
$$x^* = \arg\min_x f(x) \quad \text{subject to} \quad h(x) = 0, \ g(x) \le 0$$
the only ones that matter are those which are active
$$A(x) = \left\{ j \mid g_j(x) = 0 \right\}$$
and these are equalities
[Figure: location of $x^*$ when the constraint is inactive and when it is active]
Constrained optimization
hence, the problem
$$x^* = \arg\min_x f(x) \quad \text{subject to} \quad h(x) = 0, \ g(x) \le 0$$
is equivalent to
$$x^* = \arg\min_x f(x) \quad \text{subject to} \quad h(x) = 0, \ g_j(x) = 0, \ \forall j \in A(x^*)$$
this is a problem with equality constraints, so there must be a $\lambda^*$ and $\mu_j^*$ such that
$$\nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla h_i(x^*) + \sum_{j=1}^r \mu_j^* \nabla g_j(x^*) = 0$$
with $\mu_j^* = 0$, $j \notin A(x^*)$
finally, we need $\mu_j^* \ge 0$, for all $j$, to guarantee that $f$ does not decrease, to first order, along any direction that keeps $g(x) \le 0$
[Figure: region $g(x) \le 0$ with the gradients $\nabla f$ and $\nabla g$ at its boundary]
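The KKT conditions
Theorem: for the problem
$$x^* = \arg\min_x f(x) \quad \text{subject to} \quad h(x) = 0, \ g(x) \le 0$$
if $x^*$ is a local minimum (and the gradients of the equality constraints and of the active inequality constraints at $x^*$ are linearly independent), then there exist $\lambda^*$ and $\mu^*$ such that
$$\text{i)} \quad \nabla f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla h_i(x^*) + \sum_{j=1}^r \mu_j^* \nabla g_j(x^*) = 0$$
$$\text{ii)} \quad \mu_j^* \ge 0, \ \forall j$$
$$\text{iii)} \quad \mu_j^* = 0, \ \forall j \notin A(x^*)$$
$$\text{iv)} \quad h(x^*) = 0$$
$$\text{v)} \quad y^T \left[ \nabla^2 f(x^*) + \sum_{i=1}^m \lambda_i^* \nabla^2 h_i(x^*) + \sum_{j=1}^r \mu_j^* \nabla^2 g_j(x^*) \right] y \ge 0, \quad \forall y \in V(x^*)$$
where
$$V(x^*) = \left\{ y \mid \nabla h_i(x^*)^T y = 0, \ \forall i, \ \text{and} \ \nabla g_j(x^*)^T y = 0, \ \forall j \in A(x^*) \right\}$$
conversely, a feasible $x^*$ that satisfies i) to iv), with v) holding strictly for all $y \ne 0$ in $V(x^*)$ and $\mu_j^* > 0, \forall j \in A(x^*)$, is a local minimum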
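The first-order KKT conditions are easy to test at a candidate point once the gradients are known. The sketch below assumes a toy problem with inequality constraints only (so condition iv) is vacuous): minimize $(x_1 - 2)^2 + (x_2 - 2)^2$ subject to $g_1(x) = x_1 + x_2 - 2 \le 0$ and $g_2(x) = -x_1 \le 0$. The problem, the function `check_kkt`, and the tolerance are illustrative assumptions; the second-order condition v) is not checked.

```python
# First-order KKT check for a toy problem with inequality constraints only.
# The problem, check_kkt and the tolerance are illustrative assumptions.

def grad_f(x):
    return [2.0 * (x[0] - 2.0), 2.0 * (x[1] - 2.0)]

def g(x):
    return [x[0] + x[1] - 2.0, -x[0]]

def grad_g(x):
    # Both constraints are linear, so their gradients are constant.
    return [[1.0, 1.0], [-1.0, 0.0]]

def check_kkt(x, mu, tol=1e-9):
    """Return which first-order KKT conditions hold at (x, mu)."""
    gf, gv, gg = grad_f(x), g(x), grad_g(x)
    residual = [gf[k] + sum(m * gj[k] for m, gj in zip(mu, gg))
                for k in range(len(x))]
    return {
        "stationarity": all(abs(r) <= tol for r in residual),
        "feasibility": all(v <= tol for v in gv),
        "nonnegativity": all(m >= -tol for m in mu),
        "inactive_zero": all(abs(m) <= tol
                             for m, v in zip(mu, gv) if v < -tol),
    }

# Candidate 1: x = (1, 1), only g1 active, mu = (2, 0).
report_min = check_kkt([1.0, 1.0], [2.0, 0.0])

# Candidate 2: x = (0, 2), both constraints active. Stationarity forces
# mu = (0, -4), which violates mu >= 0, so this point is not a minimum.
report_other = check_kkt([0.0, 2.0], [0.0, -4.0])

print(report_min)
print(report_other)
```

The second candidate shows why the sign condition on $\mu$ matters: stationarity alone holds there, but $f(0, 2) = 4$ is larger than $f(1, 1) = 2$.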
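Geometric interpretation
we consider the case without equality constraints
$$x^* = \arg\min_x f(x) \quad \text{subject to} \quad g(x) \le 0$$
from the KKT conditions, the solution satisfies
$$\text{i)} \quad \nabla_x L(x^*, \mu^*) = 0$$
$$\text{ii)} \quad \mu_j^* \ge 0, \ \forall j$$
$$\text{iii)} \quad \mu_j^* = 0, \ \forall j \notin A(x^*)$$
with
$$L(x^*, \mu^*) = f(x^*) + \sum_{j=1}^r \mu_j^* g_j(x^*)$$
which (when $L(x, \mu^*)$ is convex in $x$) is equivalent to
$$L^* = \min_x L(x, \mu^*) = \min_x \left[ f(x) + \mu^{*T} g(x) \right]$$
with $\mu_j^* \ge 0, \forall j$, and $\mu_j^* = 0, \forall j \notin A(x^*)$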
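Geometric interpretation
$$L^* = \min_x L(x, \mu^*) = \min_x \left[ f(x) + \mu^{*T} g(x) \right]$$
with $\mu_j^* \ge 0, \forall j$, and $\mu_j^* = 0, \forall j \notin A(x^*)$
is equivalent to
• $x = x^* \Rightarrow w^{*T} z - b = 0$
• $x \ne x^* \Rightarrow w^{*T} z - b \ge 0$
with
$$w^* = \begin{bmatrix} 1 \\ \mu^* \end{bmatrix}, \quad z = \begin{bmatrix} f(x) \\ g(x) \end{bmatrix}, \quad b = L^*$$
can be visualized as
[Figure: the hyperplane $w^{*T} z - b = 0$ with normal $w^*$; axis label $f \in \mathbb{R}$; panel labels $g(x^*) = 0$ and $g(x^*)$]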
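The hyperplane statement can be checked numerically on a one-dimensional example. The sketch below assumes the toy problem of minimizing $f(x) = x^2$ subject to $g(x) = 1 - x \le 0$, for which $x^* = 1$ and $\mu^* = 2$ satisfy the KKT conditions. The problem, the sampling grid, and the name `margin` are illustrative assumptions.

```python
# Supporting hyperplane check for: min x^2  s.t.  1 - x <= 0.
# The problem, the grid and the helper names are illustrative assumptions.

def f(x):
    return x * x

def g(x):
    return 1.0 - x

x_star, mu_star = 1.0, 2.0                # 2 * x_star - mu_star = 0, g(x_star) = 0
L_star = f(x_star) + mu_star * g(x_star)  # value of the Lagrangian at the solution

w_star = (1.0, mu_star)
b = L_star

def margin(x):
    """w*^T z - b for z = (f(x), g(x))."""
    z = (f(x), g(x))
    return w_star[0] * z[0] + w_star[1] * z[1] - b

# Sample x in [-3, 3], including infeasible points with g(x) > 0.
xs = [i / 10.0 - 3.0 for i in range(61)]
margins = [margin(x) for x in xs]

print("L* =", L_star)
print("smallest margin on the grid:", min(margins))
print("margin at x*:", margin(x_star))
```

In this example the margin equals $(x - 1)^2$, so every point $z = (f(x), g(x))$ lies on or above the hyperplane and only $x^*$ touches it.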
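Duality
we solve instead
$$q(\mu) = \min_x L(x, \mu) = \min_x \left[ f(x) + \mu^T g(x) \right], \quad \text{with } \mu \ge 0$$
$$w = \begin{bmatrix} 1 \\ \mu \end{bmatrix}, \quad z = \begin{bmatrix} f(x) \\ g(x) \end{bmatrix}, \quad b = q(\mu)$$
same picture with $L^*$ replaced by $q(\mu)$, $\mu^*$ by $\mu$
[Figure: same picture as before, with axis label $f \in \mathbb{R}$ and panel labels $g(x^*) = 0$ and $g(x^*)$]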
Duality
note that
• $q(\mu) \le L^* = f^*$ (for $\mu \ge 0$, $q(\mu) \le f(x^*) + \mu^T g(x^*) \le f^*$, since $g(x^*) \le 0$)
• if we keep increasing $q(\mu)$ we will get $q(\mu) = L^*$
• we cannot go beyond $L^*$ ($x^*$ would move to $g(x^*) > 0$)
this is exactly the definition of the dual problem
$$\max_{\mu \ge 0} q(\mu), \quad \text{with} \quad q(\mu) = \min_x L(x, \mu) = \min_x \left[ f(x) + \mu^T g(x) \right]$$
note:
• $q(\mu)$ may go to $-\infty$ for some $\mu$.
• this is avoided by introducing the constraint
$$\mu \in D_q = \left\{ \mu \mid q(\mu) > -\infty \right\}$$
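To make the dual problem concrete, the sketch below evaluates $q(\mu)$ numerically and maximizes it over a grid of $\mu \ge 0$. It assumes the toy problem of minimizing $f(x) = x^2$ subject to $g(x) = 1 - x \le 0$, with $f^* = 1$ at $x^* = 1$. The problem, the ternary search used for the inner minimization, the search interval, and the grid over $\mu$ are all illustrative assumptions; the inner search relies on $L(x, \mu)$ being convex in $x$.

```python
# Dual function for: min x^2  s.t.  1 - x <= 0  (f* = 1 at x* = 1).
# The problem, search interval and grid are illustrative assumptions.

def lagrangian(x, mu):
    return x * x + mu * (1.0 - x)

def minimize_1d(func, lo=-10.0, hi=10.0, iters=200):
    """Ternary search for the minimum value of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return func((lo + hi) / 2.0)

def q(mu):
    """Dual function q(mu) = min_x L(x, mu)."""
    return minimize_1d(lambda x: lagrangian(x, mu))

f_star = 1.0
mus = [i / 100.0 for i in range(401)]  # grid over mu in [0, 4]
qs = [q(mu) for mu in mus]

# Dual problem: maximize q(mu) over mu >= 0 (here, over the grid).
best_mu, best_q = max(zip(mus, qs), key=lambda pair: pair[1])

print("max q =", best_q, "at mu =", best_mu)
print("q(mu) <= f* on the whole grid:", all(v <= f_star + 1e-9 for v in qs))
```

For this problem the inner minimizer is $x = \mu / 2$, so $q(\mu) = \mu - \mu^2 / 4$, which the numerical values should match; the maximum of $q$ coincides with $f^*$, so there is no gap in this convex example.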
Equality constraints
so far we have disregarded them. What about
$$x^* = \arg\min_x f(x) \quad \text{subject to} \quad h(x) = 0, \ g(x) \le 0$$
intuitively, nothing should change, since
i