Solutions HW #2 - WebHome - Département d'Informatique …aspremon/PDF/MVA/HW2_sol.pdf · ·...

Solutions HW #2

Dual of general LP. Find the dual function of the LP

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Gx \preceq h \\
& Ax = b.
\end{array}
\]

Give the dual problem, and make the implicit equality constraints explicit.

Solution.

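1. The Lagrangian is

\[
L(x,\lambda,\nu) = c^T x + \lambda^T(Gx - h) + \nu^T(Ax - b) = (c + G^T\lambda + A^T\nu)^T x - h^T\lambda - b^T\nu,
\]

which is an affine function of $x$. It follows that the dual function is given by

\[
g(\lambda,\nu) = \inf_x L(x,\lambda,\nu) = \begin{cases} -\lambda^T h - \nu^T b & c + G^T\lambda + A^T\nu = 0 \\ -\infty & \text{otherwise.} \end{cases}
\]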

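2. The dual problem is

\[
\begin{array}{ll}
\mbox{maximize} & g(\lambda,\nu) \\
\mbox{subject to} & \lambda \succeq 0.
\end{array}
\]

After making the implicit constraints explicit, we obtain

\[
\begin{array}{ll}
\mbox{maximize} & -\lambda^T h - \nu^T b \\
\mbox{subject to} & c + G^T\lambda + A^T\nu = 0 \\
& \lambda \succeq 0.
\end{array}
\]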
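The dual function above can be checked numerically on a toy instance. The sketch below is only an illustration: the data `c`, `G`, `h`, `A`, `b` and the helper names are assumptions made up for this example, not part of the assignment. It evaluates $g(\lambda,\nu)$ exactly as defined above and compares it with the primal objective at a feasible point.

```python
import math

def matvec_T(M, v):
    """Return M^T v for a matrix M stored as a non-empty list of rows."""
    return [sum(M[i][j] * v[i] for i in range(len(M))) for j in range(len(M[0]))]

def dual_function(lam, nu, c, G, h, A, b, tol=1e-9):
    """g(lam, nu) = -lam^T h - nu^T b if c + G^T lam + A^T nu = 0, else -inf.

    Dual feasibility additionally requires lam >= 0, which the caller checks.
    """
    Gt_lam = matvec_T(G, lam)
    At_nu = matvec_T(A, nu)
    residual = [c[j] + Gt_lam[j] + At_nu[j] for j in range(len(c))]
    if any(abs(r) > tol for r in residual):
        return -math.inf
    return -sum(l * hi for l, hi in zip(lam, h)) - sum(n * bi for n, bi in zip(nu, b))

# Hypothetical toy LP: minimize x1 + 2*x2 subject to x >= 0, x1 + x2 = 1.
c = [1.0, 2.0]
G = [[-1.0, 0.0], [0.0, -1.0]]   # Gx <= h encodes x >= 0
h = [0.0, 0.0]
A = [[1.0, 1.0]]
b = [1.0]

x_feasible = [1.0, 0.0]           # satisfies both constraints
primal_value = sum(ci * xi for ci, xi in zip(c, x_feasible))

g_tight = dual_function([0.0, 1.0], [-1.0], c, G, h, A, b)  # dual feasible
g_loose = dual_function([1.0, 2.0], [0.0], c, G, h, A, b)   # dual feasible
g_bad = dual_function([1.0, 1.0], [-1.0], c, G, h, A, b)    # violates the equality

print(primal_value, g_tight, g_loose, g_bad)
```

Both dual-feasible pairs give values no larger than the primal objective, as weak duality requires, and the first pair closes the gap on this instance.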

Piecewise-linear minimization. We consider the convex piecewise-linear minimization problem

\[
\mbox{minimize} \quad \max_{i=1,\ldots,m} (a_i^T x + b_i) \qquad (1)
\]

with variable $x \in \mathbf{R}^n$.

1. Derive a dual problem, based on the Lagrange dual of the equivalent problem

\[
\begin{array}{ll}
\mbox{minimize} & \max_{i=1,\ldots,m} y_i \\
\mbox{subject to} & a_i^T x + b_i = y_i, \quad i = 1,\ldots,m,
\end{array}
\]

with variables $x \in \mathbf{R}^n$, $y \in \mathbf{R}^m$.

2. Formulate the piecewise-linear minimization problem (1) as an LP, and form the dual of the LP. Relate the LP dual to the dual obtained in part 1.


3. Suppose we approximate the objective function in (1) by the smooth function

\[
f_0(x) = \log\left( \sum_{i=1}^m \exp(a_i^T x + b_i) \right),
\]

and solve the unconstrained geometric program

\[
\mbox{minimize} \quad \log\left( \sum_{i=1}^m \exp(a_i^T x + b_i) \right). \qquad (2)
\]

A dual of this problem is given by (5.62). Let $p^\star_{\rm pwl}$ and $p^\star_{\rm gp}$ be the optimal values of (1) and (2), respectively. Show that

\[
0 \leq p^\star_{\rm gp} - p^\star_{\rm pwl} \leq \log m.
\]

4. Derive similar bounds for the difference between $p^\star_{\rm pwl}$ and the optimal value of

\[
\mbox{minimize} \quad (1/\gamma) \log\left( \sum_{i=1}^m \exp(\gamma(a_i^T x + b_i)) \right),
\]

where $\gamma > 0$ is a parameter. What happens as we increase $\gamma$?

Solution.

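1. The dual function is

\[
g(\lambda) = \inf_{x,y} \left( \max_{i=1,\ldots,m} y_i + \sum_{i=1}^m \lambda_i (a_i^T x + b_i - y_i) \right).
\]

The infimum over $x$ is finite only if $\sum_i \lambda_i a_i = 0$. To minimize over $y$ we note that

\[
\inf_y \left( \max_i y_i - \lambda^T y \right) = \begin{cases} 0 & \lambda \succeq 0, \; \mathbf{1}^T\lambda = 1 \\ -\infty & \text{otherwise.} \end{cases}
\]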

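To prove this, we first note that if $\lambda \succeq 0$, $\mathbf{1}^T\lambda = 1$, then

\[
\lambda^T y = \sum_j \lambda_j y_j \leq \sum_j \lambda_j \max_i y_i = \max_i y_i,
\]

with equality if $y = 0$, so in that case

\[
\inf_y \left( \max_i y_i - \lambda^T y \right) = 0.
\]

If $\lambda \not\succeq 0$, say $\lambda_j < 0$, then choosing $y_i = 0$, $i \neq j$, and $y_j = -t$, with $t \geq 0$, and letting $t$ go to infinity, gives

\[
\max_i y_i - \lambda^T y = 0 + t\lambda_j \to -\infty.
\]

Finally, if $\mathbf{1}^T\lambda \neq 1$, choosing $y = t\mathbf{1}$ gives

\[
\max_i y_i - \lambda^T y = t(1 - \mathbf{1}^T\lambda) \to -\infty,
\]

if $t \to \infty$ and $1 < \mathbf{1}^T\lambda$, or if $t \to -\infty$ and $1 > \mathbf{1}^T\lambda$.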


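Summing up, we have

\[
g(\lambda) = \begin{cases} b^T\lambda & \sum_i \lambda_i a_i = 0, \; \lambda \succeq 0, \; \mathbf{1}^T\lambda = 1 \\ -\infty & \text{otherwise.} \end{cases}
\]

The resulting dual problem is

\[
\begin{array}{ll}
\mbox{maximize} & b^T\lambda \\
\mbox{subject to} & A^T\lambda = 0 \\
& \mathbf{1}^T\lambda = 1 \\
& \lambda \succeq 0,
\end{array}
\]

where $A \in \mathbf{R}^{m \times n}$ is the matrix with rows $a_i^T$, so that $A^T\lambda = \sum_i \lambda_i a_i$.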

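2. The problem is equivalent to the LP

\[
\begin{array}{ll}
\mbox{minimize} & t \\
\mbox{subject to} & Ax + b \preceq t\mathbf{1}.
\end{array}
\]

The dual problem is

\[
\begin{array}{ll}
\mbox{maximize} & b^T z \\
\mbox{subject to} & A^T z = 0, \quad \mathbf{1}^T z = 1, \quad z \succeq 0,
\end{array}
\]

which is identical to the dual derived in part 1.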

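3. Both primal problems, PWL and GP, are convex, and their equality-constrained reformulations have only affine constraints, so Slater's condition holds trivially and strong duality holds.

Suppose $z^\star$ is optimal for the dual GP (5.62) (with the convention $0 \log 0 = 0$)

\[
\begin{array}{ll}
\mbox{maximize} & b^T z - \sum_{i=1}^m z_i \log z_i \\
\mbox{subject to} & \mathbf{1}^T z = 1 \\
& A^T z = 0 \\
& z \succeq 0.
\end{array}
\]

Then $z^\star$ is also feasible for the dual of the piecewise-linear formulation, with objective value

\[
b^T z^\star = p^\star_{\rm gp} + \sum_{i=1}^m z_i^\star \log z_i^\star.
\]

This provides a lower bound on $p^\star_{\rm pwl}$:

\[
p^\star_{\rm pwl} \geq p^\star_{\rm gp} + \sum_{i=1}^m z_i^\star \log z_i^\star \geq p^\star_{\rm gp} - \log m.
\]

The second inequality follows from concavity of the logarithm, using Jensen's inequality:

\[
\sum_{i=1}^m z_i \log(1/z_i) \leq \log \sum_{i=1}^m 1 = \log m.
\]

On the other hand, we also have

\[
\max_i (a_i^T x + b_i) \leq \log \sum_i \exp(a_i^T x + b_i)
\]

for all $x$, and therefore $p^\star_{\rm pwl} \leq p^\star_{\rm gp}$.

In conclusion,

\[
p^\star_{\rm gp} - \log m \leq p^\star_{\rm pwl} \leq p^\star_{\rm gp}.
\]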


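4. We first reformulate the problem as

\[
\begin{array}{ll}
\mbox{minimize} & (1/\gamma) \log \sum_{i=1}^m \exp(\gamma y_i) \\
\mbox{subject to} & Ax + b = y.
\end{array}
\]

The Lagrangian is

\[
L(x,y,z) = \frac{1}{\gamma} \log \sum_{i=1}^m \exp(\gamma y_i) + z^T(Ax + b - y).
\]

$L$ is bounded below as a function of $x$ only if $A^T z = 0$. To find the optimum over $y$, we compute the conjugate function for $z \in \mathbf{R}^m$,

\[
f^*(z) = \sup_y \left( \gamma z^T y - \log \sum_{i=1}^m \exp(\gamma y_i) \right) = \sup_y \left( z^T y - \log \sum_{i=1}^m \exp(y_i) \right).
\]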

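If there exists $z_i < 0$, then, taking $y = te_i$, we get

\[
z^T y - \log \sum_{k=1}^m \exp(y_k) = tz_i - \log((m-1) + e^t),
\]

which goes to infinity for $t \to -\infty$.

If $z \succeq 0$ and $\mathbf{1}^T z \neq 1$, then, taking $y = t\mathbf{1}$, we get

\[
z^T y - \log \sum_{i=1}^m \exp(y_i) = t\mathbf{1}^T z - \log m - t,
\]

which goes to infinity for $t \to +\infty$ or $t \to -\infty$, depending on the sign of $\mathbf{1}^T z - 1$.

If $z \succeq 0$ and $\mathbf{1}^T z = \sum_{z_i \neq 0} z_i = 1$, we have, using concavity of the logarithm,

\[
z^T y - \sum_{i=1}^m z_i \log z_i = \sum_{z_i \neq 0} z_i (y_i - \log z_i) \leq \log \sum_{z_i \neq 0} z_i \exp(y_i - \log z_i) \leq \log \sum_{i=1}^m \exp(y_i),
\]

with equality obtained by taking $y_i = \log z_i$ if $z_i > 0$ and letting $y_i \to -\infty$ for $z_i = 0$. Therefore we have

\[
f^*(z) = \sum_{i=1}^m z_i \log z_i.
\]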

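The Lagrange dual function is then given for $z \succeq 0$, $\mathbf{1}^T z = 1$, $A^T z = 0$ by

\[
g(z) = b^T z - \frac{1}{\gamma} \sum_{i=1}^m z_i \log z_i,
\]

and the dual problem is

\[
\begin{array}{ll}
\mbox{maximize} & b^T z - (1/\gamma) \sum_{i=1}^m z_i \log z_i \\
\mbox{subject to} & A^T z = 0 \\
& \mathbf{1}^T z = 1 \\
& z \succeq 0.
\end{array}
\]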


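Let $p^\star_{\rm gp}(\gamma)$ be the optimal value of the GP. Following the same argument as above, we can conclude that

\[
p^\star_{\rm gp}(\gamma) - \frac{1}{\gamma} \log m \leq p^\star_{\rm pwl} \leq p^\star_{\rm gp}(\gamma).
\]

In other words, the gap is at most $(\log m)/\gamma$, so $p^\star_{\rm gp}(\gamma)$ approaches $p^\star_{\rm pwl}$ as $\gamma$ increases.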
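The bounds of this exercise are easy to observe numerically. The following is a minimal sketch on a hypothetical scalar instance (the data `a`, `b`, the grid, and all function names are assumptions chosen for illustration). It minimizes both objectives by grid search, checks a dual-feasible $\lambda$ for the piecewise-linear problem, and verifies that the gap stays within $[0, (\log m)/\gamma]$.

```python
import math

# Hypothetical instance with n = 1: f(x) = max_i (a_i * x + b_i).
a = [1.0, -1.0, 0.5]
b = [1.0, 3.0, 0.0]
m = len(a)

def pwl(x):
    """Piecewise-linear objective of problem (1)."""
    return max(ai * x + bi for ai, bi in zip(a, b))

def smooth(x, gamma):
    """(1/gamma) * log sum_i exp(gamma * (a_i x + b_i)), computed stably."""
    vals = [gamma * (ai * x + bi) for ai, bi in zip(a, b)]
    top = max(vals)  # shift by the largest term so exp() cannot overflow
    return (top + math.log(sum(math.exp(v - top) for v in vals))) / gamma

def grid_min(f, lo=-5.0, hi=5.0, steps=4000):
    """Crude minimization by evaluating f on a uniform grid."""
    return min(f(lo + (hi - lo) * k / steps) for k in range(steps + 1))

p_pwl = grid_min(pwl)

# A dual-feasible point: lam >= 0, sum(lam) = 1, sum_i lam_i * a_i = 0.
lam = [0.5, 0.5, 0.0]
stationarity = sum(l * ai for l, ai in zip(lam, a))
dual_value = sum(l * bi for l, bi in zip(lam, b))

gaps = {}
for gamma in (1.0, 5.0, 25.0):
    p_gp = grid_min(lambda x, g=gamma: smooth(x, g))
    gaps[gamma] = p_gp - p_pwl
    print(gamma, gaps[gamma], math.log(m) / gamma)
```

On this instance the chosen $\lambda$ attains the primal optimal value, and the printed gaps shrink as $\gamma$ grows, in line with the bound above. Grid search is used only to keep the sketch dependency-free.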

Suboptimality of a simple covering ellipsoid. Recall the problem of determining the minimum volume ellipsoid, centered at the origin, that contains the points $a_1, \ldots, a_m \in \mathbf{R}^n$ (problem (5.14), page 222):

\[
\begin{array}{ll}
\mbox{minimize} & f_0(X) = \log\det(X^{-1}) \\
\mbox{subject to} & a_i^T X a_i \leq 1, \quad i = 1,\ldots,m,
\end{array}
\]

with $\operatorname{dom} f_0 = \mathbf{S}^n_{++}$. We assume that the vectors $a_1, \ldots, a_m$ span $\mathbf{R}^n$ (which implies that the problem is bounded below).

1. Show that the matrix

\[
X_{\rm sim} = \left( \sum_{k=1}^m a_k a_k^T \right)^{-1}
\]

is feasible. Hint. Show that

\[
\left[ \begin{array}{cc} \sum_{k=1}^m a_k a_k^T & a_i \\ a_i^T & 1 \end{array} \right] \succeq 0,
\]

and use Schur complements (§A.5.5) to prove that $a_i^T X a_i \leq 1$ for $i = 1, \ldots, m$.

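Solution. The matrix

\[
\left[ \begin{array}{cc} \sum_{k=1}^m a_k a_k^T & a_i \\ a_i^T & 1 \end{array} \right]
= \left[ \begin{array}{cc} \sum_{k \neq i} a_k a_k^T & 0 \\ 0 & 0 \end{array} \right]
+ \left[ \begin{array}{c} a_i \\ 1 \end{array} \right] \left[ \begin{array}{c} a_i \\ 1 \end{array} \right]^T
\]

is the sum of two positive semidefinite matrices, hence positive semidefinite. The Schur complement of the 1,1 block of this matrix, $\sum_{k=1}^m a_k a_k^T$, which is invertible by hypothesis, is therefore also positive semidefinite:

\[
1 - a_i^T \left( \sum_{k=1}^m a_k a_k^T \right)^{-1} a_i \geq 0,
\]

which is the desired conclusion.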

2. Now we establish a bound on how suboptimal the feasible point $X_{\rm sim}$ is, via the dual problem,

\[
\begin{array}{ll}
\mbox{maximize} & \log\det\left( \sum_{i=1}^m \lambda_i a_i a_i^T \right) - \mathbf{1}^T\lambda + n \\
\mbox{subject to} & \lambda \succeq 0,
\end{array}
\]

with the implicit constraint $\sum_{i=1}^m \lambda_i a_i a_i^T \succ 0$. (This dual is derived on page 222.)

To derive a bound, we restrict our attention to dual variables of the form $\lambda = t\mathbf{1}$, where $t > 0$. Find (analytically) the optimal value of $t$, and evaluate the dual objective at this $\lambda$. Use this to prove that the volume of the ellipsoid $\{u \mid u^T X_{\rm sim} u \leq 1\}$ is no more than a factor $(m/n)^{n/2}$ more than the volume of the minimum volume ellipsoid.


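Solution. The dual function evaluated at $\lambda = t\mathbf{1}$ is

\[
g(\lambda) = \log\det\left( \sum_{i=1}^m a_i a_i^T \right) + n\log t - mt + n.
\]

Now we maximize over $t > 0$ to get the best lower bound. Setting the derivative with respect to $t$ equal to zero, $n/t - m = 0$, yields the optimal value $t = n/m$. Using this $\lambda$ we get the dual objective value

\[
g(\lambda) = \log\det\left( \sum_{i=1}^m a_i a_i^T \right) + n\log(n/m).
\]

The primal objective value for $X = X_{\rm sim}$ is given by

\[
-\log\det\left( \sum_{i=1}^m a_i a_i^T \right)^{-1} = \log\det\left( \sum_{i=1}^m a_i a_i^T \right),
\]

so the duality gap associated with $X_{\rm sim}$ and $\lambda$ is $n\log(m/n)$. (Recall that $m \geq n$, by our assumption that $a_1, \ldots, a_m$ span $\mathbf{R}^n$.) It follows that, in terms of the objective function, $X_{\rm sim}$ is no more than $n\log(m/n)$ suboptimal. The volume $V$ of the ellipsoid associated with the matrix $X$ is proportional to $(\det X)^{-1/2} = \exp(O/2)$, where $O$ is the associated objective value, $O = -\log\det X$. The ratio of the volume for $X_{\rm sim}$ to the minimum volume is therefore at most $\exp(n\log(m/n)/2) = (m/n)^{n/2}$, which is the claimed bound.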
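Both parts of this exercise can be sanity-checked on a small example. The sketch below assumes three hypothetical points in $\mathbf{R}^2$ (the data and all variable names are illustrative only). It forms $X_{\rm sim}$ with the closed-form $2 \times 2$ inverse, checks feasibility, and evaluates the duality gap at $t = n/m$.

```python
import math

# Hypothetical points a_1, a_2, a_3 in R^2; they span R^2.
points = [(2.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
n, m = 2, len(points)

# S = sum_k a_k a_k^T, stored by its three distinct entries.
s11 = sum(p[0] * p[0] for p in points)
s12 = sum(p[0] * p[1] for p in points)
s22 = sum(p[1] * p[1] for p in points)
det_S = s11 * s22 - s12 * s12

# X_sim = S^{-1}, using the closed-form inverse of a 2x2 symmetric matrix.
x11, x12, x22 = s22 / det_S, -s12 / det_S, s11 / det_S

def quad_form(p):
    """a^T X_sim a for a point a = p."""
    return x11 * p[0] ** 2 + 2.0 * x12 * p[0] * p[1] + x22 * p[1] ** 2

constraint_values = [quad_form(p) for p in points]   # each should be <= 1

def dual_objective(t):
    """Dual objective restricted to lambda = t * 1."""
    return math.log(det_S) + n * math.log(t) - m * t + n

t_best = n / m
primal_value = math.log(det_S)              # log det(X_sim^{-1})
gap = primal_value - dual_objective(t_best)
volume_factor_bound = math.exp(gap / 2.0)   # bound on the volume ratio

print(constraint_values, gap, volume_factor_bound)
```

For this data the gap equals $2\log(3/2)$ and the volume factor bound is $3/2$, matching $n\log(m/n)$ and $(m/n)^{n/2}$.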

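Dual problem. Derive a dual problem for

\[
\mbox{minimize} \quad \sum_{i=1}^N \|A_i x + b_i\|_2 + (1/2)\|x - x_0\|_2^2.
\]

The problem data are $A_i \in \mathbf{R}^{m_i \times n}$, $b_i \in \mathbf{R}^{m_i}$, and $x_0 \in \mathbf{R}^n$. First introduce new variables $y_i \in \mathbf{R}^{m_i}$ and equality constraints $y_i = A_i x + b_i$.

Solution. The Lagrangian is

\[
L(x, y_1, \ldots, y_N, z_1, \ldots, z_N) = \sum_{i=1}^N \|y_i\|_2 + \frac{1}{2}\|x - x_0\|_2^2 + \sum_{i=1}^N z_i^T (y_i - A_i x - b_i).
\]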

We first minimize over $y_i$. We have

\[
\inf_{y_i} \left( \|y_i\|_2 + z_i^T y_i \right) = \begin{cases} 0 & \|z_i\|_2 \leq 1 \\ -\infty & \text{otherwise.} \end{cases}
\]

(If $\|z_i\|_2 > 1$, choose $y_i = -tz_i$ and let $t \to \infty$, to show that the function is unbounded below. If $\|z_i\|_2 \leq 1$, it follows from the Cauchy-Schwarz inequality that $\|y_i\|_2 + z_i^T y_i \geq 0$, so the minimum is reached when $y_i = 0$.)

We can minimize over $x$ by setting the gradient with respect to $x$ equal to zero. This yields

\[
x = x_0 + \sum_{i=1}^N A_i^T z_i.
\]

Substituting in the Lagrangian gives the dual function

\[
g(z_1, \ldots, z_N) = \begin{cases} -\sum_{i=1}^N (A_i x_0 + b_i)^T z_i - \frac{1}{2} \left\| \sum_{i=1}^N A_i^T z_i \right\|_2^2 & \|z_i\|_2 \leq 1, \; i = 1, \ldots, N \\ -\infty & \text{otherwise.} \end{cases}
\]


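The dual problem is

\[
\begin{array}{ll}
\mbox{maximize} & -\sum_{i=1}^N (A_i x_0 + b_i)^T z_i - \frac{1}{2} \left\| \sum_{i=1}^N A_i^T z_i \right\|_2^2 \\
\mbox{subject to} & \|z_i\|_2 \leq 1, \quad i = 1, \ldots, N.
\end{array}
\]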
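As a quick check of the signs in this dual, the sketch below works through a hypothetical scalar instance with $N = 2$ (the data `A`, `b`, `x0` and the grid search are assumptions made for illustration, not a recommended solver). It compares the primal and dual optimal values and recovers $x$ from the dual solution via $x = x_0 + \sum_i A_i^T z_i$.

```python
# Hypothetical scalar data: minimize |x| + |x - 3| + 0.5 * (x - 2)^2.
A = [1.0, 1.0]
b = [0.0, -3.0]
x0 = 2.0

def primal(x):
    """sum_i |A_i x + b_i| + 0.5 * (x - x0)^2 (norms are absolute values here)."""
    return sum(abs(Ai * x + bi) for Ai, bi in zip(A, b)) + 0.5 * (x - x0) ** 2

def dual(z):
    """-sum_i (A_i x0 + b_i) z_i - 0.5 * (sum_i A_i z_i)^2, valid for |z_i| <= 1."""
    linear = -sum((Ai * x0 + bi) * zi for Ai, bi, zi in zip(A, b, z))
    w = sum(Ai * zi for Ai, zi in zip(A, z))
    return linear - 0.5 * w ** 2

def grid(lo, hi, steps):
    """Uniform grid of steps + 1 points on [lo, hi]."""
    return [lo + (hi - lo) * k / steps for k in range(steps + 1)]

# Crude grid search over the primal variable and over the dual box [-1, 1]^2.
p_star = min(primal(x) for x in grid(-5.0, 5.0, 2000))
z_grid = grid(-1.0, 1.0, 200)
d_star, z_best = max((dual((z1, z2)), (z1, z2)) for z1 in z_grid for z2 in z_grid)

# Primal point recovered from the dual maximizer.
x_from_dual = x0 + sum(Ai * zi for Ai, zi in zip(A, z_best))

print(p_star, d_star, z_best, x_from_dual)
```

On this instance the two optimal values coincide and the recovered point is the primal minimizer, which is consistent with strong duality for this convex problem.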

Lagrangian relaxation of Boolean LP. A Boolean linear program is an optimization problem of the form

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax \preceq b \\
& x_i \in \{0,1\}, \quad i = 1,\ldots,n,
\end{array}
\]

and is, in general, very difficult to solve. In exercise (4.15) we studied the LP relaxation of this problem,

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax \preceq b \\
& 0 \leq x_i \leq 1, \quad i = 1,\ldots,n,
\end{array} \qquad (3)
\]

which is far easier to solve, and gives a lower bound on the optimal value of the Boolean LP. In this problem we derive another lower bound for the Boolean LP, and work out the relation between the two lower bounds.

1. Lagrangian relaxation. The Boolean LP can be reformulated as the problem

\[
\begin{array}{ll}
\mbox{minimize} & c^T x \\
\mbox{subject to} & Ax \preceq b \\
& x_i(1 - x_i) = 0, \quad i = 1,\ldots,n,
\end{array}
\]

which has quadratic equality constraints. Find the Lagrange dual of this problem. The optimal value of the dual problem (which is convex) gives a lower bound on the optimal value of the Boolean LP. This method of finding a lower bound on the optimal value is called Lagrangian relaxation.

2. Show that the lower bound obtained via Lagrangian relaxation, and via the LP relaxation (3), are the same. Hint. Derive the dual of the LP relaxation (3).

Solution.

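1. The Lagrangian is

\[
\begin{array}{rcl}
L(x,\mu,\nu) & = & c^T x + \mu^T(Ax - b) - \nu^T x + x^T \operatorname{diag}(\nu) x \\
& = & x^T \operatorname{diag}(\nu) x + (c + A^T\mu - \nu)^T x - b^T\mu.
\end{array}
\]

Minimizing over $x$ gives the dual function

\[
g(\mu,\nu) = \begin{cases} -b^T\mu - (1/4)\sum_{i=1}^n (c_i + a_i^T\mu - \nu_i)^2/\nu_i & \nu \succeq 0 \\ -\infty & \text{otherwise,} \end{cases}
\]

where $a_i$ is the $i$th column of $A$, and we adopt the convention that $a^2/0 = \infty$ if $a \neq 0$, and $a^2/0 = 0$ if $a = 0$.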


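The resulting dual problem is

\[
\begin{array}{ll}
\mbox{maximize} & -b^T\mu - (1/4)\sum_{i=1}^n (c_i + a_i^T\mu - \nu_i)^2/\nu_i \\
\mbox{subject to} & \nu \succeq 0, \quad \mu \succeq 0.
\end{array}
\]

In order to simplify this dual, we optimize analytically over $\nu$, by noting that

\[
\sup_{\nu_i \geq 0} \left( -\frac{(c_i + a_i^T\mu - \nu_i)^2}{\nu_i} \right) = \begin{cases} 4(c_i + a_i^T\mu) & c_i + a_i^T\mu \leq 0 \\ 0 & c_i + a_i^T\mu \geq 0 \end{cases} = 4\min\{0, c_i + a_i^T\mu\}.
\]

This allows us to eliminate $\nu$ from the dual problem, and simplify it as

\[
\begin{array}{ll}
\mbox{maximize} & -b^T\mu + \sum_{i=1}^n \min\{0, c_i + a_i^T\mu\} \\
\mbox{subject to} & \mu \succeq 0.
\end{array}
\]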

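2. We follow the hint. The Lagrangian and dual function of the LP relaxation are

\[
\begin{array}{rcl}
L(x,u,v,w) & = & c^T x + u^T(Ax - b) - v^T x + w^T(x - \mathbf{1}) \\
& = & (c + A^T u - v + w)^T x - b^T u - \mathbf{1}^T w,
\end{array}
\]

\[
g(u,v,w) = \begin{cases} -b^T u - \mathbf{1}^T w & A^T u - v + w + c = 0 \\ -\infty & \text{otherwise.} \end{cases}
\]

The dual problem is

\[
\begin{array}{ll}
\mbox{maximize} & -b^T u - \mathbf{1}^T w \\
\mbox{subject to} & A^T u - v + w + c = 0 \\
& u \succeq 0, \quad v \succeq 0, \quad w \succeq 0,
\end{array}
\]

which is equivalent to the Lagrangian relaxation problem derived above. To see this, eliminate $v = A^T u + w + c \succeq 0$, which leaves the constraints $w \succeq -(c + A^T u)$ and $w \succeq 0$. The optimal choice is $w_i = \max\{0, -(c_i + a_i^T u)\}$, so that $-\mathbf{1}^T w = \sum_{i=1}^n \min\{0, c_i + a_i^T u\}$, and the problem reduces to the simplified dual of part 1 with $\mu = u$. We conclude that the two relaxations give the same value.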
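The relation between the three values can be seen on a tiny example. The sketch below uses a hypothetical instance with $n = 2$ and a single inequality (the data and function names are assumptions for illustration). It brute-forces the Boolean optimum, maximizes the simplified Lagrangian bound $-b^T\mu + \sum_i \min\{0, c_i + a_i^T\mu\}$ over a grid of $\mu \geq 0$, and evaluates one feasible point of the LP relaxation.

```python
from itertools import product

# Hypothetical instance: minimize -2*x1 - 3*x2 subject to 2*x1 + 2*x2 <= 3.
c = [-2.0, -3.0]
A = [[2.0, 2.0]]
b = [3.0]

def is_feasible(x):
    """Check Ax <= b row by row."""
    return all(sum(row[j] * x[j] for j in range(len(x))) <= bi
               for row, bi in zip(A, b))

def objective(x):
    return sum(cj * xj for cj, xj in zip(c, x))

def boolean_optimum():
    """Brute-force optimal value over x in {0, 1}^n (tiny n only)."""
    return min(objective(x) for x in product((0, 1), repeat=len(c)) if is_feasible(x))

def lagrangian_bound(mu):
    """-b^T mu + sum_i min(0, c_i + a_i^T mu), with a_i the i-th column of A."""
    total = -sum(bi * mi for bi, mi in zip(b, mu))
    for i in range(len(c)):
        reduced_cost = c[i] + sum(A[k][i] * mu[k] for k in range(len(mu)))
        total += min(0.0, reduced_cost)
    return total

p_bool = boolean_optimum()

# Grid search over mu in [0, 3]; mu is a scalar because A has one row.
best_bound = max(lagrangian_bound([k / 100]) for k in range(301))

# A feasible point of the LP relaxation (3) that attains the same value.
x_lp = [0.5, 1.0]
lp_value = objective(x_lp) if is_feasible(x_lp) else float("inf")

print(p_bool, best_bound, lp_value)
```

Here the Lagrangian bound and the LP relaxation point both give $-4$, so by weak duality both are optimal for their problems, while the Boolean optimum is $-3$. The bound is valid but not tight on this instance.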
