Post on 10-Jul-2020

LABORATOIRE d’ANALYSE et d’ARCHITECTURE des SYSTEMES

INTEGER PROGRAMMING, DUALITY and SUPERADDITIVE FUNCTIONS

Jean B. LASSERRE

IMA workshop, July 2005

The integer program

$$\max_x \,\{\, c'x \mid Ax = b;\ x \in \mathbb{N}^n \,\}$$

with $A = [A_1, \dots, A_n] \in \mathbb{Z}^{m\times n}$, $b \in \mathbb{Z}^m$ has a dual problem

$$D:\quad \min_{f\in\Gamma}\,\{\, f(b) \mid f(A_j) \ge c_j,\ j = 1,\dots,n \,\}$$

where $\Gamma$ is the space of functions $f:\mathbb{R}^m\to\mathbb{R}$ that are superadditive ($f(a+b) \ge f(a)+f(b)$), and with $f(0)=0$ (Jeroslow, Wolsey, etc.).

Elegant but rather abstract and not very practical; however, it is used to derive valid inequalities. Moreover, the celebrated Gomory cuts used in solvers (CPLEX, XPRESS-MP, ...) can be interpreted as such superadditive functions.

In addition, the optimization problem

$$D:\quad \min_{f\in\Gamma}\,\{\, f(b) \mid f(A_j) \ge c_j,\ j = 1,\dots,n \,\}$$

is too rich and is, in a sense, only a rephrasing of the problem. Indeed, the set $\Gamma$ contains in particular the optimal value function $f:\mathbb{Z}^m\to\mathbb{R}\cup\{-\infty\}$,

$$b \mapsto f(b) := \max_x \,\{\, c'x \mid Ax = b;\ x \in \mathbb{N}^n \,\}.$$

Similarly, the standard LP problem $\max_x \{\, c'x \mid Ax = b;\ x \ge 0 \,\}$ has an abstract dual problem

$$D_L:\quad \min_{f\in\Delta}\,\{\, f(b) \mid f(A_j) \ge c_j,\ j = 1,\dots,n \,\}$$

where $\Delta$ is now the set of functions $f:\mathbb{R}^m\to\mathbb{R}\cup\{-\infty\}$ that are concave (or even concave piecewise linear).

But the simpler linear program

$$D_L^*:\quad \min_{f\ \text{linear}}\,\{\, f(b) \mid f(A_j) \ge c_j,\ j = 1,\dots,n \,\}$$

that is,

$$D_L^*:\quad \min_{\lambda}\,\{\, b'\lambda \mid A_j'\lambda \ge c_j,\ j = 1,\dots,n \,\}$$

is a valid dual ... for a fixed value of $b$. It does not use concave piecewise linear functions (but does not contain the optimal value function $b \mapsto f(b) := \max_{x\ge 0}\{\, c'x \mid Ax = b \,\}$).

Hence, for integer programs, one should also obtain a dual simpler than the superadditive dual ....

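$$\begin{array}{ccc}
\text{CONTINUOUS OPTIM.} & \longleftrightarrow & \text{DISCRETE OPTIM.} \\[4pt]
f(b,c) := \max\,\{\, c'x \mid Ax \le b;\ x \in \mathbb{R}^n_+ \,\} & \longleftrightarrow & f_d(b,c) := \max\,\{\, c'x \mid Ax \le b;\ x \in \mathbb{N}^n \,\} \\[4pt]
\updownarrow & & \updownarrow \\[4pt]
\text{INTEGRATION} & \longleftrightarrow & \text{SUMMATION} \\[4pt]
\hat f(b,c) := \displaystyle\int_{\Omega} e^{c'x}\,dx & \longleftrightarrow & \hat f_d(b,c) := \displaystyle\sum_{x\in\Omega} e^{c'x} \\[4pt]
\Omega := \{\, x \in \mathbb{R}^n_+ \mid Ax \le b \,\} & & \Omega := \{\, x \in \mathbb{N}^n \mid Ax \le b \,\}
\end{array}$$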

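$$e^{f(b,c)} = \lim_{r\to\infty} \hat f(b, rc)^{1/r}; \qquad e^{f_d(b,c)} = \lim_{r\to\infty} \hat f_d(b, rc)^{1/r},$$

or, equivalently,

$$f(b,c) = \lim_{r\to\infty} \frac{1}{r}\,\ln \hat f(b, rc); \qquad f_d(b,c) = \lim_{r\to\infty} \frac{1}{r}\,\ln \hat f_d(b, rc).$$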
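The discrete limit above can be checked numerically on a toy instance. The sketch below is only an illustration: the data `A`, `b`, `c` and the helper names are assumptions made for this example, not taken from the slides. It enumerates the lattice points of a small polytope and evaluates $(1/r)\ln\sum_x e^{r c'x}$ for growing $r$.

```python
import math
from itertools import product

def log_sum_exp(values):
    """Numerically stable ln(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

# Hypothetical toy data (not from the slides): one constraint 2*x1 + 3*x2 <= 7.
A = [[2, 3]]
b = [7]
c = [3.0, 4.0]

# Enumerate the lattice points of Omega = {x in N^2 : A x <= b}.
points = [
    x for x in product(range(8), repeat=2)
    if all(sum(row[j] * x[j] for j in range(2)) <= bi for row, bi in zip(A, b))
]
objective = [sum(cj * xj for cj, xj in zip(c, x)) for x in points]
f_d = max(objective)  # optimal value of the integer program

def scaled_log_sum(r):
    """(1/r) * ln( sum_{x in Omega} exp(r * c'x) )."""
    return log_sum_exp([r * v for v in objective]) / r

for r in (1, 10, 100, 1000):
    print(r, scaled_log_sum(r))
```

The printed values decrease towards `f_d`: the gap is at most $\ln|\Omega|/r$, since the sum has $|\Omega|$ terms, each bounded by $e^{r f_d}$.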

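$$\begin{array}{ccc}
\text{CONTINUOUS OPTIM.} & \longleftrightarrow & \text{DISCRETE OPTIM.} \\[4pt]
\text{Legendre-Fenchel Duality} & \longleftrightarrow & \text{??} \\[4pt]
\updownarrow & & \updownarrow \\[4pt]
\text{INTEGRATION} & \longleftrightarrow & \text{SUMMATION} \\[4pt]
\text{Laplace-Transform Duality} & \longleftrightarrow & \text{Z-Transform Duality}
\end{array}$$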

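Legendre-Fenchel duality: $f:\mathbb{R}^n\to\mathbb{R}$ convex; $f^*:\mathbb{R}^n\to\mathbb{R}$.

$$\lambda \mapsto f^*(\lambda) = \mathcal{F}(f)(\lambda) := \sup_y\,\{\, \lambda'y - f(y) \,\}.$$

(One-sided) Laplace transform: $f:\mathbb{R}^n_+\to\mathbb{R}$; $F:\mathbb{C}^n\to\mathbb{C}$.

$$\lambda \mapsto F(\lambda) = \mathcal{L}(f)(\lambda) := \int_{\mathbb{R}^n_+} e^{-\lambda'y}\, f(y)\,dy.$$

(One-sided) Z-transform: $f:\mathbb{Z}^n_+\to\mathbb{R}$; $F:\mathbb{C}^n\to\mathbb{C}$.

$$z \mapsto F(z) = \mathcal{Z}(f)(z) := \sum_{m\in\mathbb{Z}^n_+} z^{-m} f(m).$$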

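with $\Omega := \{\, x \in \mathbb{R}^n \mid Ax \le y;\ x \ge 0 \,\}$

Fenchel duality:

$$f(y,c) := \max_{x\in\Omega} c'x, \qquad f^*(\lambda,c) := \min_y\,\{\, \lambda'y - f(y,c) \,\},$$

with $A'\lambda - c \ge 0$; $\lambda \ge 0$.

Laplace duality:

$$\hat f(y,c) := \int_{\Omega} e^{c'x}\,dx,$$

$$F(\lambda,c) := \int e^{-\lambda'y}\, \hat f(y,c)\,dy = \int_{x\ge 0} e^{c'x}\left[\int_{Ax\le y} e^{-\lambda'y}\,dy\right]dx = \frac{1}{\prod_{j=1}^{m}\lambda_j\ \prod_{k=1}^{n}(A'\lambda - c)_k},$$

with $\Re(A'\lambda - c) > 0$; $\Re(\lambda) > 0$.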

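In standard form, with $\Omega := \{\, x \in \mathbb{R}^n \mid Ax = b;\ x \ge 0 \,\}$,

$$F(\lambda,c) = \frac{1}{\prod_{k=1}^{n}(A'\lambda - c)_k}$$

and one retrieves $\hat f(y,c)$ by

$$\hat f(y,c) = \int_{\Gamma} e^{y'\lambda}\, F(\lambda,c)\,d\lambda = \int_{\Gamma} \frac{e^{y'\lambda}}{\prod_{j=1}^{n}(A'\lambda - c)_j}\,d\lambda.$$

Integration by Cauchy's residue techniques:

• The (multidimensional) poles $\lambda$ of $F$ solve $\pi_\sigma A_\sigma = c_\sigma$ for all bases $A_\sigma$ of the continuous LP, ... which yields ....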

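Brion and Vergne's continuous formula

Terminology of LP in standard form:

Let $A_\sigma := [A_{\sigma_1} | \dots | A_{\sigma_m}]$ be a basis of $\max\{\, c'x \mid Ax = b;\ x \ge 0 \,\}$, with $x(\sigma)$ the corresponding vertex, $\pi_\sigma := c_\sigma' A_\sigma^{-1}$ the associated dual variable and the reduced cost vector $c_k - \pi_\sigma A_k$, $k \notin \sigma$.

Then:

$$\hat f(b,c) = \sum_{x(\sigma):\ \text{vertex of } \Omega(b)} \frac{e^{c'x(\sigma)}}{\det(A_\sigma)\ \prod_{k\notin\sigma}(-c_k + \pi_\sigma A_k)}$$

from which it easily follows that

$$\ln\left[\lim_{r\to\infty} \hat f(b, rc)^{1/r}\right] = \max_{x(\sigma):\ \text{vertex of } \Omega(b)} c'x(\sigma).$$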
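As a sanity check of the continuous formula, consider the smallest nontrivial case $A = [1\ \ 1]$, so that $\Omega(b)$ is the segment from $(b,0)$ to $(0,b)$. The sketch below is an illustration under stated assumptions: the instance, the function names, and the choice of measuring the segment by the coordinate $x_1$ are all made up for this example. With these choices both bases have $\det(A_\sigma) = 1$ and $\pi_\sigma = c_\sigma$.

```python
import math

def brion_vergne_segment(b, c1, c2):
    """Sum over the two vertices (b,0) and (0,b) of e^{c'x} / prod(-c_k + pi*A_k)."""
    term_basis_1 = math.exp(c1 * b) / (-c2 + c1)  # sigma = {1}, pi = c1
    term_basis_2 = math.exp(c2 * b) / (-c1 + c2)  # sigma = {2}, pi = c2
    return term_basis_1 + term_basis_2

def midpoint_integral(b, c1, c2, steps=20_000):
    """Midpoint rule for the integral of e^{c'x} along x = (t, b - t), t in [0, b]."""
    h = b / steps
    return h * sum(
        math.exp(c1 * (i + 0.5) * h + c2 * (b - (i + 0.5) * h))
        for i in range(steps)
    )

b, c1, c2 = 2.0, 1.0, 0.5
closed_form = brion_vergne_segment(b, c1, c2)
numeric = midpoint_integral(b, c1, c2)
print(closed_form, numeric)

# The scaled logarithm approaches max over vertices of c'x = max(c1*b, c2*b).
r = 200
print(math.log(brion_vergne_segment(b, r * c1, r * c2)) / r, max(c1 * b, c2 * b))
```

The formula requires $c_1 \neq c_2$ here; when the reduced cost vanishes the individual terms blow up even though the integral itself stays finite.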

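Discrete Z-duality

Let $\Omega(y) := \{\, x \in \mathbb{R}^n \mid Ax = y;\ x \ge 0 \,\}$ and

$$\hat f_d(y,c) := \sum_{x\in\Omega(y)\cap\mathbb{Z}^n} e^{c'x}.$$

Then the generating function or Z-transform of $\hat f_d$ reads

$$z \mapsto F_d(z,c) := \sum_{y\in\mathbb{Z}^m} z^{-y}\, \hat f_d(y,c),$$

and, by simple algebra,

$$F_d(z,c) = \prod_{j=1}^{n} \frac{1}{1 - e^{c_j} z^{-A_j}} \quad \left( = \prod_{j=1}^{n} \frac{1}{1 - e^{c_j}\, z_1^{-A_{1j}} \cdots z_m^{-A_{mj}}} \right)$$

with $|z^{A_j}| > e^{c_j}$ for all $j = 1,\dots,n$.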
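The product form of the generating function can be checked by expanding it as a power series. The sketch below is an illustration for a single-constraint (knapsack) instance; the data `a`, `c` and the function names are assumptions made for this example. Writing $w = z^{-1}$, the coefficient of $w^y$ in $\prod_j 1/(1 - e^{c_j} w^{a_j})$ should equal $\sum e^{c'x}$ over all $x \in \mathbb{N}^n$ with $a'x = y$.

```python
import math
from itertools import product

def series_coefficients(a, c, y_max):
    """Coefficients of w^0..w^y_max in prod_j 1 / (1 - e^{c_j} w^{a_j})."""
    coeffs = [0.0] * (y_max + 1)
    coeffs[0] = 1.0
    for aj, cj in zip(a, c):
        # Dividing a series by (1 - t w^a) means new[y] = old[y] + t * new[y - a].
        for y in range(aj, y_max + 1):
            coeffs[y] += math.exp(cj) * coeffs[y - aj]
    return coeffs

def brute_force(a, c, y):
    """Sum of e^{c'x} over all x in N^n with a'x = y."""
    total = 0.0
    for x in product(*[range(y // aj + 1) for aj in a]):
        if sum(aj * xj for aj, xj in zip(a, x)) == y:
            total += math.exp(sum(cj * xj for cj, xj in zip(c, x)))
    return total

a = (2, 3)       # hypothetical knapsack row
c = (0.3, 0.7)   # hypothetical cost vector
coeffs = series_coefficients(a, c, 12)
for y in range(13):
    print(y, coeffs[y], brute_force(a, c, y))
```

Each factor is a geometric series in $e^{c_j} w^{a_j}$, which is the "simple algebra" behind the product formula; the convergence condition $|z^{A_j}| > e^{c_j}$ is what makes each geometric series summable.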

To compute $\hat f_d(y,c)$ it then suffices to invert the generating function, that is:

$$\hat f_d(y,c) = \int_{|z|=\gamma} z^{y-1}\, F_d(z,c)\,dz$$

which can be done by repeated application of Cauchy's residue theorem, which in turn requires computing the poles of the generating function $F_d$.

• The poles of $F_d$ are also associated with the bases of the LP $\max\{\, c'x \mid Ax = y,\ x \ge 0 \,\}$!!

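Let $\sigma := [A_{\sigma_1} | \cdots | A_{\sigma_m}]$ be a feasible basis of the LP $\max_x\{\, c'x \mid Ax = y,\ x \ge 0 \,\}$, with $\mu(\sigma) := \det(A_\sigma)$, and associated "dual" variables $\pi_\sigma$, which are solutions of $\pi A_\sigma = c_\sigma$.

Indeed, each basis $\sigma$ provides $\mu(\sigma)$ complex poles $z = e^{\lambda}$ in $\mathbb{C}^m$, which are solutions of the polynomial equations $e^{A_\sigma'\lambda} = e^{c_\sigma}$. (Compare with $\pi_\sigma A_\sigma = c_\sigma$.)

The $\mu(\sigma)$ solutions $z = e^{\lambda}$ are of the form

$$\lambda = \pi_\sigma + \frac{2i\pi\, v}{\mu(\sigma)}; \qquad v \in V_\sigma \subset \mathbb{Z}^m,$$

$$V_\sigma = \{\, v \in \mathbb{Z}^m \mid v'A_\sigma = 0 \bmod \mu(\sigma) \,\}.$$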
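The description of the poles can be illustrated on a small basis. In the sketch below the $2\times 2$ matrix `A_sigma` and the costs `c_sigma` are hypothetical values chosen for this example. It enumerates the representatives of $V_\sigma$ in $\{0,\dots,\mu(\sigma)-1\}^2$ and checks that each resulting $z = e^{\lambda}$ satisfies $z^{A_j} = e^{c_j}$ for the basic columns.

```python
import cmath
from itertools import product

# Hypothetical basis (columns are A_{sigma_1}, A_{sigma_2}) and basic costs.
A_sigma = [[2, 1],
           [0, 3]]
c_sigma = [1.0, 2.0]

mu = A_sigma[0][0] * A_sigma[1][1] - A_sigma[0][1] * A_sigma[1][0]  # det = 6

# Solve the row-vector system pi * A_sigma = c_sigma by Cramer's rule.
pi = [
    (c_sigma[0] * A_sigma[1][1] - c_sigma[1] * A_sigma[1][0]) / mu,
    (A_sigma[0][0] * c_sigma[1] - A_sigma[0][1] * c_sigma[0]) / mu,
]

# Representatives of V_sigma modulo mu: v' A_sigma = 0 (mod mu), column by column.
V = [
    v for v in product(range(mu), repeat=2)
    if all((v[0] * A_sigma[0][j] + v[1] * A_sigma[1][j]) % mu == 0 for j in range(2))
]

# For each v, form lambda = pi + 2*i*pi*v/mu and z = exp(lambda), then test the pole equations.
residuals = []
for v in V:
    lam = [pi[i] + 2j * cmath.pi * v[i] / mu for i in range(2)]
    z = [cmath.exp(value) for value in lam]
    for j in range(2):
        z_pow = z[0] ** A_sigma[0][j] * z[1] ** A_sigma[1][j]  # z^{A_j}
        residuals.append(abs(z_pow - cmath.exp(c_sigma[j])))

print(len(V), mu, max(residuals))
```

On this instance the number of representatives found equals $\mu(\sigma) = 6$, matching the count of poles stated above.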

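Brion and Vergne's discrete formula

Re-interpreted with these data, Brion and Vergne's original discrete formula reads (Lasserre)

$$\hat f_d(y,c) = \sum_{x(\sigma):\ \text{vertex of } \Omega(y)} \frac{e^{c'x(\sigma)}}{\mu(\sigma)} \sum_{v\in V_\sigma} \frac{e^{2i\pi v'y/\mu(\sigma)}}{\prod_{k\notin\sigma}\left(1 - e^{-2i\pi v'A_k/\mu(\sigma)}\, e^{(c_k - \pi_\sigma A_k)}\right)}$$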
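For a single equality constraint (a knapsack row) the discrete formula becomes fully explicit: each basis is one column $a_j$, with $\mu(\sigma) = a_j$, $x(\sigma)_j = y/a_j$, $\pi_\sigma = c_j/a_j$ and $V_\sigma$ represented by $\{0,\dots,a_j-1\}$. The sketch below is an illustration under these assumptions; the instance and function names are made up for this example, and the cost vector is taken generic so that no denominator vanishes. It compares the formula with direct enumeration.

```python
import cmath
import math
from itertools import product

def brion_vergne_knapsack(a, c, y):
    """Discrete Brion-Vergne sum for {x in N^n : a'x = y} (single constraint, m = 1)."""
    n = len(a)
    total = 0j
    for j in range(n):                      # basis sigma = {j}
        mu = a[j]                           # mu(sigma) = det(A_sigma)
        pi = c[j] / a[j]                    # dual variable pi_sigma
        vertex_term = math.exp(c[j] * y / a[j])   # e^{c'x(sigma)}
        inner = 0j
        for v in range(mu):                 # representatives of V_sigma
            numerator = cmath.exp(2j * cmath.pi * v * y / mu)
            denominator = 1 + 0j
            for k in range(n):
                if k != j:
                    phase = cmath.exp(-2j * cmath.pi * v * a[k] / mu)
                    denominator *= 1 - phase * math.exp(c[k] - pi * a[k])
            inner += numerator / denominator
        total += vertex_term * inner / mu
    return total

def brute_force(a, c, y):
    """Sum of e^{c'x} over all x in N^n with a'x = y."""
    total = 0.0
    for x in product(*[range(y // aj + 1) for aj in a]):
        if sum(aj * xj for aj, xj in zip(a, x)) == y:
            total += math.exp(sum(cj * xj for cj, xj in zip(c, x)))
    return total

a = (2, 3)       # hypothetical knapsack row
c = (0.3, 0.7)   # hypothetical, generic cost vector
for y in range(13):
    value = brion_vergne_knapsack(a, c, y)
    print(y, value.real, brute_force(a, c, y))
```

The individual basis terms are complex and can be large with opposite signs; only their sum is the real, nonnegative count-like quantity. For instance the formula returns zero at $y = 1$, where $2x_1 + 3x_2 = 1$ has no solution in $\mathbb{N}^2$.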

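Back to optimization: $f_d(b,c) = \max\{\, c'x \mid Ax = b;\ x \in \mathbb{N}^n \,\}$.

Theorem. Assume that

$$\max_{\sigma}\ e^{c'x(\sigma)} \times \lim_{r\to\infty}\left[\sum_{v\in V_\sigma} \frac{e^{2i\pi v'b/\mu(\sigma)}}{\prod_{k\notin\sigma}\left(1 - e^{-2i\pi v'A_k/\mu(\sigma)}\, e^{r(c_k - \pi_\sigma A_k)}\right)}\right]^{1/r}$$

is attained at a unique basis $\sigma^*$. Then:

$$f_d(b,c) = c'x(\sigma^*) + \sum_{k\notin\sigma^*}(c_k - \pi_{\sigma^*}A_k)\,x_k^* = \sum_{j} c_j\, x_j^*.$$

$\sigma^*$ is an optimal basis of the linear program, and $x(\sigma^*)$ (resp. $x^*$) is an optimal solution of the linear (resp. integer) program.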

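In this case:

$$f_d(b,c) = c'x(\sigma^*) + \max\left\{\ \sum_{k\notin\sigma^*}(c_k - \pi_{\sigma^*}A_k)\,x_k \ \middle|\ A_{\sigma^*}u + \sum_{k\notin\sigma^*}A_k\,x_k = b;\ \ u \in \mathbb{Z}^m;\ x_k \in \mathbb{N}\ \ \forall k \notin \sigma^* \right\}$$

$f_d(b,c) = c'x(\sigma^*) + \rho$ := opt. value of GOMORY relaxation!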
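The Gomory relaxation above can be solved by brute force on a small knapsack row, which also shows what "not exact" means. The sketch below is an illustration: the instance and the function names are assumptions made for this example, and it relies on the fact that for a knapsack row with positive data the optimal LP basis is the column with the largest ratio $c_j/a_j$.

```python
from fractions import Fraction
from itertools import product

def knapsack_ip(a, c, b):
    """Brute-force max c'x s.t. a'x = b, x in N^n. Returns (value, x) or None."""
    best = None
    for x in product(*[range(b // aj + 1) for aj in a]):
        if sum(aj * xj for aj, xj in zip(a, x)) == b:
            value = sum(cj * xj for cj, xj in zip(c, x))
            if best is None or value > best[0]:
                best = (value, x)
    return best

def gomory_relaxation(a, c, b):
    """Relax the basic variable to u in Z; keep nonbasic x_k in N."""
    n = len(a)
    j = max(range(n), key=lambda i: Fraction(c[i], a[i]))  # optimal LP basis
    pi = Fraction(c[j], a[j])                              # dual variable
    others = [k for k in range(n) if k != j]
    best = None
    # Reduced costs are <= 0 and shifting x_k by a_j is absorbed by u,
    # so it suffices to search x_k in {0, ..., a_j - 1}.
    for xs in product(range(a[j]), repeat=len(others)):
        rest = b - sum(a[k] * x for k, x in zip(others, xs))
        if rest % a[j] == 0:
            u = rest // a[j]
            rho = sum((c[k] - pi * a[k]) * x for k, x in zip(others, xs))
            if best is None or rho > best[0]:
                best = (rho, u, xs)
    if best is None:
        return None
    return pi * b + best[0], best[1], best[2]   # (c'x(sigma*) + rho, u, nonbasic x)

a, c = (2, 3), (3, 4)          # hypothetical data
print(knapsack_ip(a, c, 7), gomory_relaxation(a, c, 7))
print(knapsack_ip(a, c, 1), gomory_relaxation(a, c, 1))
```

For $b = 7$ the relaxation returns $u = 2 \ge 0$, so its solution is feasible for the integer program and the two values coincide. For $b = 1$ the relaxation is solved with $u = -1$, while the integer program itself is infeasible: the relaxation is not exact there.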

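Computing

$$\lim_{r\to\infty}\left[\sum_{v\in V_\sigma} \frac{e^{2i\pi v'b/\mu(\sigma)}}{\prod_{k\notin\sigma}\left(1 - e^{-2i\pi v'A_k/\mu(\sigma)}\, e^{r(c_k - \pi_\sigma A_k)}\right)}\right]^{1/r}$$

is the same as detecting the leading term $u^{\rho}$ of the rational fraction

$$u \mapsto \sum_{v\in V_\sigma} \frac{e^{2i\pi v'b/\mu(\sigma)}}{\prod_{k\notin\sigma}\left(1 - e^{-2i\pi v'A_k/\mu(\sigma)}\, u^{(c_k - \pi_\sigma A_k)}\right)}$$

as $u \to \infty$.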

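With $u := e^{r}$, $\hat f_d(y, rc)$ is the rational fraction

$$u \mapsto \sum_{x(\sigma):\ \text{vertex of } \Omega(y)} \frac{u^{c'x(\sigma)}}{\mu(\sigma)} \sum_{v\in V_\sigma} \frac{e^{2i\pi v'y/\mu(\sigma)}}{\prod_{k\notin\sigma}\left(1 - e^{-2i\pi v'A_k/\mu(\sigma)}\, u^{(c_k - \pi_\sigma A_k)}\right)} = \sum_{\text{bases } \sigma \text{ of } \Omega(y)} g_\sigma(u).$$

When the Gomory relaxation is not exact, the leading term as $u \to \infty$ is not unique and cancellations occur.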

A Discrete Farkas Lemma

Let $A \in \mathbb{N}^{m\times n}$, $b \in \mathbb{N}^m$ and consider the problem of deciding whether or not $Ax = b$ has a solution $x \in \mathbb{N}^n$, or, equivalently, deciding whether or not $\hat f_d(b,c) \ge 1$.

Theorem: (i) $Ax = b$ has a solution $x \in \mathbb{N}^n$ if and only if the polynomial $z \mapsto z^b - 1$ in $\mathbb{R}[z_1,\dots,z_m]$ can be written

$$z^b - 1 = \sum_{j=1}^{n} Q_j(z)\,(z^{A_j} - 1) \quad \left( = \sum_{j=1}^{n} Q_j(z)\,(z_1^{A_{1j}} \cdots z_m^{A_{mj}} - 1) \right)$$

for some polynomials $z \mapsto Q_j(z)$ with nonnegative coefficients.

(ii) The degree of the $Q_j$'s is bounded by

$$b^* := \sum_{j=1}^{m} b_j - \min_k \sum_{j=1}^{m} A_{jk}.$$
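One direction of part (i) is constructive: from a solution $x \in \mathbb{N}^n$ one can build the $Q_j$ by walking from $0$ to $b$, adding one column at a time, because $z^{s + A_j} - z^{s} = z^{s}(z^{A_j} - 1)$. The sketch below is an illustration of that telescoping argument; the matrix `A`, the solution `x`, and the function names are assumptions made for this example. Polynomials are stored as dictionaries from exponent tuples to integer coefficients.

```python
from collections import defaultdict

def farkas_certificate(A, x):
    """Build Q_1..Q_n from a solution x by telescoping; also return b = A x."""
    m, n = len(A), len(A[0])
    Q = [defaultdict(int) for _ in range(n)]
    current = (0,) * m                       # exponent reached so far
    for j in range(n):
        column = tuple(A[i][j] for i in range(m))
        for _ in range(x[j]):
            Q[j][current] += 1               # monomial z^current enters Q_j
            current = tuple(s + aj for s, aj in zip(current, column))
    return Q, current

def expand(A, Q):
    """Expand sum_j Q_j(z) * (z^{A_j} - 1) into {exponent: coefficient}."""
    m, n = len(A), len(A[0])
    out = defaultdict(int)
    for j in range(n):
        column = tuple(A[i][j] for i in range(m))
        for alpha, coef in Q[j].items():
            out[tuple(s + aj for s, aj in zip(alpha, column))] += coef
            out[alpha] -= coef
    return {mono: coef for mono, coef in out.items() if coef != 0}

A = [[1, 2, 0],
     [0, 1, 3]]          # hypothetical nonnegative integer matrix
x = (1, 2, 1)            # hypothetical solution, so b = A x = (5, 5)
Q, b = farkas_certificate(A, x)
print(b, expand(A, Q))
print([sum(Qj.values()) for Qj in Q])   # Q_j(1, ..., 1)
```

All coefficients of the $Q_j$ built this way are nonnegative integers, and evaluating $Q_j$ at $(1,\dots,1)$ returns $x_j$. The converse direction of the theorem is not established by this sketch.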

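A single LP to solve with $n \times \binom{b^*+m}{b^*}$ variables, $\binom{b^*+m}{m}$ constraints and a (sparse) matrix of coefficients in $\{0, \pm 1\}$.

One also retrieves the classical Farkas Lemma in $\mathbb{R}^n$, that is,

$$\{\, x \in \mathbb{R}^n \mid Ax = b;\ x \ge 0 \,\} \neq \emptyset \iff \left[\, A'u \ge 0 \Rightarrow b'u \ge 0 \,\right].$$

Indeed, if $Ax = b$ has a solution $x \in \mathbb{N}^n$, then with $u = \ln z$,

$$e^{b'u} - 1 = \sum_{j=1}^{n} Q_j(e^{u_1},\dots,e^{u_m})\,(e^{(A'u)_j} - 1).$$

Therefore,

$$A'u \ge 0 \ \Rightarrow\ e^{(A'u)_j} - 1 \ge 0 \ \Rightarrow\ e^{b'u} - 1 \ge 0 \ \Rightarrow\ b'u \ge 0,$$

and so $Ax = b$ has a nonnegative solution $x \in \mathbb{R}^n_+$.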

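The general case $A \in \mathbb{Z}^{m\times n}$, $b \in \mathbb{Z}^m$.

Let $\Omega := \{\, x \in \mathbb{R}^n \mid Ax = b;\ x \ge 0 \,\}$ be a polytope.

Let $\alpha \in \mathbb{N}^n$ be such that for every column $A_j$ of $A$, $A_{kj} + \alpha_j \ge 0$ for all $k = 1,\dots,m$; let $\mathbb{N} \ni \beta \ge \rho(\alpha) := \max\{\, \alpha'x \mid x \in \Omega \,\}$.

Theorem: (i) $Ax = b$ has a solution $x \in \mathbb{N}^n$ if and only if the polynomial $(z,y) \mapsto z^b (zy)^{\beta} - 1$ in $\mathbb{R}[z_1,\dots,z_m,y]$ can be written

$$z^b (zy)^{\beta} - 1 = Q_0(z,y)\,(zy - 1) + \sum_{j=1}^{n} Q_j(z,y)\,(z^{A_j}(zy)^{\alpha_j} - 1),$$

for some polynomials $Q_j(z,y)$, all with nonnegative coefficients.

(ii) The degree of the $Q_j$'s is bounded by $b^* := (m+1)\beta + \sum_{j=1}^{m} b_j$.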

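Back to standard Farkas lemma

$$\{\, x \in \mathbb{R}^n \mid Ax = b;\ x \ge 0 \,\} \neq \emptyset \iff \left[\, A'\lambda \ge 0 \Rightarrow b'\lambda \ge 0 \,\right]$$

But, equivalently, $\{\, x \in \mathbb{R}^n \mid Ax = b,\ x \ge 0 \,\} \neq \emptyset$ if and only if the polynomial $\lambda \mapsto b'\lambda$ can be written

$$b'\lambda = \sum_{j=1}^{n} Q_j(\lambda)\,(A'\lambda)_j,$$

for some polynomials $\{Q_j\} \subset \mathbb{R}[\lambda_1,\dots,\lambda_m]$, all with nonnegative coefficients.

In this case, each $Q_j$ is necessarily a constant, that is, $Q_j \equiv Q_j(0) = x_j \ge 0$, and $Ax = b$!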

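$$\begin{array}{l|l}
P = \{\, x \in \mathbb{R}^n \mid Ax = b,\ x \ge 0 \,\} & P \cap \mathbb{Z}^n \\ \hline
x \in P & x \in \text{integer hull}(P) \\
\iff x = Q(0,\dots,0) \text{ with} & \iff x = Q(1,\dots,1) \text{ with} \\
Q \in \mathbb{R}[\lambda_1,\dots,\lambda_m] & Q \in \mathbb{R}[e^{\lambda_1},\dots,e^{\lambda_m}] \\
b'\lambda = \langle Q,\ A'\lambda \rangle & e^{b'\lambda} - 1 = \langle Q,\ e^{A'\lambda} - 1_n \rangle \\
Q \succeq 0 & Q \succeq 0
\end{array}$$

Comparing continuous and discrete Farkas lemma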

An equivalent linear program

Let $0 \le q = \{q_{j\alpha}\} \in \mathbb{R}^{ns}$ be the coefficients of the $Q_j$'s in

$$z^b - 1 = \sum_{j=1}^{n} Q_j(z)\,(z^{A_j} - 1).$$

They are solutions of a linear system

$$M q = r, \qquad q \ge 0$$

for some matrix $M$ and vector $r$, both with $\{0, \pm 1\}$ coefficients.

** $M$ and $r$ are easily obtained from $A$, $b$ with no computation.

Write $q = (q_1, q_2, \dots, q_n)$ with each $q_j = \{q_{j\alpha}\} \in \mathbb{R}^s$, and let $c_{j\alpha} := c_j$ for all $\alpha$.
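To make "obtained with no computation" concrete, the sketch below writes down $M$ and $r$ for a single knapsack row by matching coefficients of each monomial $z^{\beta}$. It is an illustration under stated assumptions: the instance and function names are made up for this example, and the columns are restricted to pairs $(j,\alpha)$ with $\alpha + a_j \le b$, which is an indexing choice for this sketch rather than the slides' exact dimension $ns$.

```python
def build_system(a, b):
    """M and r for z^b - 1 = sum_j Q_j(z) (z^{a_j} - 1), one row per monomial z^beta."""
    columns = [(j, alpha) for j, aj in enumerate(a) for alpha in range(b - aj + 1)]
    M = [[0] * len(columns) for _ in range(b + 1)]
    for col, (j, alpha) in enumerate(columns):
        M[alpha + a[j]][col] += 1   # z^alpha * z^{a_j} contributes +1 to row alpha + a_j
        M[alpha][col] -= 1          # z^alpha * (-1) contributes -1 to row alpha
    r = [0] * (b + 1)
    r[b] += 1                       # coefficient of z^b on the left-hand side
    r[0] -= 1                       # constant term -1
    return M, r, columns

def q_from_solution(a, x, columns):
    """Coefficients q_{j,alpha} of the telescoping certificate built from x."""
    q = {col: 0 for col in columns}
    current = 0
    for j, xj in enumerate(x):
        for _ in range(xj):
            q[(j, current)] += 1
            current += a[j]
    return [q[col] for col in columns]

a, c, b = (2, 3), (3, 4), 7      # hypothetical knapsack data
M, r, columns = build_system(a, b)
q = q_from_solution(a, (2, 1), columns)   # x = (2, 1) solves 2*x1 + 3*x2 = 7
Mq = [sum(row[k] * q[k] for k in range(len(q))) for row in M]
lp_objective = sum(c[j] * q[k] for k, (j, _) in enumerate(columns))
print(Mq == r, lp_objective)
```

Here every entry of $M$ is $0$ or $\pm 1$, and the objective $\sum_{j,\alpha} c_j q_{j\alpha}$ evaluated at this feasible $q$ equals $c'x$ for the underlying solution $x$.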

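Theorem: Let $A \in \mathbb{N}^{m\times n}$, $b \in \mathbb{N}^m$, $c \in \mathbb{R}^n$.

(i) The integer program $P \to \max_x\{\, c'x \mid Ax = b,\ x \in \mathbb{N}^n \,\}$ has the same value as the linear program

$$Q \to \max_q\,\Big\{\, \sum_{j=1}^{n} c_j'\, q_j \ \Big|\ M q = r;\ q \ge 0 \,\Big\}.$$

(ii) Let $q^*$ be an optimal vertex, and let

$$x_j^* := \sum_{\alpha} q^*_{j\alpha} = Q_j(1), \qquad j = 1,\dots,n.$$

Then $x^* \in \mathbb{N}^n$ and $x^*$ is an optimal solution of $P$.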

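The link with superadditive functions

The LP-dual $Q^*$ of the linear program $Q$ reads

$$Q^* \to \min_{\pi}\,\{\, \pi'r \mid M'\pi \ge c \,\}.$$

More precisely, with $D := \prod_{j=1}^{m}\{0, 1, \dots, b_j\} \subset \mathbb{N}^m$,

$$Q^* \to \begin{cases} \min_{\pi} & \pi(b) - \pi(0) \\ \text{s.t.} & \pi(A_j + \alpha) - \pi(\alpha) \ge c_j, \quad \alpha + A_j \in D,\ j = 1,\dots,n. \end{cases}$$

Let $\Pi := \{\, \pi : D \to \mathbb{R} \,\}$, and for every $\pi \in \Pi$, let $f_\pi : D \to \mathbb{R}$ be the function

$$f_\pi(x) := \inf_{x+\alpha\in D}\,\{\, \pi(x+\alpha) - \pi(\alpha) \,\}, \qquad x \in D.$$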

For every $\pi \in \Pi$, the function $f_\pi$ is superadditive and $f_\pi(0) = 0$.

With $Q^*$ one may then associate the optimization problem

$$S^* \to \begin{cases} \min_{\pi\in\Pi} & f_\pi(b) \\ \text{s.t.} & f_\pi(A_j) \ge c_j, \quad j = 1,\dots,n. \end{cases}$$

Thus, $S^*$ is a simplified and explicit form of the abstract superadditive dual, as we only consider finite superadditive functions $f : D \to \mathbb{R}$, instead of $f : \mathbb{N}^m \to \mathbb{R}\cup\{-\infty\}$.

It is the analogue for IP with $Ax = b$ of Wolsey's dual for IP with $Ax \le b$. Note that $Q^*$ is simpler than $S^*$.
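The properties of $f_\pi$ used here can be checked on a one-dimensional domain $D = \{0,\dots,b\}$. The sketch below is an illustration: the random integer-valued $\pi$, the value of `b`, and the function name are assumptions made for this example. It builds $f_\pi(x) = \min_{\alpha:\ x+\alpha \in D}\{\pi(x+\alpha) - \pi(\alpha)\}$ for an arbitrary $\pi$ and tests superadditivity on $D$.

```python
import random

def f_pi(pi, x):
    """f_pi(x) = min over alpha with x + alpha in D of pi(x + alpha) - pi(alpha)."""
    b = len(pi) - 1
    return min(pi[x + alpha] - pi[alpha] for alpha in range(b - x + 1))

random.seed(0)
b = 12
# Arbitrary integer-valued pi on D = {0, ..., b}; no structure is imposed on it.
pi = [random.randint(-20, 20) for _ in range(b + 1)]
f = [f_pi(pi, x) for x in range(b + 1)]

superadditive = all(
    f[x + y] >= f[x] + f[y]
    for x in range(b + 1)
    for y in range(b + 1 - x)
)
print(f[0], superadditive, f[b] == pi[b] - pi[0])
```

The reason is a one-line splitting: for $x + y + \alpha \in D$, $\pi(x+y+\alpha) - \pi(\alpha) = [\pi(x + (y+\alpha)) - \pi(y+\alpha)] + [\pi(y+\alpha) - \pi(\alpha)] \ge f_\pi(x) + f_\pi(y)$. Note also that $f_\pi(b) = \pi(b) - \pi(0)$, since $\alpha = 0$ is the only admissible shift, and that the constraints of $Q^*$ for column $j$ say exactly $f_\pi(A_j) \ge c_j$.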

Moreover, $Q^*$ is simpler than $S^*$ because in the LP $Q^*$ the function $\pi : D \to \mathbb{R}$ is NOT required to be superadditive!! In $S^*$ one has to write the $O(|D|^2)$ superadditivity constraints

$$\pi(x+\alpha) - \pi(\alpha) \ge \pi(x), \qquad x, \alpha \in D, \text{ with } x + \alpha \in D,$$

and the $n$ additional constraints $\pi(A_j) \ge c_j$, $j = 1,\dots,n$.

On the other hand, in $Q^*$ one only has the $n\,O(|D|)$ constraints

$$\pi(A_j + \alpha) - \pi(\alpha) \ge c_j, \qquad \alpha \in D, \text{ with } A_j + \alpha \in D.$$

For instance, in the unbounded knapsack problem

$$\max_x\,\Big\{\, c'x \ \Big|\ \sum_{j=1}^{n} a_j x_j = b;\ x \in \mathbb{N}^n \,\Big\}$$

one has $n + b^2/2$ constraints in $S^*$, and $nb - \sum_{j=1}^{n} a_j$ in $Q^*$.

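With $P = \{\, x \in \mathbb{R}^n_+ \mid Ax = b \,\}$, the integer hull $\mathrm{co}(P \cap \mathbb{Z}^n)$ reads

$$\mathrm{co}(P \cap \mathbb{Z}^n) = \Big\{\, x \in \mathbb{R}^n \ \Big|\ -\sum_{j=1}^{n} \lambda_j^*\, x_j \le b'\pi^* \,\Big\},$$

for finitely many $(\pi^*, \lambda^*)$, generators of the convex cone $\Omega \subset \mathbb{R}^{|D|} \times \mathbb{R}^n$,

$$\Omega := \{\, (\pi, \lambda) \in \mathbb{R}^{|D|} \times \mathbb{R}^n \mid \pi(\alpha + A_j) - \pi(\alpha) + \lambda_j \ge 0,\ \ \alpha + A_j \in D,\ j = 1,\dots,n \,\}.$$

Equivalently, again with

$$x \mapsto f_{\pi^*}(x) := \inf_{\alpha + x \in D}\,\{\, \pi^*(x+\alpha) - \pi^*(\alpha) \,\}, \qquad x \in D,$$

$$\mathrm{co}(P \cap \mathbb{Z}^n) = \Big\{\, x \in \mathbb{R}^n \ \Big|\ \sum_{j=1}^{n} f_{\pi^*}(A_j)\, x_j \le f_{\pi^*}(b) \,\Big\}.$$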

CONCLUSION

Generating functions make it possible to exhibit a natural duality for integer programming, an IP-analogue of LP duality.

This duality makes it possible to simplify the abstract superadditive dual, and so might help provide efficient Gomory cuts in MIP solvers like CPLEX or XPRESS-MP.

Lasserre J.B. (2005). Integer programming, duality and superadditive functions. Cont. Math. 474, pp. 138-150.

Lasserre J.B. (2004). Integer programming duality. In: Trends in Optimization. Proc. Symp. Appl. Math. 61, pp. 67-83.

Lasserre J.B. (2004). The integer hull of a convex rational polytope. Discr. Comp. Geom. 32, pp. 129-139.

Lasserre J.B. (2004). Generating functions and duality for integer programs. Discrete Optim. 1, pp. 167-187.

Lasserre J.B. (2004). A discrete Farkas lemma. Discrete Optimization 1, pp. 67-75.

Another dual problem

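Let $A \in \mathbb{Z}^{m\times n}$, $b \in \mathbb{Z}^m$, $c \in \mathbb{R}^n$. Let $y \mapsto f(y,c) = \max\{\, c'x \mid Ax = y;\ x \ge 0 \,\}$. The Fenchel transform $(-f)^*$ of the convex function $-f(\cdot,c)$ is the convex function

$$\lambda \mapsto (-f)^*(\lambda,c) = \sup_{y\in\mathbb{R}^m}\,\{\, \lambda'y + f(y,c) \,\}.$$

The dual problem of the linear program is obtained from Fenchel duality as

$$f(b,c) = \inf_{\lambda\in\mathbb{R}^m}\,\{\, b'\lambda + (-f)^*(-\lambda,c) \,\} = \inf_{\lambda\in\mathbb{R}^m}\,\Big\{\, b'\lambda + \sup_{x\in\mathbb{R}^n_+}(c - A'\lambda)'x \,\Big\} = \min_{\lambda}\,\{\, b'\lambda \mid A'\lambda \ge c \,\}.$$

Equivalently,

$$e^{f(b,c)} = \inf_{\lambda\in\mathbb{R}^m}\ \sup_{x\in\mathbb{R}^n_+}\ e^{(b - Ax)'\lambda}\, e^{c'x}.$$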

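Define

$$\rho^* := \inf_{z\in\mathbb{C}^m}\ \sup_{x\in\mathbb{N}^n}\ \Re\left(z^{b - Ax}\, e^{c'x}\right) = \inf_{z\in\mathbb{C}^m} f_d^*(z,c).$$

Hence,

$$f_d^*(z,c) = \sup_{x\in\mathbb{N}^n}\ \Re\Big(z^{b}\prod_{j=1}^{n}\left(z^{-A_j} e^{c_j}\right)^{x_j}\Big) < \infty \quad \text{if } |z^{A_j}| \ge e^{c_j}\ \ \forall j$$

(that is, $A'\ln|z| \ge c$). Next, (writing $z \in \mathbb{C}$ as $e^{\lambda}e^{i\theta}$)

$$\rho^* \le \inf_{z\in\mathbb{R}^m}\ \sup_{x\in\mathbb{R}^n_+}\ \Re\left(z^{b - Ax}\, e^{c'x}\right) = \inf_{\lambda\in\mathbb{R}^m}\ \sup_{x\in\mathbb{R}^n_+}\ e^{(b - Ax)'\lambda}\, e^{c'x} = e^{f(b,c)}.$$

Finally, with $z \in \mathbb{C}^m$ arbitrary but fixed,

$$\sup_{x\in\mathbb{N}^n}\ \Re\left(z^{b - Ax}\, e^{c'x}\right) \ge e^{c'x^*} = e^{f_d(b,c)}.$$

Hence $f_d(b,c) \le \ln\rho^* \le f(b,c)$.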

Theorem: Let $\sigma^*$ be an optimal basis of the linear program. Under uniqueness of the "$\max_\sigma$" in Brion and Vergne's formula, and an additional technical condition,

$$e^{f_d(b,c)} = \rho^* = \max_{x\in\mathbb{N}^n}\ \Re\left(z^{b - Ax}\, e^{c'x}\right) = f_d^*(z,c)$$

where $z^{A_j} = \gamma\, e^{c_j}$ for all $j \in \sigma^*$, for some real $\gamma > 1$.

$z$ is an optimal solution of the dual problem

$$\inf_{z\in\mathbb{C}^m} f_d^*(z,c).$$
