LABORATOIRE d’ANALYSE et d’ARCHITECTURE des SYSTEMES
INTEGER PROGRAMMING, DUALITY and
SUPERADDITIVE FUNCTIONS
Jean B. LASSERRE
IMA workshop, July 2005
The integer program
\[ \max_x \{\, c'x \mid Ax = b;\ x \in \mathbb{N}^n \,\} \]
with $A = [A_1, \ldots, A_n] \in \mathbb{Z}^{m\times n}$, $b \in \mathbb{Z}^m$, has a dual problem
\[ D:\ \min_{f\in\Gamma} \{\, f(b) \mid f(A_j) \ge c_j,\ j = 1,\ldots,n \,\} \]
where $\Gamma$ is the space of functions $f : \mathbb{R}^m \to \mathbb{R}$ that are superadditive ($f(a+b) \ge f(a) + f(b)$) and satisfy $f(0) = 0$ (Jeroslow, Wolsey, etc.).
Elegant, but rather abstract and not very practical; however, it is used to derive valid inequalities. Moreover, the celebrated Gomory cuts used in solvers (CPLEX, XPRESS-MP, ...) can be interpreted as such superadditive functions.
In addition, the optimization problem
\[ D:\ \min_{f\in\Gamma} \{\, f(b) \mid f(A_j) \ge c_j,\ j = 1,\ldots,n \,\} \]
is too rich and is, in a sense, only a rephrasing of the problem. Indeed, the set $\Gamma$ contains in particular the optimal value function $f : \mathbb{Z}^m \to \mathbb{R}\cup\{-\infty\}$,
\[ b \mapsto f(b) := \max_x \{\, c'x \mid Ax = b;\ x\in\mathbb{N}^n \,\}. \]
Similarly, the standard LP problem $\max_x \{ c'x \mid Ax = b;\ x \ge 0 \}$ has an abstract dual problem
\[ D_L:\ \min_{f\in\Delta} \{\, f(b) \mid f(A_j) \ge c_j,\ j = 1,\ldots,n \,\} \]
where $\Delta$ is now the set of functions $f : \mathbb{R}^m \to \mathbb{R}\cup\{-\infty\}$ that are concave (or even concave piecewise linear).
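The claim that the optimal value function of the integer program is itself a feasible superadditive function can be checked numerically. The sketch below assumes a single constraint with hypothetical data `a = [3, 5]`, `c = [4, 6]` (not taken from the slides), tabulates the value function by dynamic programming, and tests superadditivity and dual feasibility on a finite range.

```python
def ip_value(a, c, b_max):
    """f[b] = max{c'x : a'x = b, x in N^n} for b = 0..b_max (-inf if infeasible)."""
    NEG = float("-inf")
    f = [NEG] * (b_max + 1)
    f[0] = 0.0
    for b in range(1, b_max + 1):
        for aj, cj in zip(a, c):
            if aj <= b and f[b - aj] > NEG:
                f[b] = max(f[b], f[b - aj] + cj)
    return f

a, c = [3, 5], [4.0, 6.0]      # hypothetical single-constraint data
B = 30
f = ip_value(a, c, B)

# Superadditivity f(u + v) >= f(u) + f(v), checked wherever u + v <= B.
superadditive = all(
    f[u + v] >= f[u] + f[v]
    for u in range(B + 1) for v in range(B + 1 - u)
)
# Dual feasibility f(A_j) >= c_j for every column.
dual_feasible = all(f[aj] >= cj for aj, cj in zip(a, c))
print(superadditive, dual_feasible)
```

Right-hand sides that cannot be reached get the value `-inf`, which is why the value function takes values in $\mathbb{R}\cup\{-\infty\}$.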
But the simpler linear program
\[ D_L^*:\ \min_{f\ \text{linear}} \{\, f(b) \mid f(A_j) \ge c_j,\ j = 1,\ldots,n \,\}, \]
that is,
\[ D_L^*:\ \min_{\lambda} \{\, b'\lambda \mid A_j'\lambda \ge c_j,\ j = 1,\ldots,n \,\}, \]
is a valid dual ... for a fixed value of $b$. It does not use concave piecewise linear functions (but does not contain the optimal value function $b \mapsto f(b) := \max_{x\ge 0} \{ c'x \mid Ax = b \}$).
Hence, for integer programs, one should also obtain a dual simpler than the superadditive dual ...
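CONTINUOUS OPTIM.   ←→   DISCRETE OPTIM.
$f(b,c) := \max \{ c'x \mid Ax \le b,\ x \in \mathbb{R}^n_+ \}$   ←→   $f_d(b,c) := \max \{ c'x \mid Ax \le b,\ x \in \mathbb{N}^n \}$
        ↕                         ↕
INTEGRATION   ←→   SUMMATION
$\hat f(b,c) := \int_{\Omega} e^{c'x}\,dx$, with $\Omega := \{ x \in \mathbb{R}^n_+ \mid Ax \le b \}$   ←→   $\hat f_d(b,c) := \sum_{x\in\Omega} e^{c'x}$, with $\Omega := \{ x \in \mathbb{N}^n \mid Ax \le b \}$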
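With $\hat f(b,c)$ and $\hat f_d(b,c)$ denoting the integral and the sum of $e^{c'x}$ over the continuous and discrete feasible sets, respectively:
\[ e^{f(b,c)} = \lim_{r\to\infty} \hat f(b, rc)^{1/r}; \qquad e^{f_d(b,c)} = \lim_{r\to\infty} \hat f_d(b, rc)^{1/r}, \]
or, equivalently,
\[ f(b,c) = \lim_{r\to\infty} \frac{1}{r} \ln \hat f(b, rc); \qquad f_d(b,c) = \lim_{r\to\infty} \frac{1}{r} \ln \hat f_d(b, rc). \]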
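The discrete limit can be observed numerically. The following sketch uses a hypothetical two-variable instance with one equality constraint (the slides state the relation for a general feasible set; the data and helper names here are illustrative assumptions) and evaluates $(1/r)\ln\sum_x e^{r c'x}$ for increasing $r$.

```python
import math

def solutions(a, b):
    """All x in N^2 with a[0]*x0 + a[1]*x1 = b (single equality constraint)."""
    return [(x0, (b - a[0] * x0) // a[1])
            for x0 in range(b // a[0] + 1)
            if (b - a[0] * x0) % a[1] == 0]

def scaled_log_sum(a, c, b, r):
    """(1/r) * ln sum_x exp(r * c'x), computed stably via log-sum-exp."""
    vals = [r * (c[0] * x0 + c[1] * x1) for x0, x1 in solutions(a, b)]
    top = max(vals)
    return (top + math.log(sum(math.exp(v - top) for v in vals))) / r

a, c, b = (2, 3), (1.0, 1.2), 12       # hypothetical data
f_d = max(c[0] * x0 + c[1] * x1 for x0, x1 in solutions(a, b))
for r in (1, 10, 100, 1000):
    print(r, scaled_log_sum(a, c, b, r))
print("max c'x:", f_d)
```

The printed values decrease toward the optimal value as $r$ grows, since the largest exponential dominates the sum.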
CONTINUOUS OPTIM.   ←→   DISCRETE OPTIM.
Legendre-Fenchel duality   ←→   ??
        ↕                         ↕
INTEGRATION   ←→   SUMMATION
Laplace-transform duality   ←→   Z-transform duality
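Legendre-Fenchel duality: $f : \mathbb{R}^n \to \mathbb{R}$ convex; $f^* : \mathbb{R}^n \to \mathbb{R}$,
\[ \lambda \mapsto f^*(\lambda) = \mathcal{F}(f)(\lambda) := \sup_y \{ \lambda'y - f(y) \}. \]
(One-sided) Laplace transform: $f : \mathbb{R}^n_+ \to \mathbb{R}$; $F : \mathbb{C}^n \to \mathbb{C}$,
\[ \lambda \mapsto F(\lambda) = \mathcal{L}(f)(\lambda) := \int_{\mathbb{R}^n_+} e^{-\lambda'y} f(y)\,dy. \]
(One-sided) Z-transform: $f : \mathbb{Z}^n_+ \to \mathbb{R}$; $F : \mathbb{C}^n \to \mathbb{C}$,
\[ z \mapsto F(z) = \mathcal{Z}(f)(z) := \sum_{m\in\mathbb{Z}^n_+} z^{-m} f(m). \]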
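With $\Omega(y) := \{ x \in \mathbb{R}^n \mid Ax \le y;\ x \ge 0 \}$:

Fenchel duality:
\[ f(y,c) := \max_{x\in\Omega(y)} c'x, \qquad f^*(\lambda,c) := \min_y \{ \lambda'y - f(y,c) \}, \]
with $A'\lambda - c \ge 0$; $\lambda \ge 0$.

Laplace duality:
\[ \hat f(y,c) := \int_{\Omega(y)} e^{c'x}\,dx, \]
\[ F(\lambda,c) := \int e^{-\lambda'y}\,\hat f(y,c)\,dy
   = \int_{x\ge 0} e^{c'x} \Big[ \int_{Ax\le y} e^{-\lambda'y}\,dy \Big] dx
   = \frac{1}{\prod_{j=1}^m \lambda_j \ \prod_{k=1}^n (A'\lambda - c)_k}, \]
with $\Re(A'\lambda - c) > 0$; $\Re(\lambda) > 0$.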
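In standard form, with $\Omega(y) := \{ x \in \mathbb{R}^n \mid Ax = y;\ x \ge 0 \}$ and $\hat f(y,c) := \int_{\Omega(y)} e^{c'x}\,dx$,
\[ F(\lambda,c) = \frac{1}{\prod_{k=1}^n (A'\lambda - c)_k} \]
and one retrieves $\hat f(y,c)$ (up to the normalizing factor $1/(2\pi i)^m$) by
\[ \hat f(y,c) = \int_\Gamma e^{y'\lambda} F(\lambda,c)\,d\lambda = \int_\Gamma \frac{e^{y'\lambda}}{\prod_{j=1}^n (A'\lambda - c)_j}\,d\lambda. \]
Integration by Cauchy's residue techniques →
• The (multidimensional) poles $\lambda$ of $F$ solve $\pi_\sigma A_\sigma = c_\sigma$ for all bases $A_\sigma$ of the continuous LP, ... which yields ...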
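Brion and Vergne's continuous formula
Terminology of LP in standard form:
Let $A_\sigma := [A_{\sigma_1} | \ldots | A_{\sigma_m}]$ be a basis of $\max \{ c'x \mid Ax = b;\ x \ge 0 \}$, with $x(\sigma)$ the corresponding vertex, $\pi_\sigma := c_\sigma' A_\sigma^{-1}$ the associated dual variable, and $c_k - \pi_\sigma A_k$, $k \notin \sigma$, the reduced cost vector.
Then, with $\hat f(b,c) := \int_{\Omega(b)} e^{c'x}\,dx$:
\[ \hat f(b,c) = \sum_{x(\sigma):\ \text{vertex of } \Omega(b)} \frac{e^{c'x(\sigma)}}{\det(A_\sigma) \prod_{k\notin\sigma} (-c_k + \pi_\sigma A_k)} \]
from which it easily follows that
\[ \log \Big[ \lim_{r\to\infty} \hat f(b, rc)^{1/r} \Big] = \max_{x(\sigma):\ \text{vertex of } \Omega(b)} c'x(\sigma). \]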
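As a sanity check, the continuous formula can be worked out by hand in the smallest nontrivial case. The instance below is an illustrative assumption, not taken from the slides: $m = 1$, $n = 2$, $A = [1\ \ 1]$, $b > 0$, $c_1 \neq c_2$, with the segment $\Omega(b)$ parametrized by $x_1 \in [0, b]$ and measured by $dx_1$.

```latex
\begin{align*}
\int_{\Omega(b)} e^{c'x}\,dx
  &= \int_0^b e^{c_1 x_1 + c_2 (b - x_1)}\,dx_1
   = \frac{e^{c_1 b} - e^{c_2 b}}{c_1 - c_2},\\[4pt]
\sigma = \{1\}:&\quad x(\sigma) = (b, 0),\quad \pi_\sigma = c_1,\quad
   \det(A_\sigma) = 1,\quad -c_2 + \pi_\sigma A_2 = c_1 - c_2,\\
\sigma = \{2\}:&\quad x(\sigma) = (0, b),\quad \pi_\sigma = c_2,\quad
   \det(A_\sigma) = 1,\quad -c_1 + \pi_\sigma A_1 = c_2 - c_1,\\[4pt]
\sum_{\sigma} \frac{e^{c'x(\sigma)}}{\det(A_\sigma)\prod_{k\notin\sigma}(-c_k + \pi_\sigma A_k)}
  &= \frac{e^{c_1 b}}{c_1 - c_2} + \frac{e^{c_2 b}}{c_2 - c_1}
   = \frac{e^{c_1 b} - e^{c_2 b}}{c_1 - c_2}.
\end{align*}
```

Both sides agree, and replacing $c$ by $rc$ and taking $(1/r)\ln$ as $r \to \infty$ leaves $b \max(c_1, c_2)$, the optimal value of the LP.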
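Discrete Z-duality
Let $\Omega(y) := \{ x \in \mathbb{R}^n \mid Ax = y;\ x \ge 0 \}$ and
\[ \hat f_d(y,c) := \sum_{x \in \Omega(y)\cap\mathbb{Z}^n} e^{c'x}. \]
Then the generating function or Z-transform of $\hat f_d$ reads
\[ z \mapsto F_d(z,c) := \sum_{y\in\mathbb{Z}^m} z^{-y}\, \hat f_d(y,c), \]
and, by simple algebra,
\[ F_d(z,c) = \prod_{j=1}^n \frac{1}{1 - e^{c_j} z^{-A_j}}
   \quad \Big( = \prod_{j=1}^n \frac{1}{1 - e^{c_j} z_1^{-A_{1j}} \cdots z_m^{-A_{mj}}} \Big) \]
with $|z^{A_j}| > e^{c_j}$ for all $j = 1, \ldots, n$.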
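The product form of the generating function can be verified coefficient by coefficient. The sketch below assumes a single constraint with two positive integer coefficients (hypothetical data), expands the product as a power series in $w = 1/z$, and compares each coefficient with the direct sum of $e^{c'x}$ over the integer solutions.

```python
import math

def series_coefficients(a, c, y_max):
    """Coefficients of w^y (w = 1/z) in prod_j 1/(1 - e^{c_j} w^{a_j}), y = 0..y_max."""
    coef = [0.0] * (y_max + 1)
    coef[0] = 1.0
    for aj, cj in zip(a, c):
        weight = math.exp(cj)
        # Multiplying a series g by 1/(1 - weight * w^aj) is g[y] += weight * g[y - aj].
        for y in range(aj, y_max + 1):
            coef[y] += weight * coef[y - aj]
    return coef

def direct_sum(a, c, y):
    """Sum of exp(c'x) over x in N^2 with a[0]*x0 + a[1]*x1 = y."""
    total = 0.0
    for x0 in range(y // a[0] + 1):
        rest = y - a[0] * x0
        if rest % a[1] == 0:
            total += math.exp(c[0] * x0 + c[1] * (rest // a[1]))
    return total

a, c = (2, 3), (0.3, -0.1)             # hypothetical data
coef = series_coefficients(a, c, 20)
agree = all(math.isclose(coef[y], direct_sum(a, c, y)) for y in range(21))
print(agree)
```

The in-place recurrence is the same one used for counting coin-change combinations, here weighted by $e^{c_j}$ per unit of $x_j$.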
To compute $\hat f_d(y,c) := \sum_{x\in\Omega(y)\cap\mathbb{Z}^n} e^{c'x}$ it then suffices to invert the generating function, that is (up to the normalizing factor $1/(2\pi i)^m$):
\[ \hat f_d(y,c) = \int_{|z| = \gamma} z^{y-1} F_d(z,c)\,dz \]
which can be done by repeated application of Cauchy's residue theorem, which in turn requires computing the poles of the generating function $F_d$.
• The poles of $F_d$ are also associated with the bases of the LP $\max \{ c'x \mid Ax = y,\ x \ge 0 \}$!!
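Let $\sigma := [A_{\sigma_1} | \cdots | A_{\sigma_m}]$ be a feasible basis of the LP $\max_x \{ c'x \mid Ax = y,\ x \ge 0 \}$, with $\mu(\sigma) := \det(A_\sigma)$, and associated "dual" variables $\pi_\sigma$, which are solutions of $\pi A_\sigma = c_\sigma$.
Indeed, each basis $\sigma$ provides $\mu(\sigma)$ complex poles $z = e^{\lambda}$ in $\mathbb{C}^m$, which are solutions of the polynomial equations $e^{A_\sigma'\lambda} = e^{c_\sigma}$. (Compare with $\pi_\sigma A_\sigma = c_\sigma$.)
The $\mu(\sigma)$ solutions $z = e^{\lambda}$ are of the form
\[ \lambda = \pi_\sigma + \frac{2 i \pi v}{\mu(\sigma)}; \qquad v \in V_\sigma \subset \mathbb{Z}^m, \]
\[ V_\sigma = \{ v \in \mathbb{Z}^m \mid v'A_\sigma = 0 \bmod \mu(\sigma) \}. \]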
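For a single constraint the poles can be listed explicitly. The sketch below assumes $m = 1$ and a hypothetical basis column `a_sigma = 3` with cost `c_sigma = 1.2`, so that $\mu(\sigma) = 3$, $\pi_\sigma = c_\sigma / a_\sigma$, and $v = 0, 1, 2$ gives one representative of each distinct pole.

```python
import cmath

def poles_single_constraint(a_sigma, c_sigma):
    """Poles z = exp(lambda), lambda = pi + 2*i*pi*v/mu, for the 1x1 basis [a_sigma]."""
    mu = a_sigma                     # mu(sigma) = det(A_sigma)
    pi_sigma = c_sigma / a_sigma     # solves pi * A_sigma = c_sigma
    return [cmath.exp(pi_sigma + 2j * cmath.pi * v / mu) for v in range(mu)]

a_sigma, c_sigma = 3, 1.2            # hypothetical basis column and cost
poles = poles_single_constraint(a_sigma, c_sigma)
for z in poles:
    # Each pole satisfies z^{a_sigma} = e^{c_sigma}, i.e. 1 - e^{c} z^{-a} = 0.
    print(z, abs(z ** a_sigma - cmath.exp(c_sigma)))
```

All poles share the modulus $e^{\pi_\sigma}$ and differ only by a root of unity, which is where the sums over $V_\sigma$ in the discrete formula come from.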
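Brion and Vergne's discrete formula
Re-interpreted with these data, Brion and Vergne's original discrete formula reads (Lasserre), with $\hat f_d(y,c) := \sum_{x\in\Omega(y)\cap\mathbb{Z}^n} e^{c'x}$:
\[ \hat f_d(y,c) = \sum_{x(\sigma):\ \text{vertex of } \Omega(y)} \frac{e^{c'x(\sigma)}}{\mu(\sigma)}
   \sum_{v\in V_\sigma} \frac{e^{2 i \pi v'y/\mu(\sigma)}}{\prod_{k\notin\sigma} \big( 1 - e^{-2 i \pi v'A_k/\mu(\sigma)}\, e^{(c_k - \pi_\sigma A_k)} \big)} \]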
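The discrete formula can be tested numerically in the single-constraint case. The sketch below is specialized to $m = 1$ with positive integer coefficients $a_j$ (an assumption made for illustration): every column is a basis with $\mu(\sigma) = a_j$, $\pi_\sigma = c_j / a_j$, $c'x(\sigma) = \pi_\sigma y$, and $v = 0, \ldots, a_j - 1$ represents $V_\sigma$. It also assumes the ratios $c_j / a_j$ are pairwise distinct so that no denominator vanishes. The data are hypothetical.

```python
import cmath
import math

def brion_vergne_1d(a, c, y):
    """Right-hand side of the discrete formula for one constraint a'x = y (m = 1)."""
    n, total = len(a), 0j
    for s in range(n):                          # each column a[s] is a basis
        mu, pi_s = a[s], c[s] / a[s]            # mu(sigma) and dual variable pi_sigma
        inner = 0j
        for v in range(mu):                     # representatives of V_sigma modulo mu
            term = cmath.exp(2j * cmath.pi * v * y / mu)
            for k in range(n):
                if k != s:
                    term /= 1 - cmath.exp(-2j * cmath.pi * v * a[k] / mu
                                          + (c[k] - pi_s * a[k]))
            inner += term
        total += cmath.exp(pi_s * y) * inner / mu
    return total

def direct_sum(a, c, y):
    """Sum of exp(c'x) over x in N^n with a'x = y, by dynamic programming."""
    g = [1.0] + [0.0] * y
    for aj, cj in zip(a, c):
        for t in range(aj, y + 1):
            g[t] += math.exp(cj) * g[t - aj]
    return g[y]

a, c, y = (2, 3), (1.0, 1.2), 12               # hypothetical data
formula = brion_vergne_1d(a, c, y)
print(formula.real, formula.imag, direct_sum(a, c, y))
```

The imaginary parts cancel up to rounding, and the real part matches the direct sum, including right-hand sides with no integer solution, where the contributions of the two bases cancel exactly.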
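Back to optimization: $f_d(b,c) = \max \{ c'x \mid Ax = b;\ x \in \mathbb{N}^n \}$.
Theorem. Assume that
\[ \max_\sigma\ e^{c'x(\sigma)} \times \lim_{r\to\infty}
   \Bigg[ \sum_{v\in V_\sigma} \frac{e^{2 i \pi v'b/\mu(\sigma)}}{\prod_{k\notin\sigma} \big( 1 - e^{-2 i \pi v'A_k/\mu(\sigma)}\, e^{r(c_k - \pi_\sigma A_k)} \big)} \Bigg]^{1/r} \]
is attained at a unique basis $\sigma^*$. Then:
\[ f_d(b,c) = c'x(\sigma^*) + \sum_{k\notin\sigma^*} (c_k - \pi_{\sigma^*} A_k)\, x_k^* = \sum_j c_j x_j^*. \]
$\sigma^*$ is an optimal basis of the linear program, and $x(\sigma^*)$ (resp. $x^*$) is an optimal solution of the linear (resp. integer) program.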
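In this case:
\[ f_d(b,c) = c'x(\sigma^*) + \max \Big\{ \sum_{k\notin\sigma^*} (c_k - \pi_{\sigma^*} A_k)\, x_k \ \Big|\
   A_{\sigma^*} u + \sum_{k\notin\sigma^*} A_k x_k = b;\ u \in \mathbb{Z}^m;\ x_k \in \mathbb{N}\ \ \forall k \notin \sigma^* \Big\}, \]
that is, $f_d(b,c) = c'x(\sigma^*) + \rho$, the optimal value of the GOMORY relaxation!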
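The Gomory relaxation is easy to compute when there is a single constraint, because the condition "$u \in \mathbb{Z}$" reduces to a congruence modulo the basic coefficient. The sketch below makes that assumption ($m = 1$, positive integer data, all hypothetical) and compares the relaxation value $c'x(\sigma^*) + \rho$ with the true integer optimum.

```python
from fractions import Fraction

def gomory_relaxation_1d(a, c, b):
    """c'x(sigma*) + rho for max{c'x : a'x = b, x in N^n}; None if the group problem is infeasible."""
    s = max(range(len(a)), key=lambda j: Fraction(c[j], a[j]))   # optimal LP basis
    pi = Fraction(c[s], a[s])                                    # dual variable pi_sigma*
    mod = a[s]
    # best[g] = max reduced-cost sum over x_k (k != s) with sum a_k x_k = g (mod a_s).
    best = {0: Fraction(0)}
    changed = True
    while changed:
        changed = False
        for g, val in list(best.items()):
            for k in range(len(a)):
                if k == s:
                    continue
                h = (g + a[k]) % mod
                cand = val + (c[k] - pi * a[k])      # reduced costs are <= 0
                if h not in best or cand > best[h]:
                    best[h] = cand
                    changed = True
    rho = best.get(b % mod)
    return None if rho is None else pi * b + rho

def ip_value(a, c, b):
    """Exact value of the integer program by dynamic programming (None if infeasible)."""
    val = [None] * (b + 1)
    val[0] = 0
    for t in range(1, b + 1):
        for aj, cj in zip(a, c):
            if aj <= t and val[t - aj] is not None:
                if val[t] is None or val[t - aj] + cj > val[t]:
                    val[t] = val[t - aj] + cj
    return val[b]

a, c = (3, 5), (4, 6)                    # hypothetical knapsack data
for b in (11, 4):
    print(b, gomory_relaxation_1d(a, c, b), ip_value(a, c, b))
```

For `b = 11` the relaxation is exact (both values equal 14). For `b = 4` the relaxation still returns a value because it allows a negative basic variable $u$, while the integer program itself is infeasible; this is the situation in which the relaxation is not exact.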
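Evaluating
\[ \lim_{r\to\infty} \Bigg[ \sum_{v\in V_\sigma} \frac{e^{2 i \pi v'b/\mu(\sigma)}}{\prod_{k\notin\sigma} \big( 1 - e^{-2 i \pi v'A_k/\mu(\sigma)}\, e^{r(c_k - \pi_\sigma A_k)} \big)} \Bigg]^{1/r} \]
is the same as detecting the leading term $u^{\rho}$ of the rational fraction
\[ u \mapsto \sum_{v\in V_\sigma} \frac{e^{2 i \pi v'b/\mu(\sigma)}}{\prod_{k\notin\sigma} \big( 1 - e^{-2 i \pi v'A_k/\mu(\sigma)}\, u^{(c_k - \pi_\sigma A_k)} \big)} \]
as $u \to \infty$.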
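With $u := e^r$, the sum $\hat f_d(y, rc) = \sum_{x\in\Omega(y)\cap\mathbb{Z}^n} e^{r c'x}$ is the rational fraction
\[ u \mapsto \sum_{x(\sigma):\ \text{vertex of } \Omega(y)} \frac{u^{c'x(\sigma)}}{\mu(\sigma)}
   \sum_{v\in V_\sigma} \frac{e^{2 i \pi v'y/\mu(\sigma)}}{\prod_{k\notin\sigma} \big( 1 - e^{-2 i \pi v'A_k/\mu(\sigma)}\, u^{(c_k - \pi_\sigma A_k)} \big)}
   = \sum_{\text{bases } \sigma \text{ of } \Omega(y)} g_\sigma(u). \]
When the Gomory relaxation is not exact, the leading term as $u \to \infty$ is not unique and cancellations occur.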
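A Discrete Farkas Lemma
Let $A \in \mathbb{N}^{m\times n}$, $b \in \mathbb{N}^m$ and consider the problem of deciding whether or not $Ax = b$ has a solution $x \in \mathbb{N}^n$, or, equivalently, deciding whether or not $\hat f_d(b,c) \ge 1$ (with $c = 0$, the sum $\hat f_d(b,0)$ counts the solutions).
Theorem: (i) $Ax = b$ has a solution $x \in \mathbb{N}^n$ if and only if the polynomial $z \mapsto z^b - 1$ in $\mathbb{R}[z_1, \ldots, z_m]$ can be written
\[ z^b - 1 = \sum_{j=1}^n Q_j(z)\,(z^{A_j} - 1) \quad \Big( = \sum_{j=1}^n Q_j(z)\,(z_1^{A_{1j}} \cdots z_m^{A_{mj}} - 1) \Big) \]
for some polynomials $z \mapsto Q_j(z)$ with nonnegative coefficients.
(ii) The degree of the $Q_j$'s is bounded by
\[ b^* := \sum_{j=1}^m b_j - \min_k \sum_{j=1}^m A_{jk}. \]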
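The "only if" direction of part (i) is constructive: given a solution $x$, the polynomials $Q_j$ can be built by telescoping, adding one column at a time. The sketch below uses a hypothetical $2 \times 2$ matrix and solution, represents each polynomial as a dictionary from exponent tuples to coefficients, and checks the identity by expanding the right-hand side.

```python
from collections import defaultdict

def farkas_certificate(A_cols, x):
    """Build Q_j (dict: exponent tuple -> coefficient) from a solution x by telescoping."""
    m = len(A_cols[0])
    Q = [defaultdict(int) for _ in A_cols]
    alpha = (0,) * m                               # exponent reached so far
    for j, col in enumerate(A_cols):
        for _ in range(x[j]):
            Q[j][alpha] += 1                       # monomial z^alpha goes into Q_j
            alpha = tuple(p + q for p, q in zip(alpha, col))
    return Q, alpha                                # alpha equals A x

def expand(A_cols, Q):
    """Nonzero coefficients of sum_j Q_j(z) * (z^{A_j} - 1)."""
    out = defaultdict(int)
    for col, Qj in zip(A_cols, Q):
        for alpha, q in Qj.items():
            out[tuple(p + s for p, s in zip(alpha, col))] += q
            out[alpha] -= q
    return {e: v for e, v in out.items() if v != 0}

A_cols = [(2, 1), (1, 3)]       # columns A_1, A_2 of a hypothetical 2x2 matrix
x = (2, 1)
Q, b = farkas_certificate(A_cols, x)
print(b, expand(A_cols, Q))
print([sum(Qj.values()) for Qj in Q])   # Q_j(1, ..., 1)
```

The expansion leaves only the monomials $z^b$ and $-1$, and evaluating each $Q_j$ at $z = (1, \ldots, 1)$ returns $x_j$.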
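A single LP to solve, with $n \times \binom{b^* + m}{b^*}$ variables, $\binom{b^* + m}{m}$ constraints and a (sparse) matrix of coefficients in $\{0, \pm 1\}$.
One also retrieves the classical Farkas Lemma in $\mathbb{R}^n$, that is,
\[ \{ x \in \mathbb{R}^n \mid Ax = b;\ x \ge 0 \} \neq \emptyset \iff \big[ A'u \ge 0 \Rightarrow b'u \ge 0 \big]. \]
Indeed, if $Ax = b$ has a solution $x \in \mathbb{N}^n$, then with $u = \ln z$,
\[ e^{b'u} - 1 = \sum_{j=1}^n Q_j(e^{u_1}, \ldots, e^{u_m})\,(e^{(A'u)_j} - 1). \]
Therefore,
\[ A'u \ge 0 \ \Rightarrow\ e^{(A'u)_j} - 1 \ge 0 \ \Rightarrow\ e^{b'u} - 1 \ge 0 \ \Rightarrow\ b'u \ge 0, \]
and so $Ax = b$ has a nonnegative solution $x \in \mathbb{R}^n_+$.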
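The general case $A \in \mathbb{Z}^{m\times n}$, $b \in \mathbb{Z}^m$.
Let $\Omega := \{ x \in \mathbb{R}^n \mid Ax = b;\ x \ge 0 \}$ be a polytope.
Let $\alpha \in \mathbb{N}^n$ be such that for every column $A_j$ of $A$, $A_{kj} + \alpha_j \ge 0$ for all $k = 1, \ldots, m$; let $\mathbb{N} \ni \beta \ge \rho(\alpha) := \max \{ \alpha'x \mid x \in \Omega \}$.
Theorem: (i) $Ax = b$ has a solution $x \in \mathbb{N}^n$ if and only if the polynomial $(z,y) \mapsto z^b (zy)^{\beta} - 1$ in $\mathbb{R}[z_1, \ldots, z_m, y]$ can be written
\[ z^b (zy)^{\beta} - 1 = Q_0(z,y)\,(zy - 1) + \sum_{j=1}^n Q_j(z,y)\,\big( z^{A_j} (zy)^{\alpha_j} - 1 \big), \]
for some polynomials $Q_j(z,y)$, all with nonnegative coefficients.
(ii) The degree of the $Q_j$'s is bounded by $b^* := (m+1)\beta + \sum_{j=1}^m b_j$.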
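Back to standard Farkas lemma
\[ \{ x \in \mathbb{R}^n \mid Ax = b;\ x \ge 0 \} \neq \emptyset \iff \big[ A'\lambda \ge 0 \Rightarrow b'\lambda \ge 0 \big]. \]
But, equivalently, $\{ x \in \mathbb{R}^n \mid Ax = b,\ x \ge 0 \} \neq \emptyset$ if and only if the polynomial $\lambda \mapsto b'\lambda$ can be written
\[ b'\lambda = \sum_{j=1}^n Q_j(\lambda)\,(A'\lambda)_j, \]
for some polynomials $\{Q_j\} \subset \mathbb{R}[\lambda_1, \ldots, \lambda_m]$, all with nonnegative coefficients.
In this case, each $Q_j$ is necessarily a constant, that is, $Q_j \equiv Q_j(0) = x_j \ge 0$, and $Ax = b$!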
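Comparing continuous and discrete Farkas lemma

Continuous   |   Discrete
$P = \{ x \in \mathbb{R}^n \mid Ax = b,\ x \ge 0 \}$   |   $P \cap \mathbb{Z}^n$
$x \in P$   |   $x \in$ integer hull$(P)$
$\iff x = Q(0, \ldots, 0)$ with   |   $\iff x = Q(1, \ldots, 1)$ with
$Q \in \mathbb{R}[\lambda_1, \ldots, \lambda_m]$   |   $Q \in \mathbb{R}[e^{\lambda_1}, \ldots, e^{\lambda_m}]$
$b'\lambda = \langle Q, A'\lambda \rangle$   |   $e^{b'\lambda} - 1 = \langle Q, e^{A'\lambda} - 1_n \rangle$
$Q \succeq 0$   |   $Q \succeq 0$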
An equivalent linear program
Let $0 \le q = \{q_{j\alpha}\} \in \mathbb{R}^{ns}$ be the coefficients of the $Q_j$'s in
\[ z^b - 1 = \sum_{j=1}^n Q_j(z)\,(z^{A_j} - 1). \]
They are solutions of a linear system
\[ M q = r, \quad q \ge 0 \]
for some matrix $M$ and vector $r$, both with $0, \pm 1$ coefficients.
** $M$ and $r$ are easily obtained from $A$, $b$ with no computation.
Write $q = (q_1, q_2, \ldots, q_n)$ with each $q_j = \{q_{j\alpha}\} \in \mathbb{R}^s$, and let $c_{j\alpha} := c_j$ for all $\alpha$.
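Theorem: Let $A \in \mathbb{N}^{m\times n}$, $b \in \mathbb{N}^m$, $c \in \mathbb{R}^n$.
(i) The integer program $P:\ \max_x \{ c'x \mid Ax = b,\ x \in \mathbb{N}^n \}$ has the same value as the linear program
\[ Q:\ \max_q \Big\{ \sum_{j=1}^n c_j' q_j \ \Big|\ M q = r;\ q \ge 0 \Big\}. \]
(ii) Let $q^*$ be an optimal vertex, and let
\[ x_j^* := \sum_{\alpha} q^*_{j\alpha} = Q_j(1), \quad j = 1, \ldots, n. \]
Then $x^* \in \mathbb{N}^n$ and $x^*$ is an optimal solution of $P$.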
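For a single constraint $a'x = b$ with positive integer coefficients (an assumption made here for illustration), matching coefficients of $z^{\beta}$ in $z^b - 1 = \sum_j Q_j(z)(z^{a_j} - 1)$ shows that the column of $M$ indexed by $(j, \alpha)$ has $+1$ in row $\alpha + a_j$ and $-1$ in row $\alpha$. The system $Mq = r$, $q \ge 0$ then describes a unit flow from node $0$ to node $b$, so in this special case the LP can be solved as a longest-path problem. The sketch below does this for hypothetical data and recovers $x^*$ from the optimal $q^*$.

```python
def solve_Q_as_longest_path(a, c, b):
    """Longest 0 -> b path in the graph with arcs alpha -> alpha + a_j of weight c_j."""
    NEG = float("-inf")
    value = [NEG] * (b + 1)
    pred = [None] * (b + 1)           # pred[beta] = column j of the arc entering beta
    value[0] = 0.0
    for beta in range(1, b + 1):
        for j, (aj, cj) in enumerate(zip(a, c)):
            if aj <= beta and value[beta - aj] + cj > value[beta]:
                value[beta], pred[beta] = value[beta - aj] + cj, j
    if value[b] == NEG:
        return None
    q = {}                             # q[(j, alpha)] = 1 on the arcs of the path
    beta = b
    while beta > 0:
        j = pred[beta]
        q[(j, beta - a[j])] = 1
        beta -= a[j]
    return value[b], q

def check_Mq_equals_r(a, b, q):
    """Row beta of M q: +q[j, alpha] if alpha + a_j == beta, -q[j, alpha] if alpha == beta."""
    row = [0] * (b + 1)
    for (j, alpha), val in q.items():
        row[alpha + a[j]] += val
        row[alpha] -= val
    r = [0] * (b + 1)
    r[b], r[0] = 1, -1                 # right-hand side from z^b - 1
    return row == r

a, c, b = (3, 5), (4.0, 6.0), 11       # hypothetical knapsack data
value, q = solve_Q_as_longest_path(a, c, b)
x_star = [sum(v for (j, _), v in q.items() if j == k) for k in range(len(a))]
print(value, x_star, check_Mq_equals_r(a, b, q))
```

The recovered `x_star` sums the entries $q^*_{j\alpha}$ over $\alpha$, which is exactly $Q_j(1)$ in part (ii) of the theorem.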
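The link with superadditive functions
The LP dual $Q^*$ of the linear program $Q$ reads
\[ Q^*:\ \min_{\pi} \{ \pi'r \mid M'\pi \ge c \}. \]
More precisely, with $D := \prod_{j=1}^m \{0, 1, \ldots, b_j\} \subset \mathbb{N}^m$,
\[ Q^*:\ \min_{\pi}\ \pi(b) - \pi(0) \quad \text{s.t.}\quad \pi(A_j + \alpha) - \pi(\alpha) \ge c_j,\quad \alpha + A_j \in D,\ j = 1, \ldots, n. \]
Let $\Pi := \{ \pi : D \to \mathbb{R} \}$, and for every $\pi \in \Pi$, let $f_\pi : D \to \mathbb{R}$ be the function
\[ f_\pi(x) := \inf_{x + \alpha \in D} \{ \pi(x + \alpha) - \pi(\alpha) \}, \quad x \in D. \]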
For every $\pi \in \Pi$, the function $f_\pi$ is superadditive and $f_\pi(0) = 0$.
With $Q^*$ one may then associate the optimization problem
\[ S^*:\ \min_{\pi\in\Pi}\ f_\pi(b) \quad \text{s.t.}\quad f_\pi(A_j) \ge c_j,\ j = 1, \ldots, n. \]
Thus, $S^*$ is a simplified and explicit form of the abstract superadditive dual, as we only consider finite superadditive functions $f : D \to \mathbb{R}$, instead of $f : \mathbb{N}^m \to \mathbb{R} \cup \{-\infty\}$.
It is the analogue for IP with $Ax = b$ of Wolsey's dual for IP with $Ax \le b$. Note that $Q^*$ is simpler than $S^*$.
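That $f_\pi$ is superadditive for any $\pi$, not only for superadditive ones, can be checked directly. The sketch below takes $m = 1$, so $D = \{0, \ldots, b\}$, draws an arbitrary integer-valued $\pi$ (hypothetical data, fixed seed), and tests $f_\pi(0) = 0$ and $f_\pi(x + y) \ge f_\pi(x) + f_\pi(y)$ whenever $x + y \in D$.

```python
import random

def f_pi(pi, x):
    """f_pi(x) = min over alpha with x + alpha in D of pi(x + alpha) - pi(alpha), D = {0..b}."""
    b = len(pi) - 1
    return min(pi[x + alpha] - pi[alpha] for alpha in range(b - x + 1))

random.seed(0)
b = 12
pi = [random.randint(-20, 20) for _ in range(b + 1)]   # arbitrary, not superadditive
f = [f_pi(pi, x) for x in range(b + 1)]

superadditive = all(f[x + y] >= f[x] + f[y]
                    for x in range(b + 1) for y in range(b + 1 - x))
print(f[0], superadditive)
```

The inequality follows from splitting $\pi(x + y + \alpha) - \pi(\alpha)$ into $[\pi(x + (y + \alpha)) - \pi(y + \alpha)] + [\pi(y + \alpha) - \pi(\alpha)]$ and bounding each bracket from below by $f_\pi(x)$ and $f_\pi(y)$.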
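Moreover, $Q^*$ is simpler than $S^*$ because in the LP $Q^*$ the function $\pi : D \to \mathbb{R}$ is NOT required to be superadditive!! In $S^*$ one has to write the $O(|D|^2)$ superadditivity constraints
\[ \pi(x + \alpha) - \pi(\alpha) \ge \pi(x), \quad x, \alpha \in D,\ \text{with } x + \alpha \in D, \]
and the $n$ additional constraints $\pi(A_j) \ge c_j$, $j = 1, \ldots, n$.
On the other hand, in $Q^*$ one only has the $n\,O(|D|)$ constraints
\[ \pi(A_j + \alpha) - \pi(\alpha) \ge c_j, \quad \alpha \in D,\ \text{with } A_j + \alpha \in D. \]
For instance, in the unbounded knapsack problem
\[ \max_x \Big\{ c'x \ \Big|\ \sum_{j=1}^n a_j x_j = b;\ x \in \mathbb{N}^n \Big\} \]
one has $n + b^2/2$ constraints in $S^*$, and $nb - \sum_{j=1}^n a_j$ in $Q^*$.
With $P = \{ x \in \mathbb{R}^n_+ \mid Ax = b \}$, the integer hull $\mathrm{co}(P \cap \mathbb{Z}^n)$ reads
\[ \mathrm{co}(P \cap \mathbb{Z}^n) = \Big\{ x \in \mathbb{R}^n \ \Big|\ -\sum_{j=1}^n \lambda_j^* x_j \le b'\pi^* \Big\}, \]
for finitely many $(\pi^*, \lambda^*)$, generators of the convex cone $\Omega \subset \mathbb{R}^{|D|} \times \mathbb{R}^n$,
\[ \Omega := \{ (\pi, \lambda) \in \mathbb{R}^{|D|} \times \mathbb{R}^n \mid \pi(\alpha + A_j) - \pi(\alpha) + \lambda_j \ge 0,\ \ \alpha + A_j \in D,\ j = 1, \ldots, n \}. \]
Equivalently, again with
\[ x \mapsto f_{\pi^*}(x) := \inf_{\alpha + x \in D} \{ \pi^*(x + \alpha) - \pi^*(\alpha) \}, \quad x \in D, \]
\[ \mathrm{co}(P \cap \mathbb{Z}^n) = \Big\{ x \in \mathbb{R}^n \ \Big|\ \sum_{j=1}^n f_{\pi^*}(A_j)\, x_j \le f_{\pi^*}(b) \Big\}. \]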
CONCLUSION
Generating functions make it possible to exhibit a natural duality for integer programming, an IP analogue of LP duality.
This duality makes it possible to simplify the abstract superadditive dual, and so might help provide efficient Gomory cuts in MIP solvers like CPLEX or XPRESS-MP.
Lasserre J.B. (2005). Integer Programming, duality and superadditive functions. Cont. Math. 474, pp. 138–150.
Lasserre J.B. (2004). Integer Programming Duality. In: Trends in Optimization. Proc. Symp. Appl. Math. 61, pp. 67–83.
Lasserre J.B. (2004). The integer hull of a convex rational polytope. Discr. Comp. Geom. 32, pp. 129–139.
Lasserre J.B. (2004). Generating functions and duality for integer programs. Discrete Optim. 1, pp. 167–187.
Lasserre J.B. (2004). A discrete Farkas lemma. Discrete Optim. 1, pp. 67–75.
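Another dual problem
Let $A \in \mathbb{Z}^{m\times n}$, $b \in \mathbb{Z}^m$, $c \in \mathbb{R}^n$. Let $y \mapsto f(y,c) = \max \{ c'x \mid Ax = y;\ x \ge 0 \}$. The Fenchel transform $(-f)^*$ of the convex function $-f(\cdot, c)$ is the convex function
\[ \lambda \mapsto (-f)^*(\lambda, c) = \sup_{y\in\mathbb{R}^m} \{ \lambda'y + f(y,c) \}. \]
The dual problem of the linear program is obtained from Fenchel duality as
\[ f(b,c) = \inf_{\lambda\in\mathbb{R}^m} \{ b'\lambda + (-f)^*(-\lambda, c) \}
   = \inf_{\lambda\in\mathbb{R}^m} \Big\{ b'\lambda + \sup_{x\in\mathbb{R}^n_+} (c - A'\lambda)'x \Big\}
   = \min_{\lambda} \{ b'\lambda \mid A'\lambda \ge c \}. \]
Equivalently,
\[ e^{f(b,c)} = \inf_{\lambda\in\mathbb{R}^m}\ \sup_{x\in\mathbb{R}^n_+}\ e^{(b - Ax)'\lambda}\, e^{c'x}. \]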
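Define
\[ \rho^* := \inf_{z\in\mathbb{C}^m}\ \sup_{x\in\mathbb{N}^n}\ \Re\big( z^{b - Ax}\, e^{c'x} \big) = \inf_{z\in\mathbb{C}^m} f_d^*(z,c). \]
Hence,
\[ f_d^*(z,c) = \sup_{x\in\mathbb{N}^n}\ \Re\Big( z^b \prod_{j=1}^n \big( z^{-A_j} e^{c_j} \big)^{x_j} \Big) < \infty \quad \text{if } |z^{A_j}| \ge e^{c_j}\ \ \forall j \]
(that is, $A' \ln|z| \ge c$). Next (writing $z \in \mathbb{C}$ as $e^{\lambda} e^{i\theta}$),
\[ \rho^* \le \inf_{z\in\mathbb{R}^m}\ \sup_{x\in\mathbb{R}^n_+}\ \Re\big( z^{b - Ax}\, e^{c'x} \big)
   = \inf_{\lambda\in\mathbb{R}^m}\ \sup_{x\in\mathbb{R}^n_+}\ e^{(b - Ax)'\lambda}\, e^{c'x} = e^{f(b,c)}. \]
Finally, with $z \in \mathbb{C}^m$ arbitrary but fixed, and $x^*$ an optimal solution of the integer program (so that $b - Ax^* = 0$),
\[ \sup_{x\in\mathbb{N}^n}\ \Re\big( z^{b - Ax}\, e^{c'x} \big) \ge e^{c'x^*} = e^{f_d(b,c)}. \]
Hence $f_d(b,c) \le \ln \rho^* \le f(b,c)$.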
Theorem: Let $\sigma^*$ be an optimal basis of the linear program. Under uniqueness of the "$\max_\sigma$" in Brion and Vergne's formula, and an additional technical condition,
\[ e^{f_d(b,c)} = \rho^* = \max_{x\in\mathbb{N}^n}\ \Re\big( z^{b - Ax}\, e^{c'x} \big) = f_d^*(z,c) \]
where $z^{A_j} = \gamma\, e^{c_j}$ for all $j \in \sigma^*$, for some real $\gamma > 1$.
$z$ is an optimal solution of the dual problem
\[ \inf_{z\in\mathbb{C}^m} f_d^*(z,c). \]