Extended Forward-Backward Algorithm with n “proximable” functions

Hugo Raguet (CEREMADE, [email protected]), Jalal Fadili (ENSICAEN), Gabriel Peyré (CEREMADE)

May 25, 2011

Transcript of “Extended Forward-Backward Algorithm with n ‘proximable’ functions”

  • Extended Forward-Backward Algorithm with n “proximable” functions

    Hugo Raguet (CEREMADE, [email protected]), Jalal Fadili (ENSICAEN), Gabriel Peyré (CEREMADE)

    May 25, 2011


  • A Common Problem...

    Find x minimizing F(x) + G(x)

    Find x minimizing ½‖Y − ΦΨx‖² + λ‖x‖₁

    [Diagram: coefficients x —Ψ→ image Ψx —Φ→ observations ΦΨx + w = Y]

    F: data-fidelity, ½‖Y − ΦΨx‖²

    G: penalization, the L1-norm ∑ᵢ |xᵢ|
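The F + G split above can be made concrete with a small numerical sketch. This is an illustration only, on a hypothetical instance: Φ is a random measurement matrix and Ψ is taken to be the identity for simplicity; the names and sizes are not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small instance: Phi is a random measurement operator and
# Psi the synthesis dictionary (identity here, for simplicity).
m, p = 20, 50
Phi = rng.standard_normal((m, p)) / np.sqrt(m)
Psi = np.eye(p)
lam = 0.1

# Sparse coefficients and noisy observations Y = Phi Psi x + w.
x_true = np.zeros(p)
x_true[[3, 17, 42]] = [1.5, -2.0, 0.7]
Y = Phi @ (Psi @ x_true) + 0.01 * rng.standard_normal(m)

def F(x):
    """Data-fidelity term (1/2) * ||Y - Phi Psi x||^2."""
    return 0.5 * np.sum((Y - Phi @ (Psi @ x)) ** 2)

def G(x):
    """Penalization term lam * ||x||_1."""
    return lam * np.sum(np.abs(x))

objective = F(x_true) + G(x_true)
```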

  • Towards More Complex Penalization

    L1-norm: ‖x‖₁ = ∑ᵢ |xᵢ|

    Group norm: ∑_{b∈B} √(∑_{i∈b} xᵢ²)

    Overlapping groups: ∑_{b∈B₁} √(∑_{i∈b} xᵢ²) + ∑_{b∈B₂} √(∑_{i∈b} xᵢ²)

    Decomposition G = ∑ₖ Gₖ

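The group penalties above are simple sums of block Euclidean norms; a minimal sketch of evaluating them (function names, the example vector, and the particular split into families B1, B2 are illustrative, not from the slides):

```python
import numpy as np

def l1_norm(x):
    """||x||_1 = sum_i |x_i|."""
    return float(np.sum(np.abs(x)))

def group_norm(x, blocks):
    """sum_{b in B} sqrt(sum_{i in b} x_i^2), the mixed l1/l2 norm."""
    return float(sum(np.sqrt(np.sum(x[list(b)] ** 2)) for b in blocks))

x = np.array([3.0, 4.0, 0.0, 1.0])

# Non-overlapping blocks B = {{0,1}, {2,3}}: sqrt(9+16) + sqrt(0+1) = 6.
B = [(0, 1), (2, 3)]

# Overlapping groups are handled by splitting them into two non-overlapping
# families B1, B2, i.e. the decomposition G = G_1 + G_2 of the slide.
B1 = [(0, 1), (2, 3)]
B2 = [(1, 2)]
G = group_norm(x, B1) + group_norm(x, B2)
```

Each Gₖ taken alone is then "proximable" even though their sum is not, which is the point of the decomposition.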

  • Conditions

    Interesting Properties

    F convex, smooth: e.g. F(x) = ½‖Y − Lx‖² ⇒ ∇F(x) = L*(Lx − Y)

    G convex, “proximable”: prox_G(x) := argmin_y ½‖x − y‖² + G(y)

    G(x) = λ∑ᵢ |xᵢ| ⇒ prox_G(x) = S_λ(x) (componentwise soft thresholding)

    G(x) = λ∑_{b∈B} √(∑_{i∈b} xᵢ²) ⇒ prox_G(x) = S^B_λ(x) (block soft thresholding)

    G(x) = λ∑ₖ ∑_{b∈Bₖ} √(∑_{i∈b} xᵢ²) ⇒ prox_G(x): no closed-form expression

    [Plot: soft thresholding S_λ(xᵢ) against xᵢ, equal to zero on [−λ, λ] and shifted toward zero by λ elsewhere]

    To Deal with High Dimensionality

    need quick and easy access to the operators

    avoid nested algorithms
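Both closed-form prox operators mentioned above are one-liners; a minimal sketch, with illustrative function names (the block variant assumes non-overlapping blocks, as on the slide):

```python
import numpy as np

def soft_threshold(x, lam):
    """prox of lam*||.||_1: shrink each entry toward zero by lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def block_soft_threshold(x, lam, blocks):
    """prox of lam * sum_b sqrt(sum_{i in b} x_i^2) for non-overlapping
    blocks: each block is either set to zero or shrunk radially by lam."""
    y = x.copy()
    for b in blocks:
        idx = list(b)
        norm = np.sqrt(np.sum(x[idx] ** 2))
        y[idx] = 0.0 if norm <= lam else x[idx] * (1.0 - lam / norm)
    return y

# S_1 applied componentwise: entries in [-1, 1] vanish, others shrink by 1.
print(soft_threshold(np.array([-3.0, 0.5, 2.0]), 1.0))   # [-2.  0.  1.]
```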

  • State-of-the-Art

    Forward-Backward, for F(x) + G(x):

    x ← prox_{γG}(x − γ∇F(x))

    Peaceman-Rachford, for ∑ₖ Gₖ(x):

    F(x) = ½‖Y − Lx‖² ⇒ prox_F(x) = (Id + L*L)⁻¹(x + L*Y)

    Preconditioned ADMM, for ∑ₖ Gₖ(Lₖx)

    Using ∇F?

    prox_F not necessarily available

    convergence speed and accelerated schemes
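The forward-backward update above can be sketched on the ℓ¹-penalized least-squares problem of the earlier slides. This is an illustrative implementation on a hypothetical random instance (names and sizes are assumptions, not the authors' code):

```python
import numpy as np

def soft_threshold(x, lam):
    """prox of lam*||.||_1: componentwise soft thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def forward_backward(Y, L, lam, n_iter=1000):
    """Forward-backward splitting for min_x 0.5*||Y - L x||^2 + lam*||x||_1.

    One iteration performs the gradient (forward) step on F followed by
    the proximal (backward) step on G:
        x <- prox_{gamma*G}(x - gamma * grad_F(x)),  grad_F(x) = L^T(Lx - Y),
    with a step size gamma < 2 / ||L||^2.
    """
    gamma = 1.0 / np.linalg.norm(L, 2) ** 2   # safe step size
    x = np.zeros(L.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - gamma * (L.T @ (L @ x - Y)), gamma * lam)
    return x

# Hypothetical instance: recover a 2-sparse vector from noiseless measurements.
rng = np.random.default_rng(0)
L = rng.standard_normal((30, 60)) / np.sqrt(30.0)
x_true = np.zeros(60)
x_true[5], x_true[25] = 2.0, -1.5
Y = L @ x_true
x_hat = forward_backward(Y, L, lam=0.1)
```

Note that the gradient step needs only applications of L and L*, whereas prox_F would require inverting Id + L*L.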

  • Extended Forward-Backward

    Algorithm for min_x F(x) + ∑ₖ₌₁ⁿ Gₖ(x)

    ∀ k ∈ ⟦1,n⟧: zₖ ← zₖ + (prox_{γGₖ}(2x − zₖ − (γ/n)∇F(x) + ε₀) + εₖ − x)

    x ← (1/n) ∑ₖ zₖ

    Theorem: Convergence with Robustness to Errors, if

    ∇F is 1/β-Lipschitz continuous;

    γ ∈ ]0, 2nβ[;

    the set of minimizers is non-empty;

    ∀ k, the errors εₖ are summable over the iterations, ∑ ‖εₖ‖ < +∞.

    Hybrid Forward-Backward/Peaceman-Rachford

    If n = 1, then x = z = z₁ and this is forward-backward.

    If F ≡ 0, this is a special case of Peaceman-Rachford.
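The iteration above can be sketched with generic Python callables for ∇F and the prox_{γGₖ} operators (error terms εₖ omitted). This is an illustration of the slide's update on a hypothetical instance, not the authors' reference implementation:

```python
import numpy as np

def soft_threshold(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def extended_forward_backward(grad_F, proxs, p, gamma, n_iter=2000):
    """Sketch of the slide's iteration for min_x F(x) + sum_k G_k(x).

    `proxs` is a list of callables prox_k(u, g) = prox_{g*G_k}(u); per the
    slide, each auxiliary variable z_k is updated by
        z_k <- z_k + prox_{gamma*G_k}(2x - z_k - (gamma/n)*grad_F(x)) - x,
    then x <- (1/n) * sum_k z_k.
    """
    n = len(proxs)
    z = [np.zeros(p) for _ in range(n)]
    x = np.zeros(p)
    for _ in range(n_iter):
        g = grad_F(x)
        for k in range(n):
            z[k] = z[k] + proxs[k](2.0 * x - z[k] - (gamma / n) * g, gamma) - x
        x = sum(z) / n
    return x

# Hypothetical instance: F(x) = 0.5*||x - a||^2, G_1 = lam*||x||_1 and
# G_2 = indicator of the nonnegative orthant; the joint minimizer is
# max(a - lam, 0) componentwise.
a = np.array([2.0, -1.0, 0.3])
lam = 0.5
proxs = [lambda u, g: soft_threshold(u, g * lam),   # prox of g*G_1
         lambda u, g: np.maximum(u, 0.0)]           # prox of g*G_2 (projection)
x_hat = extended_forward_backward(lambda x: x - a, proxs, p=3, gamma=1.0)
```

At a fixed point, summing the n inclusions gives 0 ∈ ∇F(x) + ∑ₖ ∂Gₖ(x), so fixed points are exactly the minimizers; here ∇F is 1-Lipschitz (β = 1), so γ = 1 lies inside ]0, 2nβ[.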

  • Numerical Experiments

    Deconvolution: min_x ½‖Y − K∗Ψx‖² + ∑ₖ₌₁¹⁶ ‖x‖_{Bₖ,1,2}

    [Plot: log₁₀(E − E_min) against iteration # for ExtFB, PR and ADMM; running times t_EFB = 1.35e+04 s, t_PR = 1.39e+04 s, t_ADMM = 1.35e+04 s]

  • Numerical Experiments

    Deconvolution + Inpainting: min_x ½‖Y − P_Ω K∗Ψx‖² + ∑ₖ₌₁¹⁶ ‖x‖_{Bₖ,1,2}

    [Plot: log₁₀(E − E_min) against iteration # for ExtFB, PR and ADMM; running times t_EFB = 1.32e+04 s, t_PR = 1.88e+04 s, t_ADMM = 1.13e+04 s]

  • Conclusion

    Extended Forward-Backward: enlarges the class of convex optimization problems which can be solved in a “reasonable amount of time”; robust to errors.

    Future Work: convergence speed; accelerated schemes.

  • Acknowledgements

    Gabriel Peyré (CEREMADE), for believing that such an extended forward-backward was possible.

    Jalal Fadili (ENSICAEN), for giving the impression of knowing everything about convex optimization.

    Nicolas Schmidt (CEREMADE), for mathematical rigor and friendship.