Complexity Analysis beyond Convex Optimization - Stanford University


Transcript of Complexity Analysis beyond Convex Optimization - Stanford University

Page 1: Complexity Analysis beyond Convex Optimization - Stanford University

Complexity Analysis beyond Convex Optimization

Yinyu Ye

K. T. Li Professor of Engineering, Department of Management Science and Engineering

Stanford University

http://www.stanford.edu/~yyye

August 1, 2013

Yinyu Ye ICCOPT 2013

Page 2: Complexity Analysis beyond Convex Optimization - Stanford University

Outline

• Applications arising from Non-Convex Regularization

• Theory of the Lp-norm Regularization

• Selected Complexity Results for Non-Convex Optimization

• High-Level Complexity Analyses for a Few Cases

• Open Questions


Page 3–5: Complexity Analysis beyond Convex Optimization - Stanford University

Unconstrained L2+Lp Minimization

Consider the problem:

    Minimize_{x ∈ R^n}  f_{2p}(x) := ‖Ax − b‖₂² + λ‖x‖_p^p   (1)

where the data are A ∈ R^{m×n}, b ∈ R^m, the parameter satisfies 0 ≤ p ≤ 1, and

    ‖x‖_p^p = Σ_{j=1}^{n} |x_j|^p.

Here ‖x‖₀ := |{j : x_j ≠ 0}|, that is, the number of nonzero entries in x.

A more general model, for q ≥ 1:

    Minimize_{x ∈ R^n}  f_{qp}(x) := ‖Ax − b‖_q^q + λ‖x‖_p^p
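For concreteness, the objective in (1) can be evaluated directly; a minimal numpy sketch (the function name and toy data are ours), showing that a sparse x pays only the Lp penalty:

```python
import numpy as np

def f2p(x, A, b, lam, p):
    """L2+Lp objective of (1): ||Ax - b||_2^2 + lam * sum_j |x_j|^p."""
    r = A @ x - b
    return r @ r + lam * np.sum(np.abs(x) ** p)

# Tiny example: x_sparse fits b exactly, so only the Lp penalty remains.
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 2.0])
x_sparse = np.array([1.0, 2.0, 0.0])
val = f2p(x_sparse, A, b, lam=0.1, p=0.5)   # 0 + 0.1*(1 + sqrt(2))
```

The same function with q ≠ 2 in the residual term would give the more general f_{qp}.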

Page 6–7: Complexity Analysis beyond Convex Optimization - Stanford University

Constrained Lp Minimization

Consider another problem:

    Minimize    Σ_{1≤j≤n} x_j^p
    Subject to  Ax = b,  x ≥ 0,   (2)

or

    Minimize    Σ_{1≤j≤n} |x_j|^p
    Subject to  Ax = b.   (3)

Page 8: Complexity Analysis beyond Convex Optimization - Stanford University

Application and Motivation

The original goal is to minimize ‖x‖₀ = |{j : x_j ≠ 0}|, the size of the support set of x, subject to Ax = b, for

• Sparse image reconstruction

• Sparse signal recovery

• Sensor network localization

which is known to be an NP-hard problem.

Page 9–10: Complexity Analysis beyond Convex Optimization - Stanford University

Approximation of ‖x‖₀

• ‖x‖₁ has been used to approximate ‖x‖₀, and the regularization can be exact under certain strong conditions (Donoho 2004, Candès and Tao 2005, etc.). This regularization model is actually a linear program.

• Theoretical and empirical computational results indicate that ‖x‖_p regularization, say with p = 0.5, has better performance under weaker conditions, and it is solvable equally efficiently in practice (Chartrand 2009, Xu et al. 2009, etc.).

Page 11–13: Complexity Analysis beyond Convex Optimization - Stanford University

The Hardness of Lp (Ge et al. 2011, Chen et al. 2012)

Question: is L2 + Lp minimization easier than L2 + L0 minimization?

Theorem. Deciding the global minimum of the Lq + Lp optimization problem is strongly NP-hard for any given 0 ≤ p < 1, q ≥ 1 and λ > 0.

Nevertheless, practitioners solve these problems using nonlinear solvers to compute a KKT solution...

Page 14: Complexity Analysis beyond Convex Optimization - Stanford University

Recovery Result: L0.5-Norm vs. L1-Norm

[Figure: Successful sparse recovery rates of L0.5 and L1 solutions for Gaussian random matrices; sparsity (0–20) on the x-axis, frequency of success (0–1) on the y-axis.]

Page 15: Complexity Analysis beyond Convex Optimization - Stanford University

Sensor Network Localization

Given a graph G = (V, E) and a set of non-negative weights, say {d_ij : (i, j) ∈ E}, the goal is to compute a realization of G in the Euclidean space R^d for a given low dimension d, i.e.

• to place the vertices of G in R^d such that

• the Euclidean distance between each pair of adjacent vertices (i, j) equals (or is bounded by) the prescribed weight d_ij.

Page 16: Complexity Analysis beyond Convex Optimization - Stanford University

Figure: 50-node 2-D Sensor Localization (points in the square [−0.5, 0.5]²).

Page 17–18: Complexity Analysis beyond Convex Optimization - Stanford University

Application to SNL

SNL-SDP:

    minimize    Σ_{(i,j)∈E} α_{ij}²
    subject to  (e_i − e_j)(e_i − e_j)^T • Y = d_{ij}² + α_{ij},  ∀ (i, j) ∈ E,
                Y ⪰ 0.

Regularized SNL-SDP:

    minimize    Σ_{(i,j)∈E} α_{ij}² + λ‖Y‖_p
    subject to  (e_i − e_j)(e_i − e_j)^T • Y = d_{ij}² + α_{ij},  ∀ (i, j) ∈ E,
                Y ⪰ 0.

Page 19–21: Complexity Analysis beyond Convex Optimization - Stanford University

Schatten p-norm (Ji et al. 2013)

For any given symmetric matrix Y ∈ S^n,

    ‖Y‖_p = ( Σ_j |λ_j(Y)|^p )^{1/p},   0 < p ≤ 1,

is known as the Schatten p-quasi-norm of Y.

When p = 1, it is called the nuclear norm.

The Schatten p-quasi-norm has several nice analytical properties that make it a natural candidate for a regularizer.
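The Schatten p-quasi-norm is directly computable from the eigenvalues of the symmetric matrix; a minimal numpy sketch of the definition above (the function name is ours):

```python
import numpy as np

def schatten_p(Y, p):
    """Schatten p-quasi-norm: (sum_j |lambda_j(Y)|^p)^(1/p), 0 < p <= 1."""
    eigvals = np.linalg.eigvalsh(Y)              # eigenvalues of symmetric Y
    return np.sum(np.abs(eigvals) ** p) ** (1.0 / p)

Y = np.diag([4.0, 1.0, 0.0])
# p = 1 recovers the nuclear norm: 4 + 1 + 0 = 5.
# p = 0.5 gives (sqrt(4) + sqrt(1))^2 = 9, which penalizes rank more aggressively.
```

Note that for 0 < p < 1 this quantity is not a norm (the triangle inequality fails), which is why it is called a quasi-norm.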

Page 22: Complexity Analysis beyond Convex Optimization - Stanford University

Recovery Result: Schatten 0.5-Norm vs. Nuclear Norm

[Two figures: recovered sensors, exact sensors, and anchors plotted in the plane, comparing the two regularizers.]

Page 23: Complexity Analysis beyond Convex Optimization - Stanford University

[Figure: Exactly recovered cases (50 sensors), 0–100, versus extra edges added (0–150), for the Trace(Y) and Trace(Y^p) regularizers.]

Page 24–25: Complexity Analysis beyond Convex Optimization - Stanford University

Theory of L2+Lp Minimization I

Theorem (the first-order bound). Let x* be any first-order KKT point and let

    L_i = ( λp / (2‖a_i‖ √f_{2p}(x*)) )^{1/(1−p)}.

Then, for any i ∈ N,  x*_i ∈ (−L_i, L_i) ⇒ x*_i = 0.

"Lower Bound Theory of Nonzero Entries in Solutions of L2-Lp Minimization" (Chen, Xu and Ye), SIAM J. Scientific Computing 32:5 (2010) 2832–2852.
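The first-order bound gives a computable screening test: any entry of a first-order KKT point whose magnitude falls strictly inside (−L_i, L_i) must be exactly zero. A hedged numpy sketch (function names and toy data are ours; x_star is assumed to be a first-order KKT point of (1)):

```python
import numpy as np

def lower_bounds(A, x_star, b, lam, p):
    """L_i = (lam*p / (2*||a_i|| * sqrt(f2p(x*))))^(1/(1-p)) for each column a_i."""
    f_val = np.sum((A @ x_star - b) ** 2) + lam * np.sum(np.abs(x_star) ** p)
    col_norms = np.linalg.norm(A, axis=0)
    return (lam * p / (2.0 * col_norms * np.sqrt(f_val))) ** (1.0 / (1.0 - p))

def screen(A, x_star, b, lam, p):
    """Flag entries strictly inside (-L_i, L_i): they must be zero at a KKT point."""
    return np.abs(x_star) < lower_bounds(A, x_star, b, lam, p)

# Toy check: the zero entry of this (assumed) KKT point is flagged.
A = np.eye(2)
x_star = np.array([1.0, 0.0])
b = np.array([1.0, 0.0])
mask = screen(A, x_star, b, lam=0.1, p=0.5)   # array([False,  True])
```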

Page 26: Complexity Analysis beyond Convex Optimization - Stanford University

Theory of L2+Lp Minimization II

Theorem (the second-order bound). Let x* be any second-order KKT point and let

    L_i = ( λp(1 − p) / (2‖a_i‖²) )^{1/(2−p)},   i ∈ N.

Then
(1) for any i ∈ N,  x*_i ∈ (−L_i, L_i) ⇒ x*_i = 0;
(2) the columns of A corresponding to the support of x* are linearly independent.

Page 27–28: Complexity Analysis beyond Convex Optimization - Stanford University

More Theoretical Developments ...

Markowitz Portfolio Model:

    minimize    (1/2) x^T Q x + c^T x
    subject to  e^T x = 1,  x ≥ 0

Regularized Markowitz Portfolio Model:

    minimize    (1/2) x^T Q x + c^T x + λ‖x‖_p^p
    subject to  e^T x = 1,  x ≥ 0

Page 29–32: Complexity Analysis beyond Convex Optimization - Stanford University

Linearly Constrained Optimization Problem

    (LCOP)   minimize    f(x)
             subject to  Ax = b,  x ≥ 0.

The first-order KKT conditions:

    x_j (∇f(x) − A^T y)_j = 0, ∀j,
    Ax = b,
    ∇f(x) − A^T y ≥ 0,  x ≥ 0.

First-order ε-KKT solution: |x_j (∇f(x) − A^T y)_j| ≤ ε for all j.

Second-order ε-KKT solution: additionally, the Hessian restricted to the null space of the active constraints is ε-positive-semidefinite.
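The first-order ε-KKT test is directly checkable given a candidate primal-dual pair (x, y); a small numpy sketch under our own naming (the objective enters only through its gradient at x):

```python
import numpy as np

def is_first_order_eps_kkt(x, y, grad, A, b, eps):
    """First-order eps-KKT test for LCOP: primal feasibility, reduced-gradient
    nonnegativity (within eps slack), and |x_j * (grad - A^T y)_j| <= eps."""
    r = grad - A.T @ y                       # reduced gradient
    feasible = np.allclose(A @ x, b) and np.all(x >= 0)
    dual_ok = np.all(r >= -eps)
    comp_ok = np.all(np.abs(x * r) <= eps)   # approximate complementarity
    return feasible and dual_ok and comp_ok
```

For instance, with f(x) = c^T x the gradient is just c, and a vertex solution with a valid multiplier passes the test.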

Page 33: Complexity Analysis beyond Convex Optimization - Stanford University

Iteration Bound for an ε-KKT Solution

    log log(ε⁻¹):    Ball-IQP [Y 1992]
    ε⁻¹ log(ε⁻¹):    IQP [Y 1998]; Constrained Lp [Ge et al 2011]
    ε⁻³/²:           [Nesterov et al 2006]; [Cartis et al 2011]; [Bian et al 2012]
    ε⁻²:             [Vavasis 1991]; [Nesterov 2004]; [Gratton et al 2008]; [Cartis et al 2011]; [Bian et al 2012]
    ε⁻³ log(ε⁻¹):    [Garmanjani et al 2012]

Table: Selected worst-case complexity results for nonconvex optimization (the original table separates smooth, Lipschitz, and non-Lipschitz objectives).

Page 34–36: Complexity Analysis beyond Convex Optimization - Stanford University

Ball or Sphere-Constrained Indefinite QP

    (BQP)   minimize    (1/2) x^T Q x + c^T x
            subject to  ‖x‖² = (≤) 1.

The solution x of problem (BQP) satisfies the following necessary and sufficient conditions (S-Lemma):

    (Q + μI) x = −c,
    (Q + μI) ⪰ 0,
    and ‖x‖ = 1.

This is an SDP problem, and the simplest trust-region sub-problem (Moré, Sorensen, Dennis and Schnabel, etc., 1980).

Page 37–40: Complexity Analysis beyond Convex Optimization - Stanford University

The Bisection Method

For any μ > −λ(Q), where λ(Q) is the minimal eigenvalue of Q, denote by x(μ) the solution to

    (Q + μI) x = −c.

Assume ‖x(−λ(Q))‖ > 1 (the other case can be handled easily) and note that

    μ ≤ −λ(Q) + ‖c‖.

Thus, one can apply the bisection method to find the right μ and solve the problem in a polynomial number, log(ε⁻¹), of steps.

We can do it in log-polynomial time using Steve Smale's 1986 work on Newton's method ...
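This bisection scheme fits in a few lines of numpy. The sketch below (function name ours) treats only the easy case assumed on the slide, ‖x(−λ(Q))‖ > 1, where ‖x(μ)‖ is strictly decreasing on the bracket (−λ(Q), −λ(Q) + ‖c‖]; the hard case is not handled:

```python
import numpy as np

def bqp_bisection(Q, c, tol=1e-10, max_iter=200):
    """Bisect on mu to solve (Q + mu*I)x = -c with ||x|| = 1, (Q + mu*I) >= 0.
    Assumes the easy case ||x(-lambda(Q))|| > 1; the hard case is not treated."""
    lam_min = np.linalg.eigvalsh(Q)[0]               # minimal eigenvalue of Q
    lo, hi = -lam_min, -lam_min + np.linalg.norm(c)  # bracket from the slide
    for _ in range(max_iter):
        mu = 0.5 * (lo + hi)
        x = np.linalg.solve(Q + mu * np.eye(len(c)), -c)
        nrm = np.linalg.norm(x)
        if abs(nrm - 1.0) < tol:
            break
        if nrm > 1.0:    # ||x(mu)|| decreases in mu, so increase mu
            lo = mu
        else:
            hi = mu
    return x, mu

# Indefinite example: Q has a negative eigenvalue, yet bisection on mu
# finds the global minimizer on the unit sphere.
Q = np.diag([-2.0, 1.0])
c = np.array([-1.0, -1.0])
x, mu = bqp_bisection(Q, c)
```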

Page 41: Complexity Analysis beyond Convex Optimization - Stanford University

Combined Bisection and Newton’s Method


Page 42–43: Complexity Analysis beyond Convex Optimization - Stanford University

Potential Reduction Algorithm for LCOP

Consider the (concave+convex) Karmarkar potential function

    φ(x) = ρ log(f(x)) − Σ_{j=1}^{n} log x_j,

where we assume that f(x) is nonnegative in the feasible region.

We start from the analytic center x⁰ of the feasible region F_p, so that if

    φ(x^k) − φ(x⁰) ≤ ρ log ε,   (4)

then

    f(x^k) / f(x⁰) ≤ ε,

which implies that x^k is an ε-global minimizer.
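The potential and the stopping test (4) are simple to evaluate; a small numpy sketch under our own naming (f may be any objective that is nonnegative on the feasible region):

```python
import numpy as np

def potential(x, f_val, rho):
    """Karmarkar potential phi(x) = rho*log(f(x)) - sum_j log(x_j), for x > 0."""
    return rho * np.log(f_val) - np.sum(np.log(x))

def eps_global_reached(x_k, f_k, x0, f0, rho, eps):
    """Stopping test (4): phi(x_k) - phi(x0) <= rho*log(eps)."""
    return potential(x_k, f_k, rho) - potential(x0, f0, rho) <= rho * np.log(eps)
```

Since x⁰ maximizes Σ_j log x_j over the feasible region, the barrier part of φ(x^k) − φ(x⁰) is nonnegative, which is what turns (4) into the bound f(x^k)/f(x⁰) ≤ ε.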

Page 44–45: Complexity Analysis beyond Convex Optimization - Stanford University

Quadratic Over-Estimate of Potential Function I

Consider

    f(x) = q(x) = (1/2) x^T Q x + c^T x.

Given 0 < x ∈ F_p, let Δ = q(x) and let d_x, with A d_x = 0, be a vector such that x⁺ := x + d_x > 0. Then the non-convex part satisfies

    ρ log(q(x⁺)) − ρ log(q(x))
      = ρ log(Δ + (1/2) d_x^T Q d_x + (Qx + c)^T d_x) − ρ log Δ
      = ρ log(1 + ((1/2) d_x^T Q d_x + (Qx + c)^T d_x) / Δ)
      ≤ (ρ/Δ) ((1/2) d_x^T Q d_x + (Qx + c)^T d_x).

Page 46–47: Complexity Analysis beyond Convex Optimization - Stanford University

Quadratic Over-Estimate of Potential Function II

On the other hand, if ‖X⁻¹ d_x‖ ≤ β < 1, the convex part satisfies

    − Σ_{j=1}^{n} log(x⁺_j) + Σ_{j=1}^{n} log(x_j) ≤ − e^T X⁻¹ d_x + β² / (2(1 − β)).

Thus, if ‖X⁻¹ d_x‖ ≤ β < 1, then x⁺ = x + d_x > 0 and

    φ(x⁺) − φ(x) ≤ (ρ/Δ) ( (1/2) d_x^T Q d_x + (Qx + c − (Δ/ρ) X⁻¹ e)^T d_x ) + β² / (2(1 − β)).   (5)
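The logarithmic-barrier inequality used for the convex part can be spot-checked numerically; a quick sketch (random data and variable names ours) verifying −Σ log(x⁺_j) + Σ log(x_j) ≤ −e^T X⁻¹ d_x + β²/(2(1−β)) when ‖X⁻¹ d_x‖ = β < 1:

```python
import numpy as np

rng = np.random.default_rng(0)
beta = 0.3
x = rng.uniform(0.5, 2.0, size=5)        # current interior point, x > 0
d = rng.normal(size=5)
d *= beta / np.linalg.norm(d / x)        # scale so ||X^{-1} d|| = beta exactly

# Since each |d_j/x_j| <= beta < 1, the step keeps x + d > 0 and the
# barrier over-estimate guarantees lhs <= rhs.
lhs = -np.sum(np.log(x + d)) + np.sum(np.log(x))
rhs = -np.sum(d / x) + beta ** 2 / (2.0 * (1.0 - beta))
```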

Page 48–49: Complexity Analysis beyond Convex Optimization - Stanford University

A Ball-Constrained Quadratic Subproblem I

We solve the following problem at the kth iteration:

    minimize    (1/2) d_x^T Q d_x + (Q x^k + c − (Δ^k/ρ) (X^k)⁻¹ e)^T d_x
    subject to  A d_x = 0,
                ‖(X^k)⁻¹ d_x‖² ≤ β².

Using the affine scaling, this problem can be reduced to the ball-constrained quadratic program, where the radius of the ball is β.

Page 50–52: Complexity Analysis beyond Convex Optimization - Stanford University

Complexity Analysis

Each iteration either makes a constant reduction of the potential, or not. In the latter case, the new iterate x⁺ becomes a second-order ε-KKT solution with a suitable choice of ρ.

Theorem. Let β = 1/3 and ρ = 3q(x⁰)/ε. Then the potential reduction algorithm returns a second-order ε-KKT solution or global minimizer in no more than O((q(x⁰)/ε) log(q(x⁰)/ε)) iterations.

This type of algorithm is called a fully polynomial-time approximation scheme.

Page 53–55: Complexity Analysis beyond Convex Optimization - Stanford University

The PRA for Concave Minimization

The case when f(x) is a concave function is even easier.

Let x⁺ = x + d_x. Then we have a linear over-estimate of the potential function:

    φ(x⁺) − φ(x) ≤ ( (ρ/f(x)) ∇f(x)^T − e^T X⁻¹ ) d_x + β² / (2(1 − β)),

as long as ‖X⁻¹ d_x‖ ≤ β.

Let the affine-scaling direction be d′ = X⁻¹ d_x. Then one can solve:

    z(d′) := Minimize    ( (ρ/f(x)) ∇f(x)^T X − e^T ) d′
             Subject to  A X d′ = 0,
                         ‖d′‖² ≤ β².

Page 56–57: Complexity Analysis beyond Convex Optimization - Stanford University

Affine Scaling Direction

The optimal direction of the affine-scaling sub-problem is given by

    d′ = (β / ‖p(x)‖) · p(x),

where

    p(x) = − (I − X A^T (A X² A^T)⁻¹ A X) ( (ρ/f(x)) X ∇f(x) − e )
         = e − (ρ/f(x)) X (∇f(x) − A^T y).

And the minimal value of the sub-problem is

    z(d′) = −β · ‖p(x)‖.
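The projected direction above is a few lines of linear algebra; a numpy sketch (function name ours; the multiplier y is implicit in the projection):

```python
import numpy as np

def affine_scaling_direction(x, grad, f_val, A, rho, beta):
    """Compute p(x) = -(I - X A^T (A X^2 A^T)^{-1} A X)((rho/f) X grad - e),
    the step d' = beta * p/||p||, and the optimal value z = -beta * ||p||."""
    X = np.diag(x)
    g = (rho / f_val) * (X @ grad) - np.ones_like(x)   # scaled objective vector
    AX = A @ X
    # Orthogonal projector onto the null space of A X.
    P = np.eye(len(x)) - AX.T @ np.linalg.solve(AX @ AX.T, AX)
    p = -P @ g
    d_prime = beta * p / np.linalg.norm(p)
    z = -beta * np.linalg.norm(p)
    return p, d_prime, z

A = np.array([[1.0, 1.0, 1.0]])
x = np.array([1.0, 2.0, 1.0])
grad = np.array([1.0, 0.0, 1.0])
p, d_prime, z = affine_scaling_direction(x, grad, f_val=3.0, A=A, rho=2.0, beta=0.3)
```

Since p is the (negated) projection of the objective vector, d′ lies in the null space of AX and attains the stated minimal value −β‖p(x)‖.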

Page 58–59: Complexity Analysis beyond Convex Optimization - Stanford University

Complexity Analysis I

If ‖p(x)‖ ≥ 1, then the minimal objective value of the affine-scaling sub-problem is at most −β, so that

    φ(x⁺) − φ(x) < −β + β² / (2(1 − β)).

Thus, the potential value is reduced by a constant for a suitable choice of β.

If this case held for O(ρ log(f(x⁰)/ε)) iterations, we would have produced an ε-global minimizer of LCOP.

Page 60: Complexity Analysis beyond Convex Optimization - Stanford University

Complexity Analysis II

On the other hand, if ‖p(x)‖ < 1, then from

    p(x) = e − (ρ/f(x)) X (∇f(x) − A^T y),

we must have

    (ρ/f(x)) X (∇f(x) − A^T y) ≥ 0

and

    (ρ/f(x)) X (∇f(x) − A^T y) ≤ 2e.

Page 61–62: Complexity Analysis beyond Convex Optimization - Stanford University

Complexity Analysis III

In other words,

    ∇f(x) − A^T y ≥ 0

and

    x_j (∇f(x) − A^T y)_j < 2 f(x)/ρ,   ∀j.

The first condition indicates that the Lagrange multiplier y is valid, and the second inequality implies that the complementarity condition is approximately satisfied when ρ is chosen sufficiently large.

Page 63–64: Complexity Analysis beyond Convex Optimization - Stanford University

Complexity Analyses IV

In particular, if we choose ρ ≥ 2f(x⁰)/ε, then

    ‖X (∇f(x) − A^T y)‖_∞ ≤ ε,

which implies that x is a first-order ε-KKT solution.

Theorem. If the objective function is concave, the algorithm will provably return a first-order ε-KKT solution of LCOP in no more than O((f(x⁰)/ε) log(f(x⁰)/ε)) iterations, for any given ε < 1.

Page 65–69: Complexity Analysis beyond Convex Optimization - Stanford University

More Applications and Questions

• Could the time bound be further improved for QP or concave minimization?

• Would the O(ε⁻¹ log ε⁻¹) bound be applicable to more general non-convex optimization problems?

• We are not even able to prove the O(ε⁻¹ log ε⁻¹) bound when f(x) = q(x) + ‖x‖_p^p.

• More structural properties of the final KKT solution.

• Applications to general sparse-solution optimization, such as cardinality-constrained portfolio selection, sparse pricing reduction for revenue management, etc.