
Complexity Analysis beyond Convex Optimization

Yinyu Ye

K. T. Li Professor of Engineering, Department of Management Science and Engineering

Stanford University

http://www.stanford.edu/~yyye

August 1, 2013

Yinyu Ye ICCOPT 2013

Outline

• Applications arising from Non-Convex Regularization

• Theory of the Lp-norm Regularization

• Selected Complexity Results for Non-Convex Optimization

• High-Level Complexity Analyses for a Few Cases

• Open Questions



Unconstrained L2+Lp Minimization

Consider the problem:

Minimize_{x ∈ R^n}  f_2p(x) := ‖Ax − b‖_2^2 + λ‖x‖_p^p   (1)

where data A ∈ R^{m×n}, b ∈ R^m, parameter 0 ≤ p ≤ 1, and

‖x‖_p^p = Σ_{j=1}^n |x_j|^p.

‖x‖_0 := |{j : x_j ≠ 0}|, that is, the number of nonzero entries in x.

A more general model: for q ≥ 1,

Minimize_{x ∈ R^n}  f_qp(x) := ‖Ax − b‖_q^q + λ‖x‖_p^p.
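As a concrete illustration of model (1) and its generalization, here is a minimal numerical sketch of the objective (NumPy assumed; the function name `f_qp` is my own):

```python
import numpy as np

def f_qp(x, A, b, lam, q=2.0, p=0.5):
    """Evaluate ||Ax - b||_q^q + lam * ||x||_p^p; q = 2 recovers f_2p."""
    fit = np.sum(np.abs(A @ x - b) ** q)
    reg = lam * np.sum(np.abs(x) ** p)
    return fit + reg

# A sparse vector pays a smaller Lp penalty than a dense one of equal L2 norm.
A, b = np.eye(3), np.zeros(3)
x_sparse = np.array([1.0, 0.0, 0.0])
x_dense = np.full(3, 1.0 / np.sqrt(3.0))
assert f_qp(x_sparse, A, b, lam=1.0) < f_qp(x_dense, A, b, lam=1.0)
```

The assertion illustrates why the Lp penalty with p < 1 promotes sparsity: it charges dense vectors more than sparse vectors of the same Euclidean length.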


Constrained Lp Minimization

Consider another problem:

Minimize  Σ_{1≤j≤n} x_j^p
Subject to  Ax = b, x ≥ 0,   (2)

or

Minimize  Σ_{1≤j≤n} |x_j|^p
Subject to  Ax = b.   (3)

Application and Motivation

The original goal is to minimize ‖x‖_0 = |{j : x_j ≠ 0}|, the size of the support set of x, such that Ax = b, for

• Sparse image reconstruction

• Sparse signal recovery

• Sensor network localization

which is known to be an NP-hard problem.



Approximation of ‖x‖0

• ‖x‖_1 has been used to approximate ‖x‖_0, and the regularization can be exact under certain strong conditions (Donoho 2004, Candes and Tao 2005, etc.). This regularization model is actually a linear program.

• Theoretical and empirical computational results indicate that ‖x‖_p regularization, say p = 0.5, has better performance under weaker conditions, and it is solvable equally efficiently in practice (Chartrand 2009, Xu et al. 2009, etc.).



The Hardness of Lp (Ge et al. 2011, Chen et al. 2012)

Question: is L2+Lp minimization easier than L2+L0 minimization?

Theorem. Deciding the global minimum of the Lq+Lp optimization problem is strongly NP-hard for any given 0 ≤ p < 1, q ≥ 1 and λ > 0.

Nevertheless, practitioners solve these problems using non-linear solvers to compute a KKT solution...


Recover Result: L0.5-Norm vs. L1-Norm

Figure: Successful sparse recovery rates of L0.5 and L1 solutions with Gaussian random matrices (x-axis: sparsity; y-axis: frequency of success).

Sensor Network Localization

Given a graph G = (V, E) and a set of non-negative weights {d_ij : (i, j) ∈ E}, the goal is to compute a realization of G in the Euclidean space R^d for a given low dimension d, i.e.,

• to place the vertexes of G in R^d such that

• the Euclidean distance between each pair of adjacent vertexes (i, j) ∈ E equals (or is bounded by) the prescribed weight d_ij.



Figure: 50-node 2-D Sensor Localization



Application to SNL

SNL-SDP:

minimize  Σ_{(i,j)∈E} α_ij^2
subject to  (e_i − e_j)(e_i − e_j)^T • Y = d_ij^2 + α_ij, ∀ (i, j) ∈ E,
            Y ⪰ 0.

Regularized SNL-SDP:

minimize  Σ_{(i,j)∈E} α_ij^2 + λ‖Y‖_p
subject to  (e_i − e_j)(e_i − e_j)^T • Y = d_ij^2 + α_ij, ∀ (i, j) ∈ E,
            Y ⪰ 0.



Schatten p-norm (Ji et al. 2013)

For any given symmetric matrix Y ∈ S^n,

‖Y‖_p = ( Σ_j |λ_j(Y)|^p )^{1/p},  0 < p ≤ 1,

is known as the Schatten p-quasi-norm of Y.

When p = 1, it is called the nuclear norm.

The Schatten p-quasi-norm has several nice analytical properties that make it a natural candidate for a regularizer.
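The definition above can be computed directly from the eigenvalues of Y; a minimal sketch (NumPy assumed, the name `schatten_p` is mine):

```python
import numpy as np

def schatten_p(Y, p):
    """Schatten p-quasi-norm (sum_j |lambda_j(Y)|^p)^(1/p) of a symmetric Y."""
    eigvals = np.linalg.eigvalsh(Y)  # real eigenvalues of the symmetric matrix
    return np.sum(np.abs(eigvals) ** p) ** (1.0 / p)

Y = np.diag([4.0, 1.0, 0.0])
assert np.isclose(schatten_p(Y, 1.0), 5.0)  # p = 1: the nuclear norm, 4 + 1
assert np.isclose(schatten_p(Y, 0.5), 9.0)  # (sqrt(4) + sqrt(1))^2
```

Note how p = 0.5 penalizes two moderate eigenvalues more than one large eigenvalue of the same total trace, which is the low-rank-promoting effect used in the regularized SNL-SDP.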


Recover Result: Schatten 0.5-Norm vs. Nuclear Norm

Figure: Recovered sensor positions vs. exact sensor positions (with anchors), comparing the Schatten 0.5-norm and nuclear-norm regularizations.

Figure: Exactly recovered cases (50 sensors) vs. number of extra edges added, for the Trace(Y) and Trace(Y^p) regularizations.


Theory of L2+Lp Minimization I

Theorem (The first-order bound). Let x* be any first-order KKT point and let

L_i = ( λp / (2‖a_i‖ √f_2p(x*)) )^{1/(1−p)}.

Then, for any i ∈ N, x*_i ∈ (−L_i, L_i) ⇒ x*_i = 0.

"Lower Bound Theory of Nonzero Entries in Solutions of L2-Lp Minimization" (Chen, Xu and Y), SIAM J. Scientific Computing 32:5 (2010), 2832-2852.
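One way to use the theorem numerically is to compute each L_i at a computed KKT point and set to exact zero any entry strictly inside (−L_i, L_i). A hedged sketch (NumPy assumed; `prune_kkt_point` is my own name, and a_i is taken to be the ith column of A):

```python
import numpy as np

def prune_kkt_point(x_star, A, b, lam, p):
    """Apply the first-order lower bound: any KKT-point entry with
    |x*_i| < L_i, L_i = (lam*p / (2*||a_i||*sqrt(f_2p(x*))))^(1/(1-p)),
    must be exactly zero, so numerically tiny entries can be cleaned up."""
    f_val = np.sum((A @ x_star - b) ** 2) + lam * np.sum(np.abs(x_star) ** p)
    col_norms = np.linalg.norm(A, axis=0)          # ||a_i|| per column
    L = (lam * p / (2.0 * col_norms * np.sqrt(f_val))) ** (1.0 / (1.0 - p))
    x = x_star.copy()
    x[np.abs(x) < L] = 0.0
    return x

# A numerically tiny entry falls below its bound and is set to exact zero.
A, b = np.eye(2), np.array([1.0, 0.0])
x = prune_kkt_point(np.array([1.0, 1e-12]), A, b, lam=1.0, p=0.5)
assert x[1] == 0.0 and x[0] == 1.0
```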


Theory of L2+Lp Minimization II

Theorem (The second-order bound). Let x* be any second-order KKT point and let

L_i = ( λp(1−p) / (2‖a_i‖^2) )^{1/(2−p)},  i ∈ N.

Then

(1) for any i ∈ N, x*_i ∈ (−L_i, L_i) ⇒ x*_i = 0;

(2) the columns of A corresponding to the support of x* are linearly independent.



More Theoretical Developments ...

Markowitz Portfolio Model:

minimize  (1/2) x^T Q x + c^T x
subject to  e^T x = 1, x ≥ 0

Regularized Markowitz Portfolio Model:

minimize  (1/2) x^T Q x + c^T x + λ‖x‖_p^p
subject to  e^T x = 1, x ≥ 0



Linearly Constrained Optimization Problem

(LCOP)  minimize  f(x)
        subject to  Ax = b, x ≥ 0.

The first-order KKT conditions:

x_j (∇f(x) − A^T y)_j = 0, ∀j,
Ax = b,
∇f(x) − A^T y ≥ 0, x ≥ 0.

First-order ε-KKT solution: |x_j (∇f(x) − A^T y)_j| ≤ ε for all j. Second-order ε-KKT solution: additionally, the Hessian in the null space of the active constraints is ε-positive-semidefinite.
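The first-order ε-KKT test is easy to check numerically; a minimal sketch (NumPy assumed, function name mine):

```python
import numpy as np

def eps_kkt_residual(x, y, grad_f, A):
    """Largest complementarity violation max_j |x_j * (grad f(x) - A^T y)_j|."""
    reduced = grad_f - A.T @ y   # reduced (Lagrangian) gradient in x
    return np.max(np.abs(x * reduced))

# If grad f(x) = A^T y exactly, complementarity holds and the residual is 0.
A = np.array([[1.0, 1.0]])
y = np.array([2.0])
x = np.array([0.5, 0.5])
grad_f = np.array([2.0, 2.0])
assert eps_kkt_residual(x, y, grad_f, A) == 0.0
```

A point (x, y) is then a first-order ε-KKT solution when this residual is at most ε (together with feasibility and ∇f(x) − A^T y ≥ 0).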


Iteration Bound for an ε-KKT Solution

(Columns in the original table classify the objective as Smooth, Lipschitz, or Non-Lipschitz.)

Iteration bound   References
log log(ε⁻¹)      Ball-IQP [Y 1992]
ε⁻¹ log(ε⁻¹)      IQP [Y 1998]; Constrained Lp [Ge et al. 2011]
ε⁻³ᐟ²             [Nesterov et al. 2006]; [Bian et al. 2012]; [Cartis et al. 2011]
ε⁻²               [Vavasis 1991]; [Nesterov 2004]; [Gratton et al. 2008]; [Cartis et al. 2011]; [Bian et al. 2012]
ε⁻³ log(ε⁻¹)      [Garmanjani et al. 2012]

Table: Selected worst-case complexity results for nonconvex optimization



Ball or Sphere-Constrained Indefinite QP

(BQP)  minimize  (1/2) x^T Q x + c^T x
       subject to  ‖x‖² = (≤) 1.

The solution x of problem (BQP) satisfies the following necessary and sufficient conditions (S-Lemma):

(Q + μI) x = −c,  (Q + μI) ⪰ 0,  and ‖x‖ = 1.

This is an SDP problem, and the simplest trust-region sub-problem (Moré, Sorensen, Dennis and Schnabel, etc., 1980).



The Bisection Method

For any μ > −λ(Q), where λ(Q) is the minimal eigenvalue of Q, denote by x(μ) the solution to

(Q + μI) x = −c.

Assume ‖x(−λ(Q))‖ > 1 (the other case can be handled easily) and note that

μ ≤ −λ(Q) + ‖c‖.

Thus, one can apply the bisection method to find the right μ and solve the problem in polynomial time, in log(ε⁻¹) steps.

We can do it in log-polynomial time using Steve Smale's 1986 work on Newton's method ...
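The bisection step can be sketched as follows (NumPy assumed; `ball_qp_bisection` is my own name; the sketch covers the sphere-constrained case under the slide's assumption that ‖x(μ)‖ > 1 near μ = −λ(Q)):

```python
import numpy as np

def ball_qp_bisection(Q, c, tol=1e-10):
    """Sphere-constrained indefinite QP by bisection on mu:
    find mu in (-lambda_min(Q), -lambda_min(Q) + ||c||] with ||x(mu)|| = 1,
    where (Q + mu I) x(mu) = -c and ||x(mu)|| decreases as mu grows."""
    n = Q.shape[0]
    lam_min = np.linalg.eigvalsh(Q)[0]
    lo, hi = -lam_min, -lam_min + np.linalg.norm(c)
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        x = np.linalg.solve(Q + mu * np.eye(n), -c)
        if np.linalg.norm(x) > 1.0:
            lo = mu            # x(mu) still outside the sphere: increase mu
        else:
            hi = mu            # x(mu) inside: decrease mu
    return np.linalg.solve(Q + hi * np.eye(n), -c), hi

# Q indefinite: the global minimizer lies on the sphere, here at mu = 5.
Q = np.diag([-2.0, 1.0])
c = np.array([-3.0, 0.0])
x, mu = ball_qp_bisection(Q, c)
assert abs(mu - 5.0) < 1e-6 and abs(np.linalg.norm(x) - 1.0) < 1e-6
```

Each bisection step halves the interval of length at most ‖c‖, which is the source of the log(ε⁻¹) iteration count quoted above.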


Combined Bisection and Newton’s Method



Potential Reduction Algorithm for LCOP

Consider the (concave+convex) Karmarkar potential function

φ(x) = ρ log(f(x)) − Σ_{j=1}^n log x_j,

where we assume that f(x) is nonnegative in the feasible region. We start from the analytic center x⁰ of the feasible region F_p, so that if

φ(x^k) − φ(x⁰) ≤ ρ log ε,   (4)

then

f(x^k)/f(x⁰) ≤ ε,

which implies that x^k is an ε-global minimizer.
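A minimal numerical sketch of the potential (NumPy assumed, function name mine):

```python
import numpy as np

def karmarkar_potential(x, f_val, rho):
    """phi(x) = rho * log(f(x)) - sum_j log(x_j); defined for x > 0, f(x) > 0."""
    return rho * np.log(f_val) - np.sum(np.log(x))

# At x = (1, 1) the barrier term vanishes, so phi = rho * log(f(x)).
assert np.isclose(karmarkar_potential(np.array([1.0, 1.0]), np.e, rho=2.0), 2.0)
```

Since the analytic center x⁰ maximizes Σ_j log x_j, the barrier part of φ(x^k) − φ(x⁰) is nonnegative, so a drop of ρ log ε in φ forces ρ log(f(x^k)/f(x⁰)) ≤ ρ log ε, which is exactly implication (4).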



Quadratic Over-Estimate of Potential Function I

Consider

f(x) = q(x) = (1/2) x^T Q x + c^T x.

Given 0 < x ∈ F_p, let Δ = q(x) and let d_x, with A d_x = 0, be a vector such that x⁺ := x + d_x > 0. Then the non-convex part satisfies

ρ log(q(x⁺)) − ρ log(q(x))
  = ρ log(Δ + (1/2) d_x^T Q d_x + (Qx + c)^T d_x) − ρ log Δ
  = ρ log(1 + ((1/2) d_x^T Q d_x + (Qx + c)^T d_x)/Δ)
  ≤ (ρ/Δ) ((1/2) d_x^T Q d_x + (Qx + c)^T d_x),

using log(1 + t) ≤ t.



Quadratic Over-Estimate of Potential Function II

On the other hand, if ‖X⁻¹ d_x‖ ≤ β < 1, the convex part satisfies

− Σ_{j=1}^n log(x⁺_j) + Σ_{j=1}^n log(x_j) ≤ −e^T X⁻¹ d_x + β² / (2(1−β)).

Thus, if ‖X⁻¹ d_x‖ ≤ β < 1, then x⁺ = x + d_x > 0 and

φ(x⁺) − φ(x) ≤ (ρ/Δ) ((1/2) d_x^T Q d_x + (Qx + c − (Δ/ρ) X⁻¹ e)^T d_x) + β² / (2(1−β)).   (5)



A Ball-Constrained Quadratic Subproblem I

We solve the following problem at the kth iteration:

minimize  (1/2) d_x^T Q d_x + (Q x^k + c − (Δ^k/ρ) (X^k)⁻¹ e)^T d_x
subject to  A d_x = 0,
            ‖(X^k)⁻¹ d_x‖² ≤ β².

Using the affine scaling, this problem can be reduced to the ball-constrained quadratic program, where the radius of the ball is β.



Complexity Analysis

Each iteration either makes a constant reduction of the potential, or it does not.

In the latter case, the new iterate x⁺ becomes a second-order ε-KKT solution with a suitable choice of ρ.

Theorem. Let β = 1/3 and ρ = 3q(x⁰)/ε. Then the potential reduction algorithm returns a second-order ε-KKT solution or global minimizer in no more than O((q(x⁰)/ε) log(q(x⁰)/ε)) iterations.

This type of algorithm is called a fully polynomial-time approximation scheme.



The PRA for Concave Minimization

The case where f(x) is a concave function is even easier.

Let x⁺ = x + d_x. Then we have a linear over-estimate of the potential function:

φ(x⁺) − φ(x) ≤ ((ρ/f(x)) ∇f(x)^T − e^T X⁻¹) d_x + β² / (2(1−β)),

as long as ‖X⁻¹ d_x‖ ≤ β.

Let the affine-scaling direction be d′ = X⁻¹ d_x. Then one can solve:

z(d′) := Minimize  ((ρ/f(x)) ∇f(x)^T X − e^T) d′
         Subject to  A X d′ = 0, ‖d′‖² ≤ β².



Affine Scaling Direction

The optimal direction of the affine-scaling sub-problem is given by

d′ = (β/‖p(x)‖) · p(x),

where

p(x) = −(I − X A^T (A X² A^T)⁻¹ A X) ((ρ/f(x)) X ∇f(x) − e)
     = e − (ρ/f(x)) X (∇f(x) − A^T y).

And the minimal value of the sub-problem is

z(d′) = −β · ‖p(x)‖.
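A hedged sketch of the projection formula for p(x) and the resulting step (NumPy assumed; the function name is mine). The step has scaled length exactly β and stays in the null space of AX:

```python
import numpy as np

def affine_scaling_direction(x, grad_f, f_val, A, rho, beta):
    """Compute p(x) = -(I - X A^T (A X^2 A^T)^{-1} A X)((rho/f(x)) X grad_f - e)
    and the optimal step d' = (beta / ||p(x)||) p(x) of the sub-problem."""
    X = np.diag(x)
    g = (rho / f_val) * (X @ grad_f) - np.ones_like(x)
    AX = A @ X
    # Orthogonal projector onto the null space of A X.
    P = np.eye(len(x)) - AX.T @ np.linalg.solve(AX @ AX.T, AX)
    p = -P @ g
    d_prime = (beta / np.linalg.norm(p)) * p
    return p, d_prime

# The step is feasible (A X d' = 0) and has scaled length exactly beta.
A = np.array([[1.0, 1.0]])
x = np.array([0.5, 0.5])
p, d = affine_scaling_direction(x, grad_f=np.array([1.0, 3.0]),
                                f_val=1.0, A=A, rho=2.0, beta=0.3)
assert np.isclose(np.linalg.norm(d), 0.3)
assert np.allclose(A @ np.diag(x) @ d, 0.0)
```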



Complexity Analysis I

If ‖p(x)‖ ≥ 1, then the minimal objective value of the affine-scaling sub-problem is at most −β, so that

φ(x⁺) − φ(x) ≤ −β + β² / (2(1−β)).

Thus, the potential value is reduced by a constant for a suitable choice of β.

If this case held for O(ρ log(f(x⁰)/ε)) iterations, we would have produced an ε-global minimizer of LCOP.


Complexity Analysis II

On the other hand, if ‖p(x)‖ < 1, then from

p(x) = e − (ρ/f(x)) X (∇f(x) − A^T y),

we must have

(ρ/f(x)) X (∇f(x) − A^T y) ≥ 0

and

(ρ/f(x)) X (∇f(x) − A^T y) ≤ 2e.



Complexity Analysis III

In other words,

∇f(x) − A^T y ≥ 0

and

x_j (∇f(x) − A^T y)_j < 2f(x)/ρ, ∀j.

The first condition indicates that the Lagrange multiplier y is valid, and the second inequality implies that the complementarity condition is approximately satisfied when ρ is chosen sufficiently large.



Complexity Analyses IV

In particular, if we choose ρ ≥ 2f(x⁰)/ε, then

‖X(∇f(x) − A^T y)‖_∞ ≤ ε,

which implies that x is a first-order ε-KKT solution.

Theorem. If the objective function is concave, the algorithm provably returns a first-order ε-KKT solution of LCOP in no more than O((f(x⁰)/ε) log(f(x⁰)/ε)) iterations for any given ε < 1.



More Applications and Questions

• Could the time bound be further improved for QP or concave minimization?

• Would the O(ε⁻¹ log ε⁻¹) bound be applicable to more general non-convex optimization problems?

• We are not even able to prove the O(ε⁻¹ log ε⁻¹) bound when f(x) = q(x) + ‖x‖_p^p.

• More structural properties of the final KKT solution.

• Applications to general sparse-solution optimization, such as cardinality-constrained portfolio selection, sparse pricing reduction for revenue management, etc.
