Complexity Analysis beyond Convex Optimization - Stanford University
Complexity Analysis beyond Convex Optimization
Yinyu Ye
K. T. Li Professor of Engineering, Department of Management Science and Engineering
Stanford University
http://www.stanford.edu/~yyye
August 1, 2013
Yinyu Ye ICCOPT 2013
Outline
• Applications arising from Non-Convex Regularization
• Theory of the Lp-norm Regularization
• Selected Complexity Results for Non-Convex Optimization
• High-Level Complexity Analyses for a Few Cases
• Open Questions
Unconstrained L2+Lp Minimization

Consider the problem:

  minimize_{x∈R^n} f_2p(x) := ‖Ax − b‖_2^2 + λ‖x‖_p^p   (1)

where data A ∈ R^{m×n}, b ∈ R^m, parameter 0 ≤ p ≤ 1, and

  ‖x‖_p^p = Σ_{j=1}^n |x_j|^p.

Here ‖x‖_0 := |{j : x_j ≠ 0}|, that is, the number of nonzero entries in x.

A more general model: for q ≥ 1,

  minimize_{x∈R^n} f_qp(x) := ‖Ax − b‖_q^q + λ‖x‖_p^p.
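As a quick sanity check, the objective in (1) can be evaluated directly; a minimal numpy sketch (the helper name f2p is ours, not from the talk):

```python
import numpy as np

def f2p(x, A, b, lam, p):
    """Objective of model (1): ||Ax - b||_2^2 + lam * sum_j |x_j|^p."""
    r = A @ x - b
    return r @ r + lam * (np.abs(x) ** p).sum()
```

For p < 1 the regularizer is non-convex (and non-Lipschitz at 0), which is the source of the hardness discussed below.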
Constrained Lp Minimization

Consider another problem:

  minimize Σ_{1≤j≤n} x_j^p
  subject to Ax = b, x ≥ 0,   (2)

or

  minimize Σ_{1≤j≤n} |x_j|^p
  subject to Ax = b.   (3)
Application and Motivation
The original goal is to minimize ‖x‖_0 = |{j : x_j ≠ 0}|, the size of the support set of x, such that Ax = b, for
• Sparse image reconstruction
• Sparse signal recovery
• Sensor network localization
which is known to be an NP-hard problem.
Approximation of ‖x‖0

• ‖x‖_1 has been used to approximate ‖x‖_0, and the regularization can be exact under certain strong conditions (Donoho 2004, Candes and Tao 2005, etc.). This regularization model is actually a linear program.
• Theoretical and empirical computational results indicate that ‖x‖_p regularization, say p = 0.5, has better performance under weaker conditions, and it is solvable equally efficiently in practice (Chartrand 2009, Xu et al. 2009, etc.).
The Hardness of Lp (Ge et al. 2011, Chen et al. 2012)

Question: is L2 + Lp minimization easier than L2 + L0 minimization?

Theorem. Deciding the global minimum of the optimization problem Lq + Lp is strongly NP-hard for any given 0 ≤ p < 1, q ≥ 1 and λ > 0.

Nevertheless, practitioners solve them using non-linear solvers to compute a KKT solution...
Recover Result: L0.5-Norm vs. L1 Norm
Figure: Successful sparse recovery rates of L0.5 and L1 solutions on Gaussian random matrices (x-axis: sparsity; y-axis: frequency of success).
Sensor Network Localization
Given a graph G = (V, E) and a set of non-negative weights, say {d_ij : (i, j) ∈ E}, the goal is to compute a realization of G in the Euclidean space R^d for a given low dimension d, i.e.,
• to place the vertexes of G in R^d such that
• the Euclidean distance between each pair of adjacent vertexes (i, j) equals (or is bounded by) the prescribed weight d_ij.
Figure: 50-node 2-D Sensor Localization
Application to SNL

SNL-SDP:

  minimize Σ_{(i,j)∈E} α_ij^2
  subject to (e_i − e_j)(e_i − e_j)^T • Y = d_ij^2 + α_ij, ∀ (i,j) ∈ E,
             Y ⪰ 0.

Regularized SNL-SDP:

  minimize Σ_{(i,j)∈E} α_ij^2 + λ‖Y‖_p
  subject to (e_i − e_j)(e_i − e_j)^T • Y = d_ij^2 + α_ij, ∀ (i,j) ∈ E,
             Y ⪰ 0.
Schatten p-norm (Ji et al. 2013)

For any given symmetric matrix Y ∈ S^n,

  ‖Y‖_p = ( Σ_j |λ(Y)_j|^p )^{1/p},  0 < p ≤ 1,

is known as the Schatten p-quasi-norm of Y.

When p = 1, it is called the nuclear norm.

The Schatten p-quasi-norm has several nice analytical properties that make it a natural candidate for a regularizer.
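The definition above translates directly into a few lines of numpy; a minimal sketch (the helper name schatten_p is ours):

```python
import numpy as np

def schatten_p(Y, p):
    """Schatten p-quasi-norm: (sum_j |lambda_j(Y)|^p)^(1/p), 0 < p <= 1."""
    eigs = np.linalg.eigvalsh(Y)        # eigenvalues of the symmetric Y
    return float((np.abs(eigs) ** p).sum() ** (1.0 / p))
```

For p = 1 on a positive semidefinite Y this reduces to the nuclear norm (the trace), consistent with the remark above.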
Recover Result: Schatten 0.5-Norm vs. Nuclear Norm
Figure: Two panels of recovered vs. exact sensor positions (with anchors), comparing Schatten 0.5-norm and nuclear-norm regularization.
Figure: Exactly recovered cases (50 sensors) vs. number of extra edges added, comparing Trace(Y) and Trace(Yp) regularization.
Theory of L2+Lp Minimization I

Theorem (The first-order bound). Let x* be any first-order KKT point and let

  L_i = ( λp / (2‖a_i‖ √(f_2p(x*))) )^{1/(1−p)}.

Then for any i ∈ N, x*_i ∈ (−L_i, L_i) ⇒ x*_i = 0.

"Lower Bound Theory of Nonzero Entries in Solutions of L2-Lp Minimization" (Chen, Xu and Ye), SIAM J. Scientific Computing 32:5 (2010) 2832-2852.
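The first-order bound is easy to compute for a given point; a minimal sketch, assuming a_i denotes the i-th column of A and using the f_2p objective of model (1) (the helper name is ours):

```python
import numpy as np

def first_order_threshold(A, b, x_star, lam, p):
    """Per-component bound L_i = (lam*p / (2*||a_i||*sqrt(f2p(x*))))^(1/(1-p)).
    The theorem says any first-order KKT point with x*_i in (-L_i, L_i)
    must in fact have x*_i = 0."""
    f_val = np.linalg.norm(A @ x_star - b) ** 2 + lam * (np.abs(x_star) ** p).sum()
    col_norms = np.linalg.norm(A, axis=0)      # ||a_i|| for each column a_i
    return (lam * p / (2.0 * col_norms * np.sqrt(f_val))) ** (1.0 / (1.0 - p))
```

In practice such thresholds justify hard-zeroing small entries of a computed KKT point.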
Theory of L2+Lp Minimization II

Theorem (The second-order bound). Let x* be any second-order KKT point and let

  L_i = ( λp(1−p) / (2‖a_i‖^2) )^{1/(2−p)}, i ∈ N.

Then
(1) for any i ∈ N, x*_i ∈ (−L_i, L_i) ⇒ x*_i = 0;
(2) the support columns of x* are linearly independent.
More Theoretical Developments ...

Markowitz Portfolio Model:

  minimize (1/2) x^T Q x + c^T x
  subject to e^T x = 1, x ≥ 0

Regularized Markowitz Portfolio Model:

  minimize (1/2) x^T Q x + c^T x + λ‖x‖_p^p
  subject to e^T x = 1, x ≥ 0
Linearly Constrained Optimization Problem

  (LCOP)  minimize f(x)
          subject to Ax = b, x ≥ 0.

The first-order KKT conditions:

  x_j (∇f(x) − A^T y)_j = 0, ∀ j,
  Ax = b,
  ∇f(x) − A^T y ≥ 0, x ≥ 0.

First-order ε-KKT solution: |x_j (∇f(x) − A^T y)_j| ≤ ε for all j.
Second-order ε-KKT solution: additionally, the Hessian in the null space of the active constraints is ε-positive-semidefinite.
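The first-order ε-KKT conditions can be checked mechanically; a minimal sketch (a hypothetical helper, not from the talk, with feasibility also tested approximately):

```python
import numpy as np

def is_first_order_eps_kkt(x, y, grad, A, b, eps):
    """Check the first-order eps-KKT conditions for LCOP:
    Ax = b, x >= 0, grad - A^T y >= 0 (up to eps), and
    |x_j (grad - A^T y)_j| <= eps for all j."""
    r = grad - A.T @ y                       # reduced gradient
    primal_ok = np.allclose(A @ x, b) and bool(np.all(x >= -eps))
    dual_ok = bool(np.all(r >= -eps))
    comp_ok = bool(np.all(np.abs(x * r) <= eps))
    return primal_ok and dual_ok and comp_ok
```

Here grad stands for ∇f(x) at the candidate point and y for the Lagrange multiplier.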
Iteration Bound for an ε-KKT Solution

  Bound           Smooth                           Lipschitz                Non-Lipschitz
  log log(ε−1)    Ball-IQP [Y 1992]
  ε−1 log(ε−1)    IQP [Y 1998]                                              Constrained Lp [Ge et al 2011]
  ε−3/2           [Nesterov et al 2006]            [Cartis et al 2011]      [Bian et al 2012]
  ε−2             [Vavasis 1991], [Nesterov 2004]; [Cartis et al 2011];     [Bian et al 2012]
                  [Gratton et al 2008]             [Vavasis 1991]
  ε−3 log(ε−1)                                     [Garmanjani et al 2012]

Table: Selected worst-case complexity results for nonconvex optimization
Ball or Sphere-Constrained Indefinite QP

  (BQP)  minimize (1/2) x^T Q x + c^T x
         subject to ‖x‖^2 = (≤) 1.

The solution x of problem (BQP) satisfies the following necessary and sufficient conditions (S-Lemma):

  (Q + μI) x = −c,
  Q + μI ⪰ 0,

and ‖x‖ = 1.

This is an SDP problem, and the simplest trust-region sub-problem (Moré, Sorensen, Dennis and Schnabel, etc., 1980).
The Bisection Method

For any μ > −λ(Q), where λ(Q) is the minimal eigenvalue of Q, denote by x(μ) the solution to

  (Q + μI) x = −c.

Assume ‖x(−λ(Q))‖ > 1 (the other case can be handled easily) and note that

  μ ≤ −λ(Q) + ‖c‖.

Thus, one can apply the bisection method to find the right μ and solve the problem to ε accuracy in log(ε−1) steps, i.e., in polynomial time.

We can do it in log-polynomial time using Steve Smale's 1986 work on Newton's method ...
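A minimal numpy sketch of this bisection (our own helper name bqp_bisection; the hard case ‖x(−λ(Q))‖ ≤ 1 excluded above is not handled here):

```python
import numpy as np

def bqp_bisection(Q, c, tol=1e-10, max_iter=200):
    """Bisection on mu for the sphere-constrained QP: find mu > -lambda_min(Q)
    with ||(Q + mu I)^{-1}(-c)|| = 1, assuming ||x(-lambda_min(Q))|| > 1.
    Since ||x(mu)|| is decreasing in mu, and ||x(mu)|| <= 1 at the
    right endpoint mu = -lambda_min(Q) + ||c||, bisection applies."""
    lam_min = np.linalg.eigvalsh(Q)[0]
    lo, hi = -lam_min, -lam_min + np.linalg.norm(c)
    n = Q.shape[0]
    for _ in range(max_iter):
        mu = 0.5 * (lo + hi)                 # midpoint stays > -lambda_min
        x = np.linalg.solve(Q + mu * np.eye(n), -c)
        if np.linalg.norm(x) > 1.0:
            lo = mu                          # norm too large: increase mu
        else:
            hi = mu
        if hi - lo < tol:
            break
    return x, mu
```

At termination (Q + μI) is positive semidefinite and ‖x‖ ≈ 1, matching the S-Lemma conditions of the previous slide.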
Combined Bisection and Newton’s Method
Potential Reduction Algorithm for LCOP

Consider the (concave+convex) Karmarkar potential function

  φ(x) = ρ log(f(x)) − Σ_{j=1}^n log x_j,

where we assume that f(x) is nonnegative in the feasible region. We start from the analytic center x^0 of the feasible region F_p, so that if

  φ(x^k) − φ(x^0) ≤ ρ log ε,   (4)

then

  f(x^k)/f(x^0) ≤ ε,

which implies that x^k is an ε-global minimizer.
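The potential itself is a one-liner; a minimal sketch (helper name ours), useful for monitoring the per-iteration reduction argued below:

```python
import numpy as np

def karmarkar_potential(f_val, x, rho):
    """phi(x) = rho * log(f(x)) - sum_j log(x_j), for x > 0 and f(x) > 0."""
    return rho * np.log(f_val) - np.log(x).sum()
```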
Quadratic Over-Estimate of Potential Function I

Consider

  f(x) = q(x) = (1/2) x^T Q x + c^T x.

Given 0 < x ∈ F_p, let Δ = q(x) and let d_x, with A d_x = 0, be a vector such that x+ := x + d_x > 0. Then the non-convex part satisfies

  ρ log(q(x+)) − ρ log(q(x))
    = ρ log(Δ + (1/2) d_x^T Q d_x + (Qx + c)^T d_x) − ρ log Δ
    = ρ log(1 + ((1/2) d_x^T Q d_x + (Qx + c)^T d_x)/Δ)
    ≤ (ρ/Δ)((1/2) d_x^T Q d_x + (Qx + c)^T d_x).
Quadratic Over-Estimate of Potential Function II

On the other hand, if ‖X^{−1} d_x‖ ≤ β < 1, the convex part satisfies

  −Σ_{j=1}^n log(x+_j) + Σ_{j=1}^n log(x_j) ≤ −e^T X^{−1} d_x + β^2/(2(1−β)).

Thus, if ‖X^{−1} d_x‖ ≤ β < 1, then x+ = x + d_x > 0 and

  φ(x+) − φ(x) ≤ (ρ/Δ)((1/2) d_x^T Q d_x + (Qx + c − (Δ/ρ) X^{−1} e)^T d_x) + β^2/(2(1−β)).   (5)
A Ball-Constrained Quadratic Subproblem I

We solve the following problem at the kth iteration:

  minimize (1/2) d_x^T Q d_x + (Q x^k + c − (Δ^k/ρ)(X^k)^{−1} e)^T d_x
  subject to A d_x = 0,
             ‖(X^k)^{−1} d_x‖^2 ≤ β^2.

Using the affine scaling, this problem can be reduced to the ball-constrained quadratic program, where the radius of the ball is β.
Complexity Analysis

Each iteration either makes a constant reduction of the potential, or not. In the latter case, the new iterate x+ becomes a second-order ε-KKT solution with a suitable choice of ρ.

Theorem. Let β = 1/3 and ρ = 3 q(x^0)/ε. Then the potential reduction algorithm returns a second-order ε-KKT solution or global minimizer in no more than O((q(x^0)/ε) log(q(x^0)/ε)) iterations.

This type of algorithm is called a fully polynomial-time approximation scheme.
The PRA for Concave Minimization

The case when f(x) is a concave function is even easier.

Let x+ = x + d_x. Then we have a linear over-estimate of the potential function:

  φ(x+) − φ(x) ≤ ((ρ/f(x)) ∇f(x)^T − e^T X^{−1}) d_x + β^2/(2(1−β)),

as long as ‖X^{−1} d_x‖ ≤ β.

Let the affine scaling be d' = X^{−1} d_x. Then one can solve:

  z(d') := minimize ((ρ/f(x)) ∇f(x)^T X − e^T) d'
           subject to A X d' = 0, ‖d'‖^2 ≤ β^2.
Affine Scaling Direction

The optimal direction of the affine scaling sub-problem is given by

  d' = (β/‖p(x)‖) · p(x),

where

  p(x) = −(I − XA^T(AX^2A^T)^{−1}AX)((ρ/f(x)) X∇f(x) − e)
       = e − (ρ/f(x)) X(∇f(x) − A^T y).

And the minimal value of the sub-problem is

  z(d') = −β · ‖p(x)‖.
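The closed-form direction above can be computed with one linear solve; a minimal sketch (helper name ours; solving for the multiplier instead of forming the projection matrix explicitly):

```python
import numpy as np

def affine_scaling_direction(A, x, grad, f_val, rho, beta):
    """Compute d' = beta * p(x)/||p(x)|| for the affine-scaled subproblem,
    where p(x) = -(I - XA^T (A X^2 A^T)^{-1} A X)((rho/f(x)) X grad - e),
    and the subproblem value z(d') = -beta * ||p(x)||."""
    X = np.diag(x)
    AX = A @ X
    g = (rho / f_val) * (X @ grad) - np.ones_like(x)
    lam = np.linalg.solve(AX @ AX.T, AX @ g)   # multiplier for the projection
    p = -(g - AX.T @ lam)                      # negated projection onto null(AX)
    norm_p = np.linalg.norm(p)
    return beta * p / norm_p, -beta * norm_p
```

By construction AX d' = 0 and ‖d'‖ = β, so d_x = X d' is feasible for the trust-region subproblem.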
Complexity Analysis I

If ‖p(x)‖ ≥ 1, then the minimal objective value of the affine scaling sub-problem is at most −β, so that

  φ(x+) − φ(x) ≤ −β + β^2/(2(1−β)).

Thus, the potential value is reduced by a constant for a suitable choice of β.

If this case held for O(ρ log(f(x^0)/ε)) iterations, we would have produced an ε-global minimizer of LCOP.
Complexity Analysis II

On the other hand, if ‖p(x)‖ < 1, then from

  p(x) = e − (ρ/f(x)) X(∇f(x) − A^T y),

we must have

  (ρ/f(x)) X(∇f(x) − A^T y) ≥ 0

and

  (ρ/f(x)) X(∇f(x) − A^T y) ≤ 2e.
Complexity Analysis III

In other words,

  ∇f(x) − A^T y ≥ 0

and

  x_j (∇f(x) − A^T y)_j < 2 f(x)/ρ, ∀ j.

The first condition indicates that the Lagrange multiplier y is (dual) feasible, and the second inequality implies that the complementarity condition is approximately satisfied when ρ is chosen sufficiently large.
Complexity Analyses IV

In particular, if we choose ρ ≥ 2 f(x^0)/ε, then

  ‖X(∇f(x) − A^T y)‖_∞ ≤ ε,

which implies that x is a first-order ε-KKT solution.

Theorem. The algorithm will provably return a first-order ε-KKT solution of LCOP in no more than O((f(x^0)/ε) log(f(x^0)/ε)) iterations for any given ε < 1, if the objective function is concave.
More Applications and Questions

• Could the time bound be further improved for QP or concave minimization?
• Would the O(ε−1 log ε−1) bound be applicable to more general non-convex optimization problems?
• We are not even able to prove the O(ε−1 log ε−1) bound when f(x) = q(x) + ‖x‖_p^p.
• More structural properties of the final KKT solution.
• Applications to general sparse-solution optimization, such as cardinality-constrained portfolio selection, sparse pricing reduction for revenue management, etc.