Stochastic optimization for power-aware distributed scheduling


Page 1: Stochastic optimization for power-aware  distributed scheduling

Stochastic optimization for power-aware distributed scheduling

Michael J. Neely
University of Southern California

http://www-bcf.usc.edu/~mjneely

[Figure: ω(t) versus t]

Page 2: Stochastic optimization for power-aware  distributed scheduling

Outline

• Lyapunov optimization method
• Power-aware wireless transmission
  – Basic problem
  – Cache-aware peering
  – Quality-aware video streaming
• Distributed sensor reporting and correlated scheduling

Page 3: Stochastic optimization for power-aware  distributed scheduling

A single wireless device

R(t) = r(P(t), ω(t))

Timeslots t in {0, 1, 2, …}
ω(t) = Random channel state on slot t (observed)
P(t) = Power used on slot t (chosen)
R(t) = Transmission rate on slot t (function of P(t), ω(t))

Page 4: Stochastic optimization for power-aware  distributed scheduling

Example

R(t) = log(1 + P(t)ω(t))

[Figure: ω(t) and R(t) versus t; ω(t) is observed, P(t) is chosen]
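A minimal sketch of the example rate function R(t) = log(1 + P(t)ω(t)). The natural logarithm, the channel values, and the power level are illustrative assumptions, not values from the talk.

```python
import math

def rate(power: float, channel: float) -> float:
    """Rate R = log(1 + P * omega); natural log is assumed here."""
    return math.log(1.0 + power * channel)

# Hypothetical channel states omega(t) and one fixed power level P(t).
channels = [0.2, 1.0, 3.0]
power = 2.0
rates = [rate(power, w) for w in channels]
```

With a fixed power, a better channel state gives a higher rate, and zero power always gives zero rate.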

Page 7: Stochastic optimization for power-aware  distributed scheduling

Optimization problem

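Maximize: $\overline{R}$

Subject to: $\overline{P} \le c$

($\overline{R}$, $\overline{P}$ = time averages of R(t), P(t))

Given:
• $\Pr[\omega(t)=\omega] = \pi(\omega)$, $\omega \in \{\omega_1, \omega_2, \dots, \omega_{1000}\}$
• $P(t) \in \mathcal{P} = \{p_1, p_2, \dots, p_5\}$
• c = desired power constraint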

Page 8: Stochastic optimization for power-aware  distributed scheduling

Consider randomized decisions

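$\Pr[p_k \mid \omega_i] = \Pr[P(t) = p_k \mid \omega(t) = \omega_i]$

• $\omega(t) \in \{\omega_1, \omega_2, \dots, \omega_{1000}\}$
• $P(t) \in \mathcal{P} = \{p_1, p_2, \dots, p_5\}$

$\sum_{k=1}^{5} \Pr[p_k \mid \omega_i] = 1$ for all $\omega_i \in \{\omega_1, \omega_2, \dots, \omega_{1000}\}$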

Page 9: Stochastic optimization for power-aware  distributed scheduling

Linear programming approach

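Max: $\overline{R}$

S.t.: $\overline{P} \le c$

Given parameters:
• $\pi(\omega_i)$ (1000 probabilities)
• $r(p_k, \omega_i)$ (5*1000 coefficients)

Optimization variables: $\Pr[p_k \mid \omega_i]$ (5*1000 variables)

Max: $\sum_{i=1}^{1000} \sum_{k=1}^{5} \pi(\omega_i) \Pr[p_k \mid \omega_i]\, r(p_k, \omega_i)$

S.t.: $\sum_{i=1}^{1000} \sum_{k=1}^{5} \pi(\omega_i) \Pr[p_k \mid \omega_i]\, p_k \le c$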
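A minimal sketch of how the LP objective and power constraint are evaluated for one candidate randomized policy. The toy sizes (3 channel states, 2 power levels), the probabilities, and the rate function log(1 + pω) are illustrative assumptions; the slide's instance has 1000 states and 5 power levels.

```python
import math

def average_rate_and_power(pi, powers, policy, rate):
    """Return (average rate, average power) of a randomized policy.

    pi[i]        = Pr[omega(t) = omega_i]
    policy[i][k] = Pr[P(t) = p_k | omega(t) = omega_i]
    rate[i][k]   = r(p_k, omega_i)
    """
    pairs = [(i, k) for i in range(len(pi)) for k in range(len(powers))]
    avg_rate = sum(pi[i] * policy[i][k] * rate[i][k] for i, k in pairs)
    avg_power = sum(pi[i] * policy[i][k] * powers[k] for i, k in pairs)
    return avg_rate, avg_power

# Hypothetical toy instance.
omegas = [0.5, 1.0, 2.0]
pi = [0.25, 0.5, 0.25]
powers = [0.0, 1.0]
rate = [[math.log(1.0 + p * w) for p in powers] for w in omegas]

# Each row sums to 1: stay silent on the worst channel, flip a fair coin
# on the middle one, always transmit on the best one.
policy = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

avg_rate, avg_power = average_rate_and_power(pi, powers, policy, rate)
feasible = avg_power <= 0.5   # check against a power constraint c = 0.5
```

The LP searches over all such `policy` tables; this function only scores one of them.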

Page 10: Stochastic optimization for power-aware  distributed scheduling

Multi-dimensional problem

[Figure: users 1, 2, …, N and one access point, with rates R1(t), R2(t), …, RN(t)]

• Observe (ω1(t), …, ωN(t))
• Decisions:
  – Choose which user to serve
  – Choose which power to use

Page 11: Stochastic optimization for power-aware  distributed scheduling

Goal and LP approach

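Maximize: $\overline{R}_1 + \overline{R}_2 + \dots + \overline{R}_N$

Subject to: $\overline{P}_n \le c$ for all n in {1, …, N}

LP has given parameters:
• $\pi(\omega_1, \dots, \omega_N)$ ($1000^N$ probabilities)
• $r_n(p_k, \omega_i)$ ($N \cdot 5^N \cdot 1000^N$ coefficients)

LP has optimization variables: $\Pr[p_k \mid \omega_i]$ ($5^N \cdot 1000^N$ variables)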
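A small sketch of how fast the LP grows with the number of users. It reads the slide's counts as exponents in N (1000^N probabilities, N·5^N·1000^N coefficients, 5^N·1000^N variables), which is an assumption about the flattened superscripts; the function name is hypothetical.

```python
def lp_size(num_users: int, num_states: int = 1000, num_powers: int = 5):
    """Return (probabilities, coefficients, variables) for the N-user LP."""
    probabilities = num_states ** num_users
    variables = (num_powers ** num_users) * probabilities
    coefficients = num_users * variables
    return probabilities, coefficients, variables

sizes = {n: lp_size(n) for n in (1, 2, 3)}
```

For N = 1 this recovers the single-user counts (1000 probabilities, 5*1000 variables); for N = 3 the LP already has more than 10^11 variables.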

Page 12: Stochastic optimization for power-aware  distributed scheduling

Advantages of LP approach

• Solves the problem of interest

• LPs have been around for a long time

• Many people are comfortable with LPs

Page 14: Stochastic optimization for power-aware  distributed scheduling

Disadvantages of LP approach

• Need to estimate an exponential number of probabilities.

• LP has exponential number of variables.

• What if probabilities change?

• Fairness?

• Delay?

• Channel errors?

Page 16: Stochastic optimization for power-aware  distributed scheduling

Lyapunov optimization approach

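Maximize: $\overline{R}_1 + \overline{R}_2 + \dots + \overline{R}_N$

Subject to: $\overline{P}_n \le c$ for all n in {1, …, N}

Virtual queue for each constraint:

$Q_n(t+1) = \max[Q_n(t) + P_n(t) - c,\ 0]$

[Figure: queue $Q_n(t)$ with arrivals $P_n(t)$ and service $c$]

Stabilizing virtual queue ⇒ constraint satisfied!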
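A minimal sketch of why a stable virtual queue implies the power constraint. Since Q(t+1) ≥ Q(t) + P(t) − c on every slot, summing over T slots from Q(0) = 0 gives (average power) ≤ c + Q(T)/T. The power sequence below is random illustrative data, not from the talk.

```python
import random

def run_virtual_queue(powers, c):
    """Apply Q(t+1) = max(Q(t) + P(t) - c, 0) from Q(0) = 0; return Q(T)."""
    q = 0.0
    for p in powers:
        q = max(q + p - c, 0.0)
    return q

random.seed(0)
c = 0.5
T = 10_000
# Hypothetical power decisions drawn from three levels.
powers = [random.choice([0.0, 0.4, 1.0]) for _ in range(T)]

q_final = run_virtual_queue(powers, c)
avg_power = sum(powers) / T
bound = c + q_final / T   # sample-path bound on the average power
```

If Q(T)/T goes to zero (the queue is stable), the bound tightens to the constraint c.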

Page 17: Stochastic optimization for power-aware  distributed scheduling

Lyapunov drift

$L(t) = \tfrac{1}{2} \sum_n Q_n(t)^2$

$\Delta(t) = L(t+1) - L(t)$

[Figure: queue state in the $(Q_1, Q_2)$ plane]
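A worked step, assuming the virtual queue update Q_n(t+1) = max[Q_n(t) + P_n(t) − c, 0] and using max[x, 0]^2 ≤ x^2:

```latex
\begin{align*}
Q_n(t+1)^2 &\le \bigl(Q_n(t) + P_n(t) - c\bigr)^2 \\
           &= Q_n(t)^2 + \bigl(P_n(t) - c\bigr)^2
              + 2\,Q_n(t)\bigl(P_n(t) - c\bigr), \\
\Delta(t)  &\le \tfrac{1}{2}\sum_n \bigl(P_n(t) - c\bigr)^2
              + \sum_n Q_n(t)\bigl(P_n(t) - c\bigr).
\end{align*}
```

The first sum is bounded whenever the power levels are bounded. The second sum is the part a per-slot decision can shrink, which is where the queue-weighted power term Q_n(t)P_n(t) comes from.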

Page 18: Stochastic optimization for power-aware  distributed scheduling

Drift-plus-penalty algorithm

Every slot t:
• Observe (Q1(t), …, QN(t)), (ω1(t), …, ωN(t))
• Choose (P1(t), …, PN(t)) to greedily minimize:

  Δ(t) − (1/ε)(R1(t) + … + RN(t))   (first term: drift; second term: penalty)

• Update queues.

• Low complexity
• No knowledge of π(ω) probabilities is required

Page 19: Stochastic optimization for power-aware  distributed scheduling

Specific DPP implementation

• Each user n observes ωn(t), Qn(t).
• Each user n chooses Pn(t) in P to minimize:

  −(1/ε) rn(Pn(t), ωn(t)) + Qn(t)Pn(t)

• Choose user n* with smallest such value.
• User n* transmits with power level Pn*(t).

• Low complexity
• No knowledge of π(ω) probabilities is required
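A minimal simulation sketch of the per-slot rule above. The rate function log(1 + pω), the power set, the channel states, and all parameter values are illustrative assumptions; only the minimized expression and the queue update follow the slides.

```python
import math
import random

POWER_LEVELS = [0.0, 0.5, 1.0]     # hypothetical power set
CHANNEL_STATES = [0.1, 1.0, 4.0]   # hypothetical channel states

def dpp_choice(queues, omegas, eps):
    """Return (user, power) minimizing -(1/eps) * r(p, omega_n) + Q_n * p."""
    best_value, best_user, best_power = None, None, None
    for n, (q, w) in enumerate(zip(queues, omegas)):
        for p in POWER_LEVELS:
            value = -(1.0 / eps) * math.log(1.0 + p * w) + q * p
            if best_value is None or value < best_value:
                best_value, best_user, best_power = value, n, p
    return best_user, best_power

def simulate(num_users, c, eps, slots, seed):
    rng = random.Random(seed)
    queues = [0.0] * num_users
    total_power = [0.0] * num_users
    total_rate = 0.0
    for _ in range(slots):
        omegas = [rng.choice(CHANNEL_STATES) for _ in range(num_users)]
        served, power = dpp_choice(queues, omegas, eps)
        total_rate += math.log(1.0 + power * omegas[served])
        for n in range(num_users):
            used = power if n == served else 0.0
            total_power[n] += used
            # Virtual queue update Q_n(t+1) = max(Q_n(t) + P_n(t) - c, 0).
            queues[n] = max(queues[n] + used - c, 0.0)
    avg_power = [tp / slots for tp in total_power]
    return avg_power, total_rate / slots, queues

avg_power, avg_rate, final_queues = simulate(
    num_users=3, c=0.2, eps=0.1, slots=20_000, seed=1)
```

The rule never uses the channel probabilities: it only reads the current channel states and queue values.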

Page 20: Stochastic optimization for power-aware  distributed scheduling

Performance Theorem

Assume it is possible to satisfy the constraints. Then under DPP with any ε > 0:
• All power constraints are satisfied.
• Average throughput satisfies:

  $\overline{R}_1 + \dots + \overline{R}_N \ge \text{throughput}^{opt} - O(\varepsilon)$

• Average queue size satisfies:

  $\sum_n \overline{Q}_n \le O(1/\varepsilon)$

Page 21: Stochastic optimization for power-aware  distributed scheduling

General SNO (stochastic network optimization) problem

Minimize: $\overline{y_0(\alpha(t), \omega(t))}$

Subject to: $\overline{y_n(\alpha(t), \omega(t))} \le 0$ for all n in {1, …, N}

$\alpha(t) \in A_{\omega(t)}$ for all t in {0, 1, 2, …}

• ω(t) = Observed random event on slot t
• π(ω) = Pr[ω(t)=ω] (possibly unknown)
• α(t) = Control action on slot t
• $A_{\omega(t)}$ = Abstract set of action options

Such problems are solved by the DPP algorithm.
Performance theorem: O(ε), O(1/ε) tradeoff.

Page 22: Stochastic optimization for power-aware  distributed scheduling

What we have done so far

• Lyapunov optimization method
• Power-aware wireless transmission
  – Basic problem
  – Cache-aware peering
  – Quality-aware video streaming
• Distributed sensor reporting and correlated scheduling

Page 26: Stochastic optimization for power-aware  distributed scheduling

Mobile P2P video downloads

[Figure: mobile users and up to three access points]

Page 38: Stochastic optimization for power-aware  distributed scheduling

Cache-aware scheduling

• Access points (including “femto” nodes)
  – Typically stationary
  – Typically have many files cached

• Users
  – Typically mobile
  – Typically have fewer files cached
  – Assume each user wants one “long” file
  – Can opportunistically grab packets from any nearby user or access point that has the file.

Page 39: Stochastic optimization for power-aware  distributed scheduling

Quality-aware video delivery

[Figure: grid of video chunks (as time progresses) by quality layer (Quality Layer 1, Quality Layer 2, …, Quality Layer L); each cell shows the chunk size in bits and its distortion D]

Cell labels from the figure:
• Bits: 8176, D: 11.045
• Bits: 7370, D: 10.777
• Bits: 40968, D: 0
• Bits: 58152, D: 7.363
• Bits: 120776, D: 7.108
• Bits: 97864, D: 6.971
• Bits: 41304, D: 6.716
• Bits: 277256, D: 0
• Bits: 419640, D: 0
• Bits: 72800, D: 6.261
• Bits: 59984, D: 6.129
• Bits: 299216, D: 0

• D = Distortion.
• Results hold for any matrices Bits(layer, chunk), D(layer, chunk).
• Bits are queued for wireless transmission.

Page 40: Stochastic optimization for power-aware  distributed scheduling

Fair video quality delivery

Minimize: $f(\overline{D}_1) + f(\overline{D}_2) + \dots + f(\overline{D}_N)$

Subject to: $\overline{P}_n \le c$ for all n in {1, …, N}

Video playback rate constraints

Page 42: Stochastic optimization for power-aware  distributed scheduling

Fair video quality delivery

Minimize: $f(\overline{D}_1) + f(\overline{D}_2) + \dots + f(\overline{D}_N)$

Subject to: $\overline{P}_n \le c$ for all n in {1, …, N}

Video playback rate constraints

Recall the general form:

Min: $\overline{y}_0$

S.t.: $\overline{y}_n \le 0$ for all n
      $\alpha(t) \in A_{\omega(t)}$ for all t

Define $y_n(t) = P_n(t) - c$

Page 43: Stochastic optimization for power-aware  distributed scheduling

Fair video quality delivery

Minimize: $f(\overline{D}_1) + f(\overline{D}_2) + \dots + f(\overline{D}_N)$

Subject to: $\overline{P}_n \le c$ for all n in {1, …, N}

Video playback rate constraints

Recall the general form:

Min: $\overline{y}_0$

S.t.: $\overline{y}_n \le 0$ for all n
      $\alpha(t) \in A_{\omega(t)}$ for all t

Define auxiliary variables $\gamma_n(t) \in [0, D_{max}]$

Page 44: Stochastic optimization for power-aware  distributed scheduling

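Equivalence via Jensen’s inequality

Minimize: $f(\overline{D}_1) + f(\overline{D}_2) + \dots + f(\overline{D}_N)$

Subject to: $\overline{P}_n \le c$ for all n in {1, …, N}

Video playback rate constraints

is equivalent to:

Minimize: $\overline{f(\gamma_1(t))} + \overline{f(\gamma_2(t))} + \dots + \overline{f(\gamma_N(t))}$

Subject to: $\overline{P}_n \le c$ for all n in {1, …, N}

$\overline{\gamma}_n = \overline{D}_n$ for all n in {1, …, N}

Video playback rate constraints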
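A sketch of the Jensen step behind the equivalence, assuming f is convex (the slide does not state this) and reading the γ constraint as an equality of time averages:

```latex
\begin{align*}
f\bigl(\overline{D}_n\bigr)
  = f\bigl(\overline{\gamma}_n\bigr)
  \le \overline{f(\gamma_n(t))}
  \qquad \text{(Jensen's inequality, $f$ convex)},
\end{align*}
\text{with equality when } \gamma_n(t) = \overline{D}_n \text{ for all } t .
```

So the transformed objective is never smaller than the original one, and a constant choice of the auxiliary variable achieves equality. The two problems then share the same optimal value, while the new objective is a time average of a per-slot function, which matches the general form.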

Page 45: Stochastic optimization for power-aware  distributed scheduling

Example simulation

[Figure: grid of subcells with the base station (BS)]

• Region divided into 20 x 20 subcells (only a portion shown here).
• 1250 mobile devices, 1 base station
• 3.125 mobiles/subcell

Page 46: Stochastic optimization for power-aware  distributed scheduling

• Phases 1, 2, 3: File availability prob = 5%, 10%, 7%
• Base station Average Traffic: 2.0 packets/slot
• Peer-to-Peer Average Traffic: 153.7 packets/slot
• Factor of 77.8 gain compared to BS alone!

Page 47: Stochastic optimization for power-aware  distributed scheduling

What we have done so far

• Lyapunov optimization method
• Power-aware wireless transmission
  – Basic problem
  – Cache-aware peering
  – Quality-aware video streaming
• Distributed sensor reporting and correlated scheduling

Page 48: Stochastic optimization for power-aware  distributed scheduling

Distributed sensor reports

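• ωi(t) = 1 if sensor i observes the event on slot t, else 0
• Pi(t) = 1 if sensor i reports on slot t, else 0
• Utility: U(t) = min[P1(t)ω1(t) + (1/2)P2(t)ω2(t), 1]

[Figure: sensors 1 and 2 observe ω1(t), ω2(t) and report to the fusion center]

Maximize: $\overline{U}$

Subject to: $\overline{P}_1 \le c$
            $\overline{P}_2 \le c$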

Page 50: Stochastic optimization for power-aware  distributed scheduling

What is optimal?

[Figure: timeline; users agree on a plan, then slots t = 0, 1, 2, 3, 4]

Example plan:

User 1:
• t = even: Do not report.
• t = odd: Report if ω1(t)=1.

User 2:
• t = even: Report if ω2(t)=1.
• t = odd: Report with prob ½ if ω2(t)=1.

Page 51: Stochastic optimization for power-aware  distributed scheduling

Common source of randomness

Example: 1 slot = 1 day

Each user looks at Boston Globe every day:
• If first letter is a “T” ⇒ Plan 1
• If first letter is an “S” ⇒ Plan 2
• Etc.

[Figure: Day 1, Day 2]

Page 52: Stochastic optimization for power-aware  distributed scheduling

Specific example

Assume:
• Pr[ω1(t)=1] = ¾, Pr[ω2(t)=1] = ½
• ω1(t), ω2(t) independent
• Power constraint c = 1/3

Approach 1: Independent reporting
• If ω1(t)=1, user 1 reports with probability θ1
• If ω2(t)=1, user 2 reports with probability θ2

Optimizing θ1, θ2 gives u = 4/9 ≈ 0.44444
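A minimal sketch that reproduces the 4/9 value by exact grid search, using the utility U(t) = min[P1(t)ω1(t) + (1/2)P2(t)ω2(t), 1] from the sensor-reporting slide. The grid resolution (multiples of 1/36) is an illustrative choice that happens to contain the optimizer.

```python
from fractions import Fraction as F
from itertools import product

P_EVENT = (F(3, 4), F(1, 2))   # Pr[omega_1 = 1], Pr[omega_2 = 1]
C = F(1, 3)                    # power constraint

def independent_utility(theta1, theta2):
    a = P_EVENT[0] * theta1    # Pr[user 1 reports] = average power of user 1
    b = P_EVENT[1] * theta2    # Pr[user 2 reports] = average power of user 2
    # U = 1 if user 1 reports, 1/2 if only user 2 reports, 0 otherwise.
    return a + (1 - a) * b * F(1, 2)

def best_independent(grid=36):
    best_u, best_thetas = F(-1), None
    for i, j in product(range(grid + 1), repeat=2):
        t1, t2 = F(i, grid), F(j, grid)
        if P_EVENT[0] * t1 <= C and P_EVENT[1] * t2 <= C:
            u = independent_utility(t1, t2)
            if u > best_u:
                best_u, best_thetas = u, (t1, t2)
    return best_u, best_thetas

best_u, best_thetas = best_independent()
```

Both power constraints are tight at the optimum: θ1 = 4/9 and θ2 = 2/3 make each user report on exactly 1/3 of the slots.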

Page 53: Stochastic optimization for power-aware  distributed scheduling

Approach 2: Correlated reporting

Pure strategy 1:
• User 1 reports if and only if ω1(t)=1.
• User 2 does not report.

Pure strategy 2:
• User 1 does not report.
• User 2 reports if and only if ω2(t)=1.

Pure strategy 3:
• User 1 reports if and only if ω1(t)=1.
• User 2 reports if and only if ω2(t)=1.

Page 54: Stochastic optimization for power-aware  distributed scheduling

Approach 2: Correlated reporting

X(t) = iid random variable (commonly known):
• Pr[X(t)=1] = θ1
• Pr[X(t)=2] = θ2
• Pr[X(t)=3] = θ3

On slot t:
• Users observe X(t)
• If X(t)=k, users use pure strategy k.

Optimizing θ1, θ2, θ3 gives u = 23/48 ≈ 0.47917
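A minimal sketch that reproduces the 23/48 value. It scores the three pure strategies by enumerating the outcomes of (ω1, ω2), then searches mixing probabilities on a grid of multiples of 1/36 with θ1 + θ2 + θ3 = 1. The grid search stands in for the LP and is an illustrative choice, not the method used in the talk.

```python
from fractions import Fraction as F
from itertools import product

P1, P2 = F(3, 4), F(1, 2)   # Pr[omega_1 = 1], Pr[omega_2 = 1]
C = F(1, 3)                 # power constraint

def evaluate(report1, report2):
    """Expected (utility, power 1, power 2) when user i reports iff
    report_i is True and omega_i = 1."""
    u = pw1 = pw2 = F(0)
    for w1, w2 in product((0, 1), repeat=2):
        prob = (P1 if w1 else 1 - P1) * (P2 if w2 else 1 - P2)
        a1 = 1 if (report1 and w1) else 0
        a2 = 1 if (report2 and w2) else 0
        u += prob * min(F(a1 * w1) + F(a2 * w2, 2), F(1))
        pw1 += prob * a1
        pw2 += prob * a2
    return u, pw1, pw2

# Pure strategies 1, 2, 3 from the slides.
STRATEGIES = [evaluate(True, False), evaluate(False, True), evaluate(True, True)]

def mix(thetas):
    """Expected (utility, power 1, power 2) when strategy k is used w.p. theta_k."""
    return tuple(sum(t * s[j] for t, s in zip(thetas, STRATEGIES)) for j in range(3))

def best_mix(grid=36):
    best_u, best_thetas = F(-1), None
    for i, j in product(range(grid + 1), repeat=2):
        k = grid - i - j
        if k < 0:
            continue
        thetas = (F(i, grid), F(j, grid), F(k, grid))
        u, pw1, pw2 = mix(thetas)
        if pw1 <= C and pw2 <= C and u > best_u:
            best_u, best_thetas = u, thetas
    return best_u, best_thetas

best_u, best_thetas = best_mix()
```

The optimizer on this grid is (θ1, θ2, θ3) = (1/3, 5/9, 1/9): the shared variable X(t) lets the users mostly avoid reporting on the same slot.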

Page 56: Stochastic optimization for power-aware  distributed scheduling

Summary of approaches

Strategy                  u
Independent reporting     0.44444
Correlated reporting      0.47917
Centralized reporting     0.5

It can be shown that the correlated reporting value (0.47917) is optimal over all distributed strategies!

Page 57: Stochastic optimization for power-aware  distributed scheduling

General distributed optimization

Maximize: $\overline{U}$

Subject to: $\overline{P}_k \le 0$ for k in {1, …, K}

• ω(t) = (ω1(t), …, ωN(t))
• π(ω) = Pr[ω(t) = (ω1, …, ωN)]
• α(t) = (α1(t), …, αN(t))
• U(t) = u(α(t), ω(t))
• Pk(t) = pk(α(t), ω(t))

Page 58: Stochastic optimization for power-aware  distributed scheduling

Pure strategies

A pure strategy is a deterministic vector-valued function:

$g(\omega) = (g_1(\omega_1), g_2(\omega_2), \dots, g_N(\omega_N))$

Let M = # pure strategies:

$M = |A_1|^{|\Omega_1|} \times |A_2|^{|\Omega_2|} \times \dots \times |A_N|^{|\Omega_N|}$
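A minimal sketch that enumerates pure strategies and checks the count against the product of |A_n| raised to |Ω_n|. Representing each g_n as a dict from observed event to action is an illustrative choice.

```python
from itertools import product

def pure_strategies(action_sets, event_sets):
    """Return every pure strategy as a tuple (g_1, ..., g_N),
    where g_n is a dict mapping user n's observed event to an action."""
    per_user = []
    for actions, events in zip(action_sets, event_sets):
        per_user.append([dict(zip(events, choice))
                         for choice in product(actions, repeat=len(events))])
    return list(product(*per_user))

# Two sensors, each observing an event in {0, 1} and choosing an action in {0, 1}.
sensor_strategies = pure_strategies([(0, 1), (0, 1)], [(0, 1), (0, 1)])
M = len(sensor_strategies)   # 2^2 * 2^2
```

Each user's function depends only on that user's own observation, which is what makes the strategy implementable without communication.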

Page 59: Stochastic optimization for power-aware  distributed scheduling

Optimality Theorem

There exist:
• K+1 pure strategies $g^{(m)}(\omega)$
• Probabilities θ1, θ2, …, θK+1

such that the following distributed algorithm is optimal:

• X(t) = iid, Pr[X(t)=m] = θm
• Each user observes X(t)
• If X(t)=m, use strategy $g^{(m)}(\omega)$.

Page 60: Stochastic optimization for power-aware  distributed scheduling

LP and complexity reduction

• The probabilities can be found by an LP
• Unfortunately, the LP has M variables
• If (ω1(t), …, ωN(t)) are mutually independent and the utility function satisfies a preferred action property, complexity can be reduced
• Example: N=2 users, |A1|=|A2|=2
  – Old complexity = $2^{|\Omega_1|+|\Omega_2|}$
  – New complexity = $(|\Omega_1|+1)(|\Omega_2|+1)$
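A small sketch comparing the two counts in the N=2 example, reading the old complexity as 2 raised to |Ω1|+|Ω2| (the number of pure strategies M when |A1|=|A2|=2). The event-set sizes below are illustrative.

```python
def old_complexity(size1: int, size2: int) -> int:
    """Number of pure strategies for two users with binary actions."""
    return 2 ** (size1 + size2)

def new_complexity(size1: int, size2: int) -> int:
    """Reduced count (|Omega_1| + 1)(|Omega_2| + 1)."""
    return (size1 + 1) * (size2 + 1)

# Hypothetical sizes: each user observes one of 10 possible events.
old, new = old_complexity(10, 10), new_complexity(10, 10)
```

With 10 events per user the count drops from 1048576 to 121.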

Page 61: Stochastic optimization for power-aware  distributed scheduling

Lyapunov optimization approach

• Define K virtual queues Q1(t), …, QK(t).
• Every slot t, observe queues and choose strategy m in {1, …, M} to maximize a weighted sum of queues.
• Update queues with delayed feedback:

  $Q_k(t+1) = \max[Q_k(t) + P_k(t-D),\ 0]$
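A minimal sketch of the delayed-feedback update for one virtual queue. Treating the penalty as 0 for slots before t = 0 is an assumption made here to start the recursion; the penalty values are illustrative.

```python
from collections import deque

def run_delayed_queue(penalties, delay):
    """Apply Q(t+1) = max(Q(t) + P(t - D), 0) from Q(0) = 0.

    Assumes P(t) = 0 for t < 0. Returns [Q(1), Q(2), ..., Q(T)].
    """
    pending = deque([0.0] * delay)   # penalties not yet fed back
    q = 0.0
    history = []
    for p in penalties:
        pending.append(p)                     # P(t) becomes known D slots later
        q = max(q + pending.popleft(), 0.0)   # apply P(t - D) now
        history.append(q)
    return history

# Hypothetical penalty values (negative means slack in the constraint).
trace = run_delayed_queue([1.0, -0.5, 2.0, -3.0], delay=2)
```

With `delay=0` this reduces to the usual virtual queue update.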

Page 62: Stochastic optimization for power-aware  distributed scheduling

Separable problems

If the utility and penalty functions are a separable sum of functions of individual variables (αn(t), ωn(t)), then:

• There is no optimality gap between centralized and distributed algorithms

• Problem complexity reduces from exponential to linear.

Page 63: Stochastic optimization for power-aware  distributed scheduling

Simulation (non-separable problem)

• 3-user problem
• αn(t) in {0, 1} for n in {1, 2, 3}.
• ωn(t) in {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
• V=1/ε
• Get O(ε) guarantee to optimality
• Convergence time depends on 1/ε

Page 64: Stochastic optimization for power-aware  distributed scheduling

Utility versus V parameter (V=1/ε)

[Figure: utility versus V (recall V = 1/ε)]

Page 65: Stochastic optimization for power-aware  distributed scheduling

Average power versus time

[Figure: average power up to time t versus time t, for V=10, V=50, and V=100; the power constraint 1/3 is marked]

Page 66: Stochastic optimization for power-aware  distributed scheduling

Adaptation to non-ergodic changes

Page 67: Stochastic optimization for power-aware  distributed scheduling

Conclusions

• Drift-plus-penalty is a strong technique for general stochastic network optimization

• Power-aware scheduling

• Cache-aware scheduling

• Quality-aware video streaming

• Correlated scheduling for distributed stochastic optimization
