Stochastic optimization for power-aware distributed scheduling

Stochastic optimization for power-aware distributed scheduling

Michael J. NeelyUniversity of Southern California

http://www-bcf.usc.edu/~mjneely

tω(t)

Outline

• Lyapunov optimization method• Power-aware wireless transmission– Basic problem– Cache-aware peering– Quality-aware video streaming

• Distributed sensor reporting and correlated scheduling

A single wireless device

R(t) = r(P(t), ω(t))

Timeslots t = {0, 1, 2, …} ω(t) = Random channel state on slot tP(t) = Power used on slot tR(t) = Transmission rate on slot t (function of P(t), ω(t))

observedchosen

Example

R(t) = log(1 + P(t)ω(t))

observedchosen

t

ω(t)

tR(t)

Optimization problem

Maximize: R

Subject to: P ≤ c

Given: • Pr[ω(t)=ω] = π(ω) , ω in {ω1, ω2, …, ω1000}

• p(t) in P = {p1, p2, …, p5}

• c = desired power constraint

Consider randomized decisions

Pr[pk | ωi ] = Pr[P(t) = pk | ω(t)=ωi]

• ω(t) in {ω1, ω2, …, ω1000}

• P(t) in P = {p1, p2, …, p5}

∑Pr[pk | ωi ] = 1

( for all ωi in {ω1, ω2, …, ω1000} )

k=1

5

Linear programming approach

Max: R

S.t. : P ≤ c

Given parameters: π(ωi) (1000 probabilities) r(pk , ωi) (5*1000 coefficients)

Optimization variables: Pr[pk|ωi] (5*1000 variables)

∑ ∑ π(ωi) Pr[pk|ωi] r(pk,ωi) 1000

i=1 k=1

5

∑ ∑ π(ωi) Pr[pk|ωi] pk ≤ c 1000 5

i=1 k=1

Max:

S.t.:

Multi-dimensional problem

1AccessPoint

2

N

• Observe (ω1(t), …, ωN(t))• Decisions:

-- Choose which user to serve-- Choose which power to use

R1(t)

R2(t)RN(t)

Goal and LP approach

Maximize: R1 + R2 + … + RN

Subject to: Pn ≤ c for all n in {1, …, N}

LP has given parameters: π(ω1, …, ωN) (1000N probabilities) rn(pk , ωi) (N*5N*1000N coefficients)

LP has optimization variables: Pr[pk|ωi] (5N*1000N variables)

Advantages of LP approach• Solves the problem of interest

• LPs have been around for a long time

• Many people are comfortable with LPs

Disadvantages of LP approach

Disadvantages of LP approach• Need to estimate an exponential

number of probabilities.

• LP has exponential number of variables.

• What if probabilities change?

• Fairness?

• Delay?

• Channel errors?

Lyapunov optimization approach



Lyapunov optimization approach



Virtual queue for each constraint:

Stabilizing virtual queue constraint satisfied!

Qn(t+1) = max[Qn(t) + Pn(t) – c, 0]

Qn(t)Pn(t) c

Lyapunov drift

L(t) = ½ ∑ Qn(t)2

Δ(t) = L(t+1) – L(t)

n

Q1

Q2

Drift-plus-penalty algorithmEvery slot t: • Observe (Q1(t), …., QN(t)), (ω1(t), …, ωN(t))

• Choose (P1(t), …, PN(t)) to greedily minimize:

• Update queues.

Δ(t) - (1/ε)(R1(t) + … + RN(t)) drift penalty

Low complexityNo knowledge of π(ω) probabilities is required

Specific DPP implementation• Each user n observes ωn(t), Qn(t). • Each user n chooses Pn(t) in P to minimize:

-(1/ε)rn(Pn(t), ωn(t)) + Qn(t)Pn(t)

• Choose user n* with smallest such value.• User n* transmits with power level Pn*(t).

Low complexityNo knowledge of π(ω) probabilities is required

Performance TheoremAssume it is possible to satisfy the constraints. Then under DPP with any ε>0: • All power constraints are satisfied.• Average thruput satisfies:

• Average queue size satisfies:

∑ Qn ≤ O(1/ε)

R1 + … + RN ≥ throughputopt – O(ε)

General SNO problem

Minimize: y0(α(t), ω(t))

Subject to: yn(α(t), ω(t)) ≤ 0 for all n in {1, …, N}

α(t) in Aω(t) for all t in {0, 1, 2, …} Such problems are solved by the DPP algorithm. Performance theorem: O(ε), O(1/ε) tradeoff.

ω(t) = Observed random event on slot t π(ω) = Pr[ω(t)=ω] (possibly unknown)α(t) = Control action on slot tAω(t) = Abstract set of action options

What we have done so far



Mobile P2P video downloads


AccessPoint


AccessPoint

AccessPoint


AccessPoint

AccessPoint

AccessPoint

Cache-aware scheduling• Access points (including “femto” nodes)• Typically stationary• Typically have many files cached

• Users• Typically mobile• Typically have fewer files cached• Assume each user wants one “long” file• Can opportunistically grab packets from

any nearby user or access point that has the file.

Quality-aware video delivery

Video chunks as time progresses

Quality Layer 1

Quality Layer 2

Bits: 8176D: 11.045

Bits: 7370D: 10.777

Quality Layer L

Bits: 40968D: 0

Bits: 58152D: 7.363

Bits: 120776D: 7.108

Bits: 97864D: 6.971

Bits: 41304D: 6.716

Bits: 277256D: 0

Bits: 419640D: 0

Bits: 72800D: 6.261

Bits: 59984D: 6.129

Bits: 299216D: 0

• D = Distortion.• Results hold for any matrices Bits(layer, chunk), D(layer, chunk).• Bits are queued for wireless transmission.

Fair video quality deliveryMinimize: f( D1 ) + f( D2 ) + … + f( DN )


Video playback rate constraints



Video playback rate constraintsRecall the general form:

Min: y0

S.t. : yn ≤ 0 for all n α(t) in Aω(t) for all t

Fair video quality delivery

Min: y0


Minimize: f( D1 ) + f( D2 ) + … + f( DN )



Define Yn(t) = Pn(t) - c




Define auxiliary variable γ(t) in [0, Dmax]

Min: y0


Equivalence via Jensen’s inequalityMinimize: f( D1 ) + f( D2 ) + … + f( DN )


Video playback rate constraints

Minimize: f( γ1(t)) + f( γ2(t)) + … + f( γN(t))


γn = Dn for all n in {1, …, N} Video playback rate constraints

Example simulation

BS

• Region divided into 20 x 20 subcells (only a portion shown here).• 1250 mobile devices, 1 base station • 3.125 mobiles/subcell

• Phases 1, 2, 3: File availability prob = 5%, 10%, 7%• Basestation Average Traffic: 2.0 packets/slot• Peer-to-Peer Average Traffic: 153.7 packets/slot• Factor of 77.8 gain compared to BS alone!

What we have done so far



Distributed sensor reports

• ωi(t) = 0/1 if sensor i observes the event on slot t• Pi(t) = 0/1 if sensor i reports on slot t• Utility: U(t) = min[P1(t)ω1(t) + (1/2)P2(t)ω2(t),1]

1

2Fusion Center

Maximize: U

Subject to: P1 ≤ c

P2 ≤ c

ω1(t)

ω2(t)

What is optimal?

Agreeon plan 0 1 2 3

t4

What is optimal?

Agreeon plan 0 1 2 3

t4

Example plan: User 1: • t=even Do not report.• t=odd Report if ω1(t)=1.User 2: • t=even Report if ω2(t)=1 • t=odd: Report with prob ½ if ω2(t)=1

Common source of randomness

Example: 1 slot = 1 dayEach user looks at Boston Globe every day:• If first letter is a “T” Plan 1• If first letter is an “S” Plan 2• Etc.

Day 1 Day 2

Specific exampleAssume:• Pr[ω1(t)=1] = ¾, Pr[ω2(t)=1] = ½• ω1(t), ω2(t) independent• Power constraint c = 1/3

Approach 1: Independent reporting• If ω1(t)=1, user 1 reports with probability θ1

• If ω2(t)=1, user 2 reports with probability θ2

Optimizing θ1, θ2 gives u = 4/9 ≈ 0.44444

Approach 2: Correlated reportingPure strategy 1: • User 1 reports if and only if ω1(t)=1.• User 2 does not report.

Pure strategy 2: • User 1 does not report.• User 2 reports if and only if ω2(t)=1.

Pure strategy 3:• User 1 reports if and only if ω1(t)=1.• User 2 reports if and only if ω2(t)=1.

Approach 2: Correlated reportingX(t) = iid random variable (commonly known): • Pr[X(t)=1] = θ1

• Pr[X(t)=2] = θ2

• Pr[X(t)=3] = θ3

On slot t: • Users observe X(t)• If X(t)=k, users use pure strategy k.

Optimizing θ1, θ2, θ3 gives u = 23/48 ≈ 0.47917

Summary of approaches

Independent reporting

Correlated reporting

Centralized reporting

0.47917

0.44444

0.5

Strategy u

Summary of approaches

Independent reporting

Correlated reporting

Centralized reporting

0.47917

0.44444

0.5

Strategy u

It can be shown that this is optimal over all distributed strategies!

General distributed optimization

Maximize: U

Subject to: Pk ≤ 0 for k in {1, …, K}

ω(t) = (ω1(t), …, ωΝ(t))π(ω) = Pr[ω(t) = (ω1, …, ωΝ)]α(t) = (α1(t), …, αΝ(t))U(t) = u(α(t), ω(t))Pk(t) = pk(α(t), ω(t))

Pure strategies

A pure strategy is a deterministic vector-valued function:

g(ω) = (g1(ω1), g2(ω2), …, gΝ (ωΝ))

Let M = # pure strategies:

M = |A1||Ω1| x |A2||Ω2| x ... x |AN||ΩN|

Optimality Theorem

There exist:• K+1 pure strategies g(m)(ω)• Probabilities θ1, θ2, …, θK+1

such that the following distributed algorithm is optimal:

X(t) = iid, Pr[X(t)=m] = θm

• Each user observes X(t)• If X(t)=m use strategy g(m)(ω).

LP and complexity reduction• The probabilities can be found by an LP

• Unfortunately, the LP has M variables

• If (ω1(t), …, ωΝ(t)) are mutually independent and the utility function satisfies a preferred action property, complexity can be reduced

• Example N=2 users, |A1|=|A2|=2

--Old complexity = 2|Ω1|+|Ω2|

--New complexity = (|Ω1|+1)(|Ω2|+1)

Lyapunov optimization approach• Define K virtual queues Q1(t), …, QK(t).

• Every slot t, observe queues and choose strategy m in {1, …, M} to maximize a weighted sum of queues.

• Update queues with delayed feedback:

Qk(t+1) = max[Qk(t) + Pk(t-D), 0]

Separable problemsIf the utility and penalty functions are a separable sum of functions of individual variables (αn(t), ωn(t)), then:

• There is no optimality gap between centralized and distributed algorithms

• Problem complexity reduces from exponential to linear.

Simulation (non-separable problem)

• 3-user problem• αn(t) in {0, 1} for n ={1, 2, 3}. • ωn(t) in {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}• V=1/ε• Get O(ε) guarantee to optimality• Convergence time depends on 1/ε

Utility versus V parameter (V=1/ε)U

tility

V (recall V = 1/ε)

Average power versus timeAv

erag

e po

wer

up

to ti

me

t

Time t

power constraint 1/3V=10

V=50

V=100

Adaptation to non-ergodic changes

Conclusions

• Drift-plus-penalty is a strong technique for general stochastic network optimization

• Power-aware scheduling

• Cache-aware scheduling

• Quality-aware video streaming

• Correlated scheduling for distributed stochastic optimization

Stochastic optimization for power-aware distributed scheduling

Documents

Transcript of Stochastic optimization for power-aware distributed scheduling