Congestion control

Lecture 6, CS 653 (UMass Amherst)

Why congestion control

Causes/costs of congestion: scenario 1

  two senders, two receivers

  one router, infinite buffers

  no retransmission

  large delays when congested

  throughput saturates

[Figure: Hosts A and B; λin: original data; λout; unlimited shared output link buffers]

Causes/costs of congestion: scenario 2

  one router, finite buffers

  sender retransmission of lost packet

[Figure: Hosts A and B; λin: original data; λ'in: original data plus retransmitted data; λout; finite shared output link buffers]

Causes/costs of congestion: scenario 2

  always: λin = λout (goodput)

  "perfect" retransmission only when loss: λ'in > λout

  retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout

"costs" of congestion:

  more work (retransmission) for given "goodput"

  unneeded retransmissions: link carries multiple copies of pkt

[Figure: three panels (a), (b), (c) plotting λout against λin, both axes running to R/2; levels R/3 and R/4 are marked on the λout axis]

Causes/costs of congestion: scenario 3

  four senders

  multihop paths

  timeout/retransmit

Q: what happens as λin and λ'in increase?

[Figure: Hosts A and B; λin: original data; λ'in: original data plus retransmitted data; λout; finite shared output link buffers]

Causes/costs of congestion: scenario 3

Another "cost" of congestion:

  when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

[Figure: Hosts A and B; λout]

Two broad approaches towards congestion control

End-end congestion control

  no explicit feedback from network

  congestion inferred from end-system observed loss, delay

  approach taken by TCP

Network-assisted congestion control

  routers provide feedback to end hosts
    single bit indicating congestion (SNA, DECbit, ATM, TCP/IP ECN)
    explicit rate sender should send at

  recent proposals [XCP], [RCP] revisit ATM ideas

TCP congestion control

Components of TCP congestion control

  Slow start
    Multiplicatively increase (double) window

  Congestion avoidance
    Additively increase (by 1 MSS) window

  Loss
    Multiplicatively decrease (halve) window

  Timeout
    Set cwnd to 1 MSS
    Multiplicatively increase (double) retransmission timeout upon each further consecutive loss
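The four components above can be sketched as a per-RTT window update. This is a simplified illustration rather than a full TCP implementation: the `ssthresh` variable, the event names, capping slow start at `ssthresh`, and the `max_rto` ceiling are assumptions introduced here, not taken from the slides.

```python
def next_cwnd(cwnd, ssthresh, event):
    """Return (cwnd, ssthresh) in MSS units after one RTT that ended in `event`.

    event: "ack" (whole window acknowledged), "loss" (e.g. triple duplicate
    ACK) or "timeout". Names and the ssthresh variable are illustrative.
    """
    if event == "ack":
        if cwnd < ssthresh:
            # slow start: double the window (capped at ssthresh for simplicity)
            return min(2 * cwnd, ssthresh), ssthresh
        # congestion avoidance: add 1 MSS per RTT
        return cwnd + 1, ssthresh
    if event == "loss":
        half = max(cwnd / 2, 1)
        return half, half              # multiplicative decrease: halve
    if event == "timeout":
        return 1, max(cwnd / 2, 1)     # restart from 1 MSS
    raise ValueError(f"unknown event: {event}")


def next_rto(rto, max_rto=60.0):
    """Double the retransmission timeout after a further consecutive loss."""
    return min(2 * rto, max_rto)
```

Feeding a sequence of events through `next_cwnd` reproduces the familiar sawtooth: exponential growth, linear growth, halving on loss, and a restart from 1 MSS on timeout.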

Retransmission timeout estimation

  Calculate EstimatedRTT using moving average

  Calculate deviation w.r.t. moving average

  Timeout = EstimatedRTT + 4·DevRTT

$\text{EstimatedRTT}_i = (1-\alpha)\,\text{EstimatedRTT}_{i-1} + \alpha\,\text{SampleRTT}_i$

$\text{DevRTT}_i = (1-\beta)\,\text{DevRTT}_{i-1} + \beta\,|\text{SampleRTT}_i - \text{EstimatedRTT}_{i-1}|$
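A minimal sketch of the estimator defined by the two moving averages above. The gains `alpha = 1/8` and `beta = 1/4` and the initial deviation of half the first sample are commonly used values and are assumptions here; the slides do not fix them. The class name is hypothetical.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives the retransmission timeout."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha                  # assumed gain for EstimatedRTT
        self.beta = beta                    # assumed gain for DevRTT
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2     # assumed initial deviation

    def update(self, sample_rtt):
        # The deviation uses the previous EstimatedRTT, as in the formula above.
        err = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * err
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout()

    def timeout(self):
        # Timeout = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt
```

A single delayed sample raises the timeout through both terms, and the deviation term decays again once samples settle.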

TCP Throughput

TCP throughput: A very, very simple model

  What's the average throughput of TCP as a function of window size and RTT T?
    Ignore slow start
    Let W be the window size when loss occurs

  When window is W, throughput is W/T

  Just after loss, window drops to W/2, throughput to W/(2T)

  Average throughput: 3W/(4T)

TCP throughput: A very simple model

  But what is W when loss occurs?

  When window is w and queue has q packets, TCP is sending at rate w/(T + q/C)

  For maintaining utilization and steady state
    Just before loss: rate = W/(T + Q/C) = C
    Just after loss: rate = W/(2T) = C

  For Q = CT (a common rule of thumb to set router buffer sizes), a loss occurs every (3W/4)·Q = 3W²/8 packets

Q = queue capacity in number of packets

C = link capacity in packets/sec

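Deriving TCP throughput/loss relationship

[Figure: TCP window size vs. time (rtt), a sawtooth between W/2 and W; one "period" is one ramp from W/2 to W]

packets sent per "period":

\begin{align*}
\frac{W}{2} + \left(\frac{W}{2}+1\right) + \dots + W
 &= \sum_{n=0}^{W/2}\left(\frac{W}{2}+n\right) \\
 &= \left(\frac{W}{2}+1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n \\
 &= \left(\frac{W}{2}+1\right)\frac{W}{2} + \frac{(W/2)(W/2+1)}{2} \\
 &= \frac{3}{8}W^2 + \frac{3}{4}W \;\approx\; \frac{3}{8}W^2
\end{align*}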

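Deriving TCP throughput/loss relationship

[Figure: TCP window size vs. time (rtt), a sawtooth between W/2 and W; one "period" is one ramp from W/2 to W]

packets sent per "period" $\approx \frac{3}{8} W^2$

1 packet lost per "period" implies $p_{loss} \approx \frac{8}{3 W^2}$, or $W = \sqrt{\frac{8}{3\,p_{loss}}}$

$B = \text{avg\_thruput} = \frac{3}{4} W$ packets/rtt

$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}}$ packets/rtt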
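The derivation can be checked numerically. The sketch below counts the packets in one period exactly and evaluates the resulting throughput formula; the function names are illustrative and W is assumed to be even.

```python
import math


def packets_per_period(w):
    """Exact packets sent while the window climbs from W/2 to W (W even)."""
    return sum(w // 2 + n for n in range(w // 2 + 1))


def window_at_loss(p_loss):
    """W = sqrt(8 / (3 p)), from one loss per (3/8) W^2 packets."""
    return math.sqrt(8 / (3 * p_loss))


def avg_throughput(p_loss):
    """Average throughput B in packets per RTT: (3/4) W, about 1.22 / sqrt(p)."""
    return 0.75 * window_at_loss(p_loss)
```

For W = 100 the exact count is 3825 packets, which matches (3/8)W² + (3/4)W = 3750 + 75; the (3/4)W term is what the approximation drops.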

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput: A better loss-rate-based "simple" model [PFTK]

  With many flows, loss rate and delay are not affected much by a single TCP flow
    TCP behavior completely specified by loss and delay pattern along path (bounded by bottleneck capacity)

  Given loss rate p and delay T, what is TCP's throughput B (packets/sec), taking timeouts into account?

What is PFTK modeling?

  Independent loss probability p across rounds
    Loss signalled by triple duplicate acks
    Bursty loss in a round: if some packet is lost, all following packets in that round are also lost

  Timeout if < three duplicate acks received

PFTK empirical validation: Low loss

PFTK empirical validation: High loss

Loss-based TCP

  Evolution of loss-based TCP
    Tahoe (without fast retransmit)
    Reno (triple duplicate acks + fast retransmit)
    NewReno (Reno + handling multiple losses better)
    SACK (selective acknowledgment): common today

  Q: what if loss is not due to congestion?

Delay-based TCP: Vegas

  Uses delay as a signal of congestion
    Idea: try to keep a small constant number of packets at the bottleneck queue
    Expected = W/BaseRTT
    Actual = W/CurRTT
    Diff = Expected - Actual
    Try to keep Diff between fixed thresholds 1 and 3

  More recent FAST TCP is based on Vegas

  Delay-based TCP not widely used today
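The Expected/Actual/Diff rule can be sketched as a once-per-RTT window adjustment. Two assumptions are made here that the slide does not spell out: Diff is scaled by BaseRTT so that the thresholds 1 and 3 are counted in packets queued at the bottleneck, and the window moves by one packet per RTT. The function name is hypothetical.

```python
def vegas_update(w, base_rtt, cur_rtt, low=1.0, high=3.0):
    """Return the next window given the current window and RTT measurements."""
    expected = w / base_rtt                  # rate if no packets were queued
    actual = w / cur_rtt                     # rate actually achieved
    diff = (expected - actual) * base_rtt    # assumed scaling: packets in queue
    if diff < low:
        return w + 1      # too few packets queued: speed up
    if diff > high:
        return w - 1      # too many packets queued: slow down
    return w              # inside the target band: hold
```

With a window of 20 and BaseRTT of 100 ms, a current RTT of 125 ms corresponds to about 4 queued packets, so the window shrinks; at 110 ms (about 1.8 packets) it is held.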

TCP-Friendliness

  Can we try MyFavNew TCP?
    Well, is it TCP-friendly?

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP

  To co-exist with TCP, it must impose the same long-term load on the network
    No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
    Not significantly less long-term throughput, or it's not too useful

TCP friendly rate control (TFRC)

Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
    If transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
    Otherwise increase the transmission rate

  E.g., DCCP (Datagram Congestion Control Protocol) for congestion-controlled unreliable transport

  Q: how to measure/use loss rate and RTT?
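The rule above can be sketched as a single rate-adjustment step. For illustration this uses the simple model B = 1.22/(RTT·√p) derived earlier in the lecture as the throughput model; a deployed scheme would more plausibly use a model that accounts for timeouts. The step size of one packet per RTT and the cap at the model's rate are assumptions, and the function names are hypothetical.

```python
import math


def tcp_model_rate(p_loss, rtt):
    """Simple TCP model: B = 1.22 / (rtt * sqrt(p)) packets per second."""
    return 1.22 / (rtt * math.sqrt(p_loss))


def tfrc_step(rate, p_loss, rtt, increase=1.0):
    """One adjustment of the sending rate (packets/sec)."""
    target = tcp_model_rate(p_loss, rtt)
    if rate > target:
        return target                  # reduce to the model's rate
    # otherwise increase, here by `increase` packets per RTT (assumed),
    # without overshooting the model's rate
    return min(rate + increase / rtt, target)
```

The inputs `p_loss` and `rtt` are taken as given; how to measure and smooth them is exactly the question the slide leaves open.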

High speed TCP

TCP in high speed networks

  Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput

  Requires window size W = 83,333 in-flight segments

  Throughput in terms of loss rate: $B = \frac{1.22 \cdot \text{MSS}}{\text{RTT}\,\sqrt{p}}$

  $p = 2 \cdot 10^{-10}$, or equivalently at most one drop every couple of hours

  New versions of TCP for high-speed networks needed
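The numbers in this example follow from the simple model B = 1.22/√p packets per RTT. A short check, with illustrative function names:

```python
def required_window(throughput_bps, rtt_s, segment_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (segment_bytes * 8)


def tolerable_loss_rate(window):
    """Invert W = 1.22 / sqrt(p), treating W as the average window."""
    return (1.22 / window) ** 2


w = required_window(10e9, 0.100, 1500)       # about 83,333 segments
p = tolerable_loss_rate(w)                   # about 2e-10
packets_per_second = w / 0.100
hours_between_drops = (1 / p) / packets_per_second / 3600
```

One drop per 1/p packets at roughly 833,000 packets per second works out to between one and two hours between drops.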

TCP's long recovery delay

  More than an hour to recover from a loss or timeout

[Figure annotations: ~41000 packets; ~60000 RTTs; ~100 minutes]

High-speed TCP

  Proposals
    Scalable TCP, HSTCP, FAST, CUBIC
    General idea is to use superlinear window increase
    Particularly useful in high bandwidth-delay product regimes

Alternate choices of response functions

Scalable TCP: S = 0.15/p

Q: Whatever happened to TCP-friendly?

High speed TCP [Floyd]

  additive increase, multiplicative decrease

  increments, decrements depend on window size

Scalable TCP (STCP) [T. Kelly]

  multiplicative increase, multiplicative decrease

W ← W + a per ACK
W ← W - b·W per window with loss
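A small sketch of why a constant per-ACK increment gives multiplicative increase: about W ACKs arrive per RTT, so the window grows by roughly a·W each RTT. The constants a = 0.01 and b = 0.125 are values often quoted for Scalable TCP and are treated as assumptions here; the function names are hypothetical.

```python
def stcp_on_ack(w, a=0.01):
    """W <- W + a for every ACK received."""
    return w + a


def stcp_on_loss(w, b=0.125):
    """W <- W - b*W, once per window that saw a loss."""
    return w - b * w


def stcp_round(w, a=0.01):
    """One loss-free RTT: roughly W ACKs arrive, so W grows by about a*W."""
    for _ in range(int(w)):
        w = stcp_on_ack(w, a)
    return w
```

Starting from W = 100 a loss-free round ends near 101, and from W = 1000 near 1010: the same 1% growth regardless of window size, unlike the fixed 1 MSS per RTT of standard congestion avoidance.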

STCP dynamics

From 1st PFLDnet Workshop, Tom Kelly

Active Queue Management

Router Queue Management

  normally, packets dropped only when queue overflows

  "drop-tail" queueing

[Figure: router connected to the Internet; packets P1 to P6 wait in the queue of an FCFS scheduler]

The case against drop-tail queue management

  Large queues in routers are "a bad thing"
    Delay: end-to-end latency dominated by length of queues at switches in network

  Allowing queues to overflow is "a bad thing"
    Fairness: connections transmitting at high rates can starve connections transmitting at low rates
    Utilization: connections can synchronize their response to congestion

[Figure: FCFS scheduler queue holding packets P1 to P4; packets P5 and P6 shown separately]

Idea: early random packet drop

When queue length exceeds threshold, drop packets with queue-length-dependent probability
    probabilistic packet drop: flows see same loss rate
    problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

[Figure: packets P1 to P6 in the queue of an FCFS scheduler]

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop
    avoid overly penalizing short-term bursts
    react to longer-term trends

  Tie drop prob. to weighted avg. queue length
    avoids over-reaction to mild overload conditions

[Figure: average queue length vs. time, with min threshold, max threshold and max queue length marked; no drop below the min threshold, probabilistic early drop between the thresholds, forced drop above the max threshold]

Random early detection (RED) packet drop

[Figure: average queue length vs. time with min and max thresholds, as on the previous slide]

[Figure: drop probability vs. weighted average queue length: zero below min, rising to maxp at max, then 100%]
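The drop-probability curve and the exponential average can be sketched as follows. The linear ramp between the thresholds is read off the figure; the averaging weight of 0.002 and the class and parameter names are assumptions for illustration.

```python
import random


class RedQueue:
    """Tracks the weighted average queue length and decides on early drops."""

    def __init__(self, min_th, max_th, max_p, weight=0.002):
        self.min_th = min_th      # below this average: never drop
        self.max_th = max_th      # at or above this average: always drop
        self.max_p = max_p        # drop probability just below max_th
        self.weight = weight      # assumed weight of the exponential average
        self.avg = 0.0

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0            # forced drop
        span = self.max_th - self.min_th
        return self.max_p * (self.avg - self.min_th) / span

    def on_arrival(self, queue_len, rng=random):
        """Update the average with the instantaneous length; True means drop."""
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return rng.random() < self.drop_probability()
```

Because the decision depends on the slowly moving average rather than the instantaneous queue length, a short burst that briefly fills the queue barely changes the drop probability.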

RED summary: why random drop?

  Provide gentle transition from no-drop to all-drop
    Provide "gentle" early warning

  Avoid synchronized loss bursts among sources

  Provide same loss rate to all sessions
    With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters: nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed ...

Why is randomization important?

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source: Floyd, Jacobson 1994

Router update operation

[State diagram]
  prepare own routing update (time TC)
  receive update from neighbor while preparing: process it (time TC2)
  <ready>: send update (time Td to arrive at dest), then start_timer (uniform Tp +/- Tr)
  wait; receive update from neighbor: process it
  timeout or link fail: update

time spent in state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis: time until routing update sent, relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization

  Choose random timer component Tr large (e.g., several multiples of TC)

  Add enough randomization to avoid synchronization

[State diagram: prepare own routing update (time TC); receive update from neighbor, process (time TC2); <ready>: send update (time Td to arrive), start_timer (uniform Tp +/- Tr); wait; receive update from neighbor, process]
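The randomized timer in the state diagram, start_timer (uniform Tp +/- Tr), can be sketched in one function. The function name and the example values of Tp and Tr are illustrative assumptions.

```python
import random


def next_update_delay(tp, tr, rng=random):
    """Delay until the next routing update: uniform in [Tp - Tr, Tp + Tr]."""
    return rng.uniform(tp - tr, tp + tr)


# Hypothetical values: a period of 120 time units with a random component of 10.
rng = random.Random(0)
delays = [next_update_delay(120, 10, rng) for _ in range(100)]
```

With Tr = 0 every router would use the same period, so the weak coupling through update processing has nothing to counteract it; a larger Tr spreads the timers apart again.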

Randomization

  Takeaway message
    randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for

  Non-deadline-critical

  Unlimited demand

  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers
    Self-interference: applications hurt their own performance
    Cross-interference: applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
    Applications say what they want
    OS manages resources and scheduling

  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?

  TCP does network resource management
    Need flow prioritization

  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy

  Question
    Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal
    Congestion ⇒ incr. queue lengths ⇒ incr. RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice

  Basic algorithm
    1. Early detection: thresh. queue length ⇒ incr. in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)

  per-ack operation
    if (curRTT > minRTT + threshold·(maxRTT - minRTT)) numCong++

  per-round operation
    if (numCong > f·W) W ← W/2
    else ... AIMD congestion control
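The per-ack and per-round operations can be sketched as a small sender-side state machine. The default values threshold = 0.2 and f = 0.5, the class name, and the plain "+1" standing in for the regular AIMD branch are assumptions for illustration.

```python
class NiceSender:
    """Counts ACKs with inflated RTT and halves the window on early congestion."""

    def __init__(self, threshold=0.2, fraction=0.5, w=2.0):
        self.threshold = threshold    # fraction of (maxRTT - minRTT), assumed
        self.fraction = fraction      # f in the per-round test, assumed
        self.w = w                    # congestion window, may drop below 1.0
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1        # this ACK saw a queue building up

    def on_round_end(self):
        if self.num_cong > self.fraction * self.w:
            self.w = self.w / 2       # early multiplicative decrease
        else:
            self.w = self.w + 1       # stand-in for the regular AIMD branch
        self.num_cong = 0
```

Letting `w` fall below 1.0 means sending less than one packet per RTT, which is how many background flows can share a small amount of spare capacity.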

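Nice: the works

  Non-interference: getting out of the way in time

  Utilization: maintaining a small queue

minRTT = τ, maxRTT = τ + B/µ

[Figure: queue occupancy (pkts) vs. time, with buffer size B, threshold t·B and service rate µ marked; sequences of additive-increase (Add) and multiplicative-decrease (Mul) phases are shown for Reno and for Nice]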

Network Conditions

[Figure: foreground document latency (sec, log scale from 0.1 to 1e3) vs. spare capacity (1 to 100) for Reno, Vegas, V0, Nice and Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale from 0.1 to 1e3) vs. number of BG flows (1 to 100) for Vegas, V0, Nice, Router Prio and Reno]

  W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: BG throughput (KB, 0 to 8e4) vs. number of BG flows (1 to 100) for Router Prio, Vegas, V0, Reno and Nice]

  Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?

  How to model the interaction between TCP and the network?
    Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem

  How to allocate resources (e.g., bandwidth) to optimize some objective function?

  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

Model

  Network: links l, each of capacity c_l

  Sources s: (L(s), U_s(x_s))
    L(s): links used by source s
    U_s(x_s): utility if source rate = x_s

[Figure: two links with capacities c1 and c2 and three sources with rates x1, x2, x3]

$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$

[Figure: U_s(x_s) vs. x_s, example utility function for elastic application]

Q: What are possible allocations with, say, unit capacity links?

Optimization Problem

"system" problem:

$$\max_{x_s \ge 0} \; \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$

  maximize system utility (note: all sources "equal")

  constraint: bandwidth used less than capacity

  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources

  can we view congestion control as distributed asynchronous algorithms to solve this problem?

The user view

  User can choose amount to pay per unit time, w_s

  Would like allocated bandwidth x_s in proportion to w_s: x_s = w_s / p_s

  p_s could be viewed as charge per unit flow for user s

user problem:

$$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$$

(first term: user's utility; second term: cost)

The network view

  Suppose network knows vector {w_s}, chosen by users

  Network wants to maximize logarithmic utility function

network problem:

$$\max_{x_s \ge 0} \; \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

Solution existence

  There exist prices p_s, source rates x_s, and amount-to-pay-per-unit-time w_s = p_s x_s such that
    w_s solves the user problem
    x_s solves the network problem
    x_s is the unique solution to the system problem

network problem: $\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$

user problem: $\max_{w_s \ge 0} U_s(w_s / p_s) - w_s$

system problem: $\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$

Proportional Fairness

  Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s^*}:

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

  Result: if w_r = 1, then {x_s} solves the network problem iff it is proportionally fair

  Similar result exists for the case that w_r is not equal to 1

Max-min Fairness

Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
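One standard way to compute a max-min fair allocation is progressive filling: raise all rates together and freeze every flow that crosses a link once that link saturates. The slides do not prescribe an algorithm, so this is an illustrative sketch; the function name is hypothetical, `routes[s]` lists the links used by flow s, and every flow is assumed to use at least one link.

```python
def max_min_rates(routes, capacity):
    """Progressive filling; routes[s] is the tuple of link indices of flow s."""
    rates = [0.0] * len(routes)
    remaining = list(capacity)
    active = set(range(len(routes)))
    while active:
        # Equal share each link could still give to its active flows.
        shares = {}
        for l, cap in enumerate(remaining):
            n = sum(1 for s in active if l in routes[s])
            if n:
                shares[l] = cap / n
        inc = min(shares.values())
        for s in active:
            rates[s] += inc
        for l in shares:
            n = sum(1 for s in active if l in routes[s])
            remaining[l] -= inc * n
        # Flows crossing a saturated link are frozen at their current rate.
        saturated = {l for l in shares if remaining[l] <= 1e-12}
        active = {s for s in active
                  if not any(l in saturated for l in routes[s])}
    return rates
```

For three flows where flow 0 uses both links and flows 1 and 2 use one link each, unit capacities give every flow 0.5. Raising the second link's capacity to 2 leaves flows 0 and 1 at 0.5 and lets flow 2 reach 1.5.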

Minimum potential delay fairness

  Rates {x_r} are minimum potential delay fair if U_r(x_r) = -w_r / x_r

Interpretation: if w_r is file size, then w_r / x_r is transfer time; optimization problem is to minimize sum of transfer delays

Max-min Fairness

Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$

Solving the network problem

  Results so far: existence, i.e., a solution exists with the given properties

  How to compute the solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

where

$$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$

  left-hand side: change in bandwidth allocation at s
  k·w_s term: linear increase
  k·x_s(t)·Σ p_l(t) term: multiplicative decrease
  p_l(t): congestion "signal", a function of the aggregate rate at link l, fed back to s

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  Results
    converges to solution of relaxation of network problem
    x_s(t) Σ p_l(t) converges to w_s

  Interpretation: TCP-like algorithm iteratively solves the optimal rate allocation
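The differential equation can be integrated numerically to watch x_s Σ p_l approach w_s. This is a sketch under stated assumptions: forward-Euler integration, a price function g_l(y) = (y/c_l)^steepness chosen only for illustration, and a three-source, two-link topology in which source 0 crosses both links. All names and parameter values are hypothetical.

```python
def link_prices(x, routes, capacity, steepness):
    """p_l = g_l(aggregate rate on link l), with an assumed g_l(y) = (y/c_l)**steepness."""
    prices = []
    for l, c in enumerate(capacity):
        load = sum(rate for rate, route in zip(x, routes) if l in route)
        prices.append((load / c) ** steepness)
    return prices


def simulate(routes, capacity, w, k=0.1, dt=0.01, steps=20000, steepness=10):
    """Forward-Euler integration of dx_s/dt = k (w_s - x_s * sum of path prices)."""
    x = [0.1] * len(routes)
    for _ in range(steps):
        p = link_prices(x, routes, capacity, steepness)
        x = [xs + dt * k * (ws - xs * sum(p[l] for l in route))
             for xs, ws, route in zip(x, w, routes)]
    return x


routes = [(0, 1), (0,), (1,)]          # source 0 uses both links
x = simulate(routes, capacity=[1.0, 1.0], w=[1.0, 1.0, 1.0])
```

With unit capacities and w_s = 1 the rates settle close to the proportionally fair allocation (1/3, 2/3, 2/3). They slightly exceed it because the price function only approximates the hard capacity constraint, which is the "relaxation" mentioned in the results.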

Source Algorithm

$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$

  Source needs only its path price q_r

  k_r(·): nonnegative, nondecreasing function

  Above algorithm converges to unique solution for any initial condition

  q_r interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation?

  Suppose network charges price q_r ($/bit), where q_r = Σ p_l

  User's strategy: spend w_r ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$

  then

$$U(x) = -\frac{1}{Tx}$$

  interpretation: minimize (weighted) delay

Is AIMD special?

  Consider a window control as follows
    cwnd += a·cwnd^n when no loss
    cwnd -= b·cwnd^m when loss
    where n < m

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (n, m)

  Consider the controller

  where

  Then at equilibrium

  Where α = m - n. For stability

Motivation

Congestion Control: maximize user utility
  Given routing R_li, how to adapt end rate x_i?

Traffic Engineering: minimize network congestion
  Given traffic x_i, how to perform routing R_li?

Congestion Control Model

$$\max_{x} \; \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$

  objective: aggregate utility

  constraints: capacity constraints

  Users are indexed by i

[Figure: utility U_i(x_i) vs. source rate x_i]

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

$$\min_{R} \; \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l$$

  objective: aggregate cost

  Links are indexed by l

[Figure: cost f(u_l) vs. link utilization u_l, with u_l = 1 marked]

Traffic engineering avoids bottlenecks in the network

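Model of Internet Reality

Congestion Control: $\max_{x} \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$

Traffic Engineering: $\min_{R} \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$

[Figure: the two problems linked by the quantities x_i and R_li]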

System Properties

  Convergence

  Does it achieve some objective?

  Benchmark: $\max_{x, R} \sum_i U_i(x_i)$ s.t. $Rx \le c$

  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Why congestion control

Causescosts of congestion scenario 1

  two senders two receivers

  one router infinite buffers

  no retransmission

  large delays when congested

  throughput staurates

unlimited shared output link buffers

Host A λin original data

Host B

λout

Causescosts of congestion scenario 2

  one router finite buffers   sender retransmission of lost

packet

finite shared output link buffers

Host A λin original data

Host B

λout

λin original data plus retransmitted data

Causescosts of congestion scenario 2   always (goodput)   ldquoperfectrdquo retransmission when only loss

  retransmission of delayed (not lost) packet makes larger (than perfect case) for same

λ13in

λ13out = λ13

in λ13out gt

λ13in

λ13out

ldquocostsrdquo of congestion   more work (retransmission) for given ldquogoodputrdquo   unneeded retransmissions link carries multiple copies of pkt

R2

R2 λin

λ out

b

R2

R2 λin

λ out

a

R2

R2 λin

λ out

c

R4

R3

Causescosts of congestion scenario 3   four senders   multihop paths   timeoutretransmit

λ13in

Q what happens as and increase λ13

in

finite shared output link buffers

Host A λin original data

Host B

λout

λin original data plus retransmitted data

Causescosts of congestion scenario 3

Another ldquocostrdquo of congestion   when packet dropped any ldquoupstream

transmission capacity used for that packet was wasted

Host A

Host B

λout

Two broad approaches towards congestion control

End-end congestion control

  no explicit feedback from network

  congestion inferred from end-system observed loss delay

  approach taken by TCP

Network-assisted congestion control

  routers provide feedback to endhosts   single bit indicating

congestion (SNA DECbit ATM TCPIP ECN)

  explicit rate sender should send

  recent proposals [XCP] [RCP] revisit ATM ideas

TCP congestion control

Components of TCP congestion control

  Slow start  Multiplicatively increase (double) window

  Congestion avoidance  Additively increase (by 1 MSS) window

  Loss  Multiplicatively decrease (halve) window

  Timeout  Set cwnd to 1 MSS  Multiplicatively increase (double) retransmission

timeout upon each further consecutive loss

Retransmission timeout estimation

  Calculate EstimatedRTT using moving average

  Calculate deviation wrt moving average

  Timeout = EstimatedRTT + 4DevRTT

EstimatedRTTi = (1- α)EstimatedRTTi-1 + αSampleRTTi

DevRTTi = (1-β)DevRTTi-1 + β|SampleRTTi-EstimatedRTTi-1|

TCP Throughput

TCP throughput A very very simple model

  Whatrsquos the average throughout of TCP as a function of window size and RTT T  Ignore slow start  Let W be the window size when loss occurs

  When window is W throughput is WT   Just after loss window drops to W2

throughput to W2T   Average throughput 3W4T

TCP throughput A very simple model

  But what is W when loss occurs

    When window is w and queue has q packets TCP is

sending at rate w(T+qC)   For maintaining utilization and steady state

 Just before loss rate = W(T+QC) = C  Just after loss rate = W2T = C   For Q = CT (a common thumbrule to set router buffer

sizes) a loss occurs every frac14 (34W)Q = 3W28 packets

Q = queue capacity in number of packets

C = link capacity in packetssec

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

sum=

+=++⎟⎠

⎞⎜⎝

⎛ ++2

0)

2(1

22

W

nnWWWW

sum=

+⎟⎠

⎞⎜⎝

⎛ +=2

021

2

W

nnWW

2)12(2

21

2+

+⎟⎠

⎞⎜⎝

⎛ +=WWWW

WW43

83 2 +=

packets sent per ldquoperiodrdquo =

2

83Wasymp

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

packets sent per ldquoperiodrdquo 2

83Wasymp

1 packet lost per ldquoperiodrdquo implies ploss 23

8W

asymp or lossp

W38

=

rttpackets

43utavg_thrup WB ==

rttpackets221utavg_thrup

losspB ==

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]

  With many flows loss rate and delay are not affected much by a single TCP flow  TCP behavior completely specified by loss

and delay pattern along path (bounded by bottleneck capacity)

  Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account

What is PFTK modeling

  Independent loss probability p across rounds  Loss acute triple duplicate acks  Bursty loss in a round if some packet lost

all following packets in that round also lost   Timeout if lt three duplicate acks received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP

  Evolution of loss-based TCP  Tahoe (without fast retransmit)  Reno (triple duplicate acks + fast

retransmit)  NewReno (Reno + handling multiple losses

better)  SACK (selective acknowledgment) common

today   Q what if loss not due to congestion

Delay-based TCP Vegas

  Uses delay as a signal of congestion  Idea try to keep a small constant number of

packets at bottleneck queue  Expected = WBaseRTT  Actual = WCurRTT  Diff = Expected - Actual  Try to keep Diff between fixed 1 and 3

  More recent FAST TCP based on Vegas  Delay-based TCP not widely used today

TCP-Friendliness

  Can we try MyFavNew TCP  Well is it TCP-friendly

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP

  To co-exist with TCP it must impose the same long-term load on the network  No greater long-term throughput as a function of

packet loss and delay so TCP doesnt suffer  Not significantly less long-term throughput or its

not too useful

TCP friendly rate control (TFRC)

Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm

 If transmission rate is higher than that given by the model reduce the transmission rate to the models rate

 Otherwise increase the transmission rate  Eg DCCP (Datagram Congestion Control

Protocol) for unreliable congestion control  Q how to measureuse loss rate and RTT

High speed TCP

TCP in high speed networks

  Example 1500 byte segments 100ms RTT want 10 Gbps throughput

  Requires window size W = 83333 in-flight segments   Throughput in terms of loss rate

  13 p = 210-10 or equivalently at most one drop every couple hours

  New versions of TCP for high-speed networks needed

TCPrsquos long recovery delay

  More than an hour to recover from a loss or timeout

~41000 packets

~60000 RTTs ~100 minutes

High-speed TCP

  Proposals  Scalable TCP HSTCP FAST CUBIC  General idea is to use superlinear window

increase  Particularly useful in high bandwidth-delay

product regimes

Alternate choices of response functions

Scalable TCP - S = 015p

Q Whatever happened to TCP-friendly

High speed TCP [Floyd]

  additive increase multiplicative decrease

  increments decrements depend on window size

Scalable TCP (STCP) [T Kelly]

  multiplicative increase multiplicative decrease

W larr W + a per ACK W larr W ndash b W per window with loss

STCP dynamics

From 1st PFLDnet Workshop Tom Kelly13

Active Queue Management

Router Queue Management

  normally packets dropped only when queue overflows   ldquodrop-tailrdquo queueing

router Internet

P113P213P313P413P513P613FCFS13

Scheduler13

router

The case against drop-tail queue management

  Large queues in routers are ldquoa bad thingrdquo  Delay end-to-end latency dominated by length

of queues at switches in network   Allowing queues to overflow is ldquoa bad thingrdquo

 Fairness connections transmitting at high rates can starve connections transmitting at low rates

 Utilization connections can synchronize their response to congestion

P113P213P313P413FCFS

Scheduler P513P613

Idea early random packet drop

When queue length exceeds threshold drop packets with queue length dependent probability  probabilistic packet drop flows see same loss

rate  problem bursty traffic (burst arrives when

queue is near threshold) can be over penalized

P113P213P313P413P513P613FCFS

Scheduler

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop  avoid overly penalizing short-term bursts   react to longer term trends

  Tie drop prob to weighted avg queue length  avoids over-reaction to mild overload conditions

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

Random early detection (RED) packet drop

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

10013

Drop probability

maxp13

Weighted AverageQueue Length

min13 max13

RED summary why random drop

  Provide gentle transition from no-drop to all-drop  Provide ldquogentlerdquo early warning  Avoid synchronized loss bursts among

sources   Provide same loss rate to all sessions

 With tail-drop low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed hellip

Why randomization important

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source Floyd Jacobson 1994

Router update operation

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive at dest)

start_timer (uniform Tp +- Tr)

timeout or link fail

update

time spent in state depends on msgs

received from others (weak coupling

between routers processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization   Choose random

timer component Tr large (eg several multiples of TC)

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough

randomization to avoid

synchronization

Randomization

  Takeaway message  randomization makes a system simple and

robust

Background transport TCP Nice

What are background transfers

  Data that humans are not waiting for   Non-deadline-critical   Unlimited demand

  Examples  Prefetched traffic on the Web  File system backup  Large-scale data distribution services  Background software updates  Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers  Self-interference

bull  applications hurt their own performance  Cross-interference

bull  applications hurt other applicationsrsquo performance

TCP Nice

  Goal abstraction of free infinite bandwidth   Applications say what they want

 OS manages resources and scheduling

  Self tuning transport layer  Reduces risk of interference with foreground

traffic  Significant utilization of spare capacity by

background traffic  Simplifies application design

Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal  Congestion incr queue lengths incr RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control  Receiver and network unchanged  TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

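Nice: the works

- Non-interference: getting out of the way in time
- Utilization: maintaining a small queue

minRTT = τ, maxRTT = τ + B/μ

[Figure: queue length (pkts) over time for Reno and Nice, with levels B and t·B and rate μ marked; additive (Add) and multiplicative (Mul) phases are labeled on each trace.]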

Network Conditions

[Figure: foreground document latency (sec, log scale, 0.1 to 1e3) versus spare capacity (1 to 100) for Reno, Vegas, V0, Nice, and Router Prio.]

- Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale, 0.1 to 1e3) versus number of BG flows (1 to 100) for Vegas, V0, Nice, Router Prio, and Reno.]

- W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: BG throughput (KB, 0 to 8e4) versus number of BG flows (1 to 100) for Router Prio, Vegas, V0, Reno, and Nice.]

- Nice utilizes 50-80% of spare capacity without stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

- Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
- How to model the interaction between TCP and the network?
  - Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem

- How to allocate resources (e.g., bandwidth) to optimize some objective function
- Maybe not possible to obtain exact optimality, but
  - optimization framework as means to explicitly steer network towards desirable operating point
  - practical congestion control as distributed asynchronous implementations of optimization algorithm
- Systematic approach towards protocol design

Model

- Network: links l, each of capacity c_l
- Sources s: (L(s), U_s(x_s))
  - L(s): links used by source s
  - U_s(x_s): utility if source rate = x_s

[Figure: two links with capacities c1 and c2 carrying sources with rates x1, x2, x3.]

$$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$$

[Figure: U_s(x_s) versus x_s, an example utility function for an elastic application.]

Q: What are possible allocations with, say, unit capacity links?
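A few candidate answers to the question above can be worked out numerically for the two-link example (x1 + x2 <= c1, x1 + x3 <= c2, with c1 = c2 = 1). The sketch below assumes logarithmic utilities for the third candidate; that choice, the function name, and the grid resolution are assumptions for illustration, since the slide leaves the utility function generic.

```python
import math


def total_log_utility(x1, c1=1.0, c2=1.0):
    """Sum of log utilities when sources 2 and 3 take the leftover capacity."""
    x2 = c1 - x1
    x3 = c2 - x1
    return math.log(x1) + math.log(x2) + math.log(x3)


# Candidate allocations (x1, x2, x3) on unit-capacity links.
max_throughput = (0.0, 1.0, 1.0)  # total rate 2, but the two-link source starves
equal_share = (0.5, 0.5, 0.5)     # every source gets the same rate, total 1.5

# Grid search for the x1 that maximizes the sum of log utilities.
grid = [i / 10000 for i in range(1, 10000)]
best_x1 = max(grid, key=total_log_utility)
log_utility_optimum = (best_x1, 1.0 - best_x1, 1.0 - best_x1)
```

Setting the derivative 1/x1 - 2/(1 - x1) to zero gives x1 = 1/3, so the grid search lands next to (1/3, 2/3, 2/3): the source that uses two links gets less than an equal share but is not starved.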

Optimization Problem

- maximize system utility (note: all sources "equal")
- constraint: bandwidth used less than capacity
- centralized solution to optimization impractical
  - must know all utility functions
  - impractical for large number of sources
- can we view congestion control as distributed asynchronous algorithms to solve this problem?

"system" problem:

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$

The user view

- User can choose amount to pay per unit time, w_s
- Would like allocated bandwidth x_s in proportion to w_s: $x_s = w_s / p_s$
- p_s could be viewed as charge per unit flow for user s

user problem (user's utility minus cost):

$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

The network view

- Suppose network knows vector {w_s} chosen by users
- Network wants to maximize logarithmic utility function

network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

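Solution existence

- There exist prices p_s, source rates x_s, and amount-to-pay-per-unit-time w_s = p_s x_s such that
  - {w_s} solves the user problem
  - {x_s} solves the network problem
  - {x_s} is the unique solution to the system problem

network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

user problem:

$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

system problem:

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$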

Proportional Fairness

- Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x'_s},

$$\sum_{s \in S} \frac{x'_s - x_s}{x_s} \le 0$$

- Result: if w_r = 1, then {x_s} solves the network problem iff it is proportionally fair
- A similar result exists for the case that w_r is not equal to 1
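The definition can be spot-checked numerically on the earlier two-link example with unit capacities (x1 + x2 <= 1, x1 + x3 <= 1). The sketch below is a sanity check rather than a proof: it tests the candidate (1/3, 2/3, 2/3), which maximizes the sum of log rates on that topology, against randomly sampled feasible vectors. The helper names and the sampling scheme are assumptions for illustration.

```python
import random


def proportional_fairness_gap(x, other):
    """Sum over sources of (other_s - x_s) / x_s.

    For a proportionally fair x this is <= 0 for every feasible `other`.
    """
    return sum((o - xs) / xs for xs, o in zip(x, other))


def random_feasible(rng):
    """Random rates satisfying x1 + x2 <= 1 and x1 + x3 <= 1."""
    x1 = rng.uniform(0.0, 1.0)
    return (x1, rng.uniform(0.0, 1.0 - x1), rng.uniform(0.0, 1.0 - x1))


rng = random.Random(1)
candidate = (1 / 3, 2 / 3, 2 / 3)
worst_gap = max(
    proportional_fairness_gap(candidate, random_feasible(rng))
    for _ in range(10000)
)

# The equal split is feasible but not proportionally fair: moving to the
# candidate raises the summed proportional change above zero.
equal_split_gap = proportional_fairness_gap((0.5, 0.5, 0.5), candidate)
```

For the candidate, the gap simplifies to 3·x'1 + 1.5·x'2 + 1.5·x'3 - 3, which the two capacity constraints keep at or below zero.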

Max-min Fairness

Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
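One standard way to compute a max-min fair allocation is progressive filling: raise all rates together, freeze the sources that cross a link as soon as it saturates, and continue with the rest. The slide does not name this technique; the function name, the dictionary layout, and the example topology below are assumptions for illustration.

```python
def max_min_rates(routes, capacity):
    """Progressive filling.

    routes:   {source: set of links it uses}
    capacity: {link: capacity}
    """
    rates = {s: 0.0 for s in routes}
    remaining = dict(capacity)
    active = set(routes)
    while active:
        # Equal share each link could still give to its active sources.
        shares = {}
        for link, cap in remaining.items():
            users = [s for s in active if link in routes[s]]
            if users:
                shares[link] = cap / len(users)
        if not shares:
            break
        increment = min(shares.values())
        # Raise every active source by the tightest link's share.
        for s in active:
            rates[s] += increment
            for link in routes[s]:
                remaining[link] -= increment
        # Freeze sources that cross a link with no capacity left.
        saturated = {link for link in shares if remaining[link] <= 1e-12}
        active = {s for s in active if not (routes[s] & saturated)}
    return rates


# Hypothetical topology: source 1 uses links a and b, source 2 only a,
# source 3 only b; link b has twice the capacity of link a.
example = max_min_rates(
    routes={1: {"a", "b"}, 2: {"a"}, 3: {"b"}},
    capacity={"a": 1.0, "b": 2.0},
)
```

Here link a saturates first with sources 1 and 2 at 0.5 each, after which source 3 alone absorbs the remaining capacity of link b and ends at 1.5. Raising any rate would then require lowering a rate that is already no larger, which is what the definition demands.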

Minimum potential delay fairness

- Rates {x_r} are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
- Interpretation: if w_r is file size, then w_r / x_r is transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
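The expression inside the limit above is commonly treated as a one-parameter family of utilities. The sketch below evaluates it for finite α; the function name is an assumption, and the α = 1 case is handled with the conventional log x, since the formula itself is undefined there. With α = 2 it gives -1/x, which matches the minimum potential delay utility for w_r = 1.

```python
import math


def alpha_fair_utility(x, alpha):
    """x^(1 - alpha) / (1 - alpha), with log(x) used at alpha == 1."""
    if x <= 0:
        raise ValueError("rate must be positive")
    if alpha == 1:
        return math.log(x)
    return x ** (1 - alpha) / (1 - alpha)


def total_utility(rates, alpha):
    return sum(alpha_fair_utility(x, alpha) for x in rates)


# Two hypothetical allocations with the same total rate; the first has
# the larger minimum rate.
balanced = (0.5, 0.5, 1.5)
skewed = (0.4, 0.6, 1.5)
prefers_balanced = total_utility(balanced, 20) > total_utility(skewed, 20)
```

As α grows, the smallest rate dominates the sum, so the allocation with the larger minimum wins; this is the intuition behind the limit giving max-min fairness.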

Solving the network problem

- Results so far: existence - solution exists with given properties
- How to compute solution?
  - Ideally: distributed solution easily embodied in protocol
  - Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

where

$$p_l(t) = g_l\!\left( \sum_{s : l \in L(s)} x_s(t) \right)$$

- d/dt x_s(t): change in bandwidth allocation at s
- k w_s: linear increase
- k x_s(t) Σ p_l(t): multiplicative decrease
- p_l(t): congestion "signal", a function of the aggregate rate at link l, fed back to s

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

- Results
  - converges to solution of "relaxation" of network problem
  - x_s(t) Σ p_l(t) converges to w_s
- Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
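The convergence claim can be illustrated with a small Euler simulation of the rate update d/dt x_s(t) = k (w_s - x_s(t) Σ p_l(t)) on a single shared link. Everything specific here is an assumption for the sketch: the price function (y/c)^4, the gain k, the step size, and the two weights were picked only to produce a well-behaved example.

```python
def link_price(aggregate_rate, capacity):
    """Hypothetical congestion signal: grows steeply as load nears capacity."""
    return (aggregate_rate / capacity) ** 4


def simulate(weights, capacity, k=1.0, dt=0.01, steps=5000):
    """Euler integration of dx_s/dt = k * (w_s - x_s * p) on one link."""
    rates = [0.1] * len(weights)
    for _ in range(steps):
        price = link_price(sum(rates), capacity)
        rates = [x + dt * k * (w - x * price) for x, w in zip(rates, weights)]
    return rates, link_price(sum(rates), capacity)


weights = [1.0, 2.0]
rates, price = simulate(weights, capacity=10.0)
payments = [x * price for x in rates]  # approaches the weights w_s
```

At the fixed point each source satisfies x_s · p = w_s, so the rates end up in the ratio of the weights (here 1:2), which is the weighted proportionally fair split of the link.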

Source Algorithm

$$\dot{x}_r = k_r(x_r)\left( U_r'(x_r) - q_r \right)$$

- Source needs only its path price
- k_r(·): nonnegative, nondecreasing function
- Above algorithm converges to unique solution for any initial condition
- q_r interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

$$U_r(x_r) = w_r \log x_r$$

then a controller that implements it is given by

$$\dot{x}_r = k_r(x_r)\left( \frac{w_r}{x_r} - q_r \right)$$

Pricing interpretation

- Can network choose pricing scheme to achieve fair resource allocation?
- Suppose network charges price q_r ($/bit), where q_r = Σ p_l
- User's strategy: spend w_r ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

- suppose

$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$

- then

$$U(x) = -\frac{1}{Tx}$$

- interpretation: minimize (weighted) delay
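The rate formula above, x = sqrt(2(1-p)) / (T sqrt(p)) and its small-p approximation sqrt(2) / (T sqrt(p)), is easy to evaluate directly. The sketch below only does that arithmetic; the function names and the sample values of p and T are assumptions chosen for illustration.

```python
import math


def reno_rate(p, rtt):
    """x = sqrt(2 * (1 - p)) / (T * sqrt(p)), in packets per second."""
    return math.sqrt(2.0 * (1.0 - p)) / (rtt * math.sqrt(p))


def reno_rate_approx(p, rtt):
    """Small-p approximation: x = sqrt(2) / (T * sqrt(p))."""
    return math.sqrt(2.0) / (rtt * math.sqrt(p))


# Hypothetical operating point: 1% loss, 100 ms round-trip time.
p, rtt = 0.01, 0.1
exact = reno_rate(p, rtt)
approx = reno_rate_approx(p, rtt)
ratio = exact / approx  # equals sqrt(1 - p), so close to 1 for small p
```

The two expressions differ by the factor sqrt(1 - p), so at 1% loss the approximation overstates the rate by about half a percent.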

Is AIMD special?

- Consider a window control as follows
  - cwnd += a * cwnd^n when no loss
  - cwnd -= b * cwnd^m when loss
  - where n < m
- Expected change in congestion window
- Expected change in rate per unit time
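One way to read the window rule above is per acknowledgment: with probability 1 - p the window grows by a·cwnd^n, and with probability p it shrinks by b·cwnd^m. Under that assumption (which is an interpretation made for this sketch, along with the function names and sample parameters), the expected change is (1 - p)·a·w^n - p·b·w^m, and setting it to zero gives an equilibrium window.

```python
import math


def expected_window_change(w, p, a, b, n, m):
    """(1 - p) * a * w^n - p * b * w^m, assuming one update per ack."""
    return (1.0 - p) * a * w ** n - p * b * w ** m


def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero (requires n < m)."""
    if not n < m:
        raise ValueError("expected n < m")
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))


# AIMD written per ack: +1/cwnd on success (n = -1), -cwnd/2 on loss (m = 1).
p = 0.01
w_aimd = equilibrium_window(p, a=1.0, b=0.5, n=-1, m=1)
residual = expected_window_change(w_aimd, p, a=1.0, b=0.5, n=-1, m=1)
```

For these AIMD parameters the equilibrium window is sqrt(2(1 - p)/p); dividing by the round-trip time T recovers the simplified Reno rate sqrt(2(1 - p)) / (T sqrt(p)).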

MIMD (n, m)

- Consider the controller
- where
- Then at equilibrium
- where α = m - n. For stability

Motivation

- Congestion Control: maximize user utility
  - Given routing R_li, how to adapt end rate x_i?
- Traffic Engineering: minimize network congestion
  - Given traffic x_i, how to perform routing R_li?

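Congestion Control Model

$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$

- Objective: aggregate utility, with utility U_i(x_i) of source rate x_i
- Constraints: capacity constraints
- Users are indexed by i

[Figure: utility U_i(x_i) versus source rate x_i.]

Congestion control provides fair rate allocation amongst users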

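Traffic Engineering Model

$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l$$

- Objective: aggregate cost, with cost f(u_l) of link utilization u_l
- Links are indexed by l

[Figure: cost f(u_l) versus link utilization u_l, with u_l = 1 marked.]

Traffic engineering avoids bottlenecks in the network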
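The link utilization u_l = Σ_i R_li x_i / c_l that appears in the traffic engineering model can be computed directly from a routing matrix. The sketch below assumes R is given as a list of rows, one per link, with R[l][i] the fraction of user i's traffic on link l; the function name and the example numbers are hypothetical.

```python
def link_utilization(routing, rates, capacity):
    """Return u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [
        sum(r_li * x_i for r_li, x_i in zip(row, rates)) / c_l
        for row, c_l in zip(routing, capacity)
    ]


# Hypothetical example: two links, three users.
routing = [
    [1.0, 1.0, 0.0],  # link 0 carries users 0 and 1
    [1.0, 0.0, 1.0],  # link 1 carries users 0 and 2
]
rates = [0.2, 0.5, 0.3]
capacity = [1.0, 1.0]

utilization = link_utilization(routing, rates, capacity)
bottleneck = max(range(len(utilization)), key=utilization.__getitem__)
```

With these numbers link 0 runs at about 0.7 and link 1 at about 0.5, so link 0 is the one a traffic engineering step would try to relieve by changing R.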

Model of Internet Reality

- Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$ (adapts x_i, given R_li)
- Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$ (adapts R_li, given x_i)

System Properties

- Convergence
- Does it achieve some objective?
- Benchmark: $\max_{x, R} \sum_i U_i(x_i)$ s.t. $Rx \le c$
- Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Causescosts of congestion scenario 2

  one router finite buffers   sender retransmission of lost

packet

finite shared output link buffers

Host A λin original data

Host B

λout

λin original data plus retransmitted data

Causescosts of congestion scenario 2   always (goodput)   ldquoperfectrdquo retransmission when only loss

  retransmission of delayed (not lost) packet makes larger (than perfect case) for same

λ13in

λ13out = λ13

in λ13out gt

λ13in

λ13out

ldquocostsrdquo of congestion   more work (retransmission) for given ldquogoodputrdquo   unneeded retransmissions link carries multiple copies of pkt

R2

R2 λin

λ out

b

R2

R2 λin

λ out

a

R2

R2 λin

λ out

c

R4

R3

Causescosts of congestion scenario 3   four senders   multihop paths   timeoutretransmit

λ13in

Q what happens as and increase λ13

in

finite shared output link buffers

Host A λin original data

Host B

λout

λin original data plus retransmitted data

Causescosts of congestion scenario 3

Another ldquocostrdquo of congestion   when packet dropped any ldquoupstream

transmission capacity used for that packet was wasted

Host A

Host B

λout

Two broad approaches towards congestion control

End-end congestion control

  no explicit feedback from network

  congestion inferred from end-system observed loss delay

  approach taken by TCP

Network-assisted congestion control

  routers provide feedback to endhosts   single bit indicating

congestion (SNA DECbit ATM TCPIP ECN)

  explicit rate sender should send

  recent proposals [XCP] [RCP] revisit ATM ideas

TCP congestion control

Components of TCP congestion control

  Slow start  Multiplicatively increase (double) window

  Congestion avoidance  Additively increase (by 1 MSS) window

  Loss  Multiplicatively decrease (halve) window

  Timeout  Set cwnd to 1 MSS  Multiplicatively increase (double) retransmission

timeout upon each further consecutive loss

Retransmission timeout estimation

  Calculate EstimatedRTT using moving average

  Calculate deviation wrt moving average

  Timeout = EstimatedRTT + 4DevRTT

EstimatedRTTi = (1- α)EstimatedRTTi-1 + αSampleRTTi

DevRTTi = (1-β)DevRTTi-1 + β|SampleRTTi-EstimatedRTTi-1|

TCP Throughput

TCP throughput A very very simple model

  Whatrsquos the average throughout of TCP as a function of window size and RTT T  Ignore slow start  Let W be the window size when loss occurs

  When window is W throughput is WT   Just after loss window drops to W2

throughput to W2T   Average throughput 3W4T

TCP throughput A very simple model

  But what is W when loss occurs

    When window is w and queue has q packets TCP is

sending at rate w(T+qC)   For maintaining utilization and steady state

 Just before loss rate = W(T+QC) = C  Just after loss rate = W2T = C   For Q = CT (a common thumbrule to set router buffer

sizes) a loss occurs every frac14 (34W)Q = 3W28 packets

Q = queue capacity in number of packets

C = link capacity in packetssec

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

sum=

+=++⎟⎠

⎞⎜⎝

⎛ ++2

0)

2(1

22

W

nnWWWW

sum=

+⎟⎠

⎞⎜⎝

⎛ +=2

021

2

W

nnWW

2)12(2

21

2+

+⎟⎠

⎞⎜⎝

⎛ +=WWWW

WW43

83 2 +=

packets sent per ldquoperiodrdquo =

2

83Wasymp

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

packets sent per ldquoperiodrdquo 2

83Wasymp

1 packet lost per ldquoperiodrdquo implies ploss 23

8W

asymp or lossp

W38

=

rttpackets

43utavg_thrup WB ==

rttpackets221utavg_thrup

losspB ==

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]

  With many flows loss rate and delay are not affected much by a single TCP flow  TCP behavior completely specified by loss

and delay pattern along path (bounded by bottleneck capacity)

  Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account

What is PFTK modeling

  Independent loss probability p across rounds  Loss acute triple duplicate acks  Bursty loss in a round if some packet lost

all following packets in that round also lost   Timeout if lt three duplicate acks received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP

  Evolution of loss-based TCP  Tahoe (without fast retransmit)  Reno (triple duplicate acks + fast

retransmit)  NewReno (Reno + handling multiple losses

better)  SACK (selective acknowledgment) common

today   Q what if loss not due to congestion

Delay-based TCP Vegas

  Uses delay as a signal of congestion  Idea try to keep a small constant number of

packets at bottleneck queue  Expected = WBaseRTT  Actual = WCurRTT  Diff = Expected - Actual  Try to keep Diff between fixed 1 and 3

  More recent FAST TCP based on Vegas  Delay-based TCP not widely used today

TCP-Friendliness

- Can we try MyFavNew TCP?
  - Well, is it TCP-friendly?

- Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP

- To co-exist with TCP, it must impose the same long-term load on the network
  - No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
  - Not significantly less long-term throughput, or it's not too useful

TCP friendly rate control (TFRC)

Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm

- If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate

- Otherwise, increase the transmission rate
- E.g., DCCP (Datagram Congestion Control Protocol), which provides congestion control for unreliable datagram flows
- Q: how to measure/use loss rate and RTT?
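The TFRC rule can be sketched in a few lines. This is a simplified illustration, not the DCCP/TFRC specification: it plugs in the simple 1.22/(RTT sqrt(p)) model rather than a timeout-aware one, and the function name and the additive `step` used for the increase are hypothetical.

```python
import math

def tfrc_adjust(rate, p, rtt, step=1.0):
    """Return the next sending rate in packets/sec.

    rate: current sending rate, p: measured loss rate (> 0),
    rtt: measured round-trip time in seconds.
    """
    model_rate = 1.22 / (rtt * math.sqrt(p))   # TCP-equivalent rate
    if rate > model_rate:
        return model_rate                      # cap at what TCP would get
    return rate + step                         # otherwise probe upward

print(tfrc_adjust(2000.0, p=0.01, rtt=0.1))
print(tfrc_adjust(100.0, p=0.01, rtt=0.1))
```

How p and RTT are measured and smoothed, the open question on the slide, determines how smooth the resulting rate is.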

High speed TCP

TCP in high speed networks

- Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput

- Requires window size W = 83333 in-flight segments
- Throughput in terms of loss rate: $W \approx 1.22/\sqrt{p}$ packets per RTT

- So $p = 2 \cdot 10^{-10}$, or equivalently at most one drop every couple of hours

- New versions of TCP for high-speed networks needed
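The numbers in this example can be reproduced with a short calculation. This is a sketch using the 1.22/sqrt(p) throughput model; the function name and return layout are illustrative.

```python
def high_speed_requirements(segment_bytes=1500, rtt=0.1, target_bps=10e9):
    """Return (window in segments, tolerable loss rate, hours between drops)."""
    window = target_bps * rtt / (segment_bytes * 8)   # segments in flight
    loss_rate = (1.22 / window) ** 2                  # from W = 1.22 / sqrt(p)
    packets_per_sec = window / rtt
    hours_between_drops = 1.0 / (loss_rate * packets_per_sec) / 3600.0
    return window, loss_rate, hours_between_drops

w, p, hours = high_speed_requirements()
print(round(w), p, round(hours, 2))
```

The required loss rate is far below what real links deliver, which is the motivation for the high-speed variants discussed next.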

TCP's long recovery delay

- More than an hour to recover from a loss or timeout

[Figure: window vs. time after a loss; annotations: ~41000 packets, ~60000 RTTs, ~100 minutes]

High-speed TCP

- Proposals
  - Scalable TCP, HSTCP, FAST, CUBIC
  - General idea is to use superlinear window increase
  - Particularly useful in high bandwidth-delay product regimes

Alternate choices of response functions

Scalable TCP: S = 0.15/p

Q: Whatever happened to TCP-friendly?

High speed TCP [Floyd]

- additive increase, multiplicative decrease

- increments, decrements depend on window size

Scalable TCP (STCP) [T. Kelly]

- multiplicative increase, multiplicative decrease

W ← W + a per ACK; W ← W - b·W per window with loss
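The STCP update rule can be sketched as follows. The default values a = 0.01 and b = 0.125 are the ones usually quoted for Scalable TCP and should be treated as assumptions here; the function names are illustrative. The last function counts how many RTTs it takes to climb back to the pre-loss window, using the fact that W ACKs arrive per RTT, so the window grows by a·W per RTT.

```python
def stcp_on_ack(w, a=0.01):
    """Window after one ACK: W <- W + a."""
    return w + a

def stcp_on_loss(w, b=0.125):
    """Window after a window with loss: W <- W - b*W."""
    return w - b * w

def rtts_to_recover(peak, a=0.01, b=0.125):
    """RTTs needed to regain the pre-loss window `peak`."""
    w = stcp_on_loss(peak, b)
    rtts = 0
    while w < peak:
        w += a * w        # about w ACKs per RTT, each adding a
        rtts += 1
    return rtts

print(rtts_to_recover(100.0), rtts_to_recover(10000.0))
```

The recovery time comes out the same for both window sizes, which is the point of multiplicative increase: recovery no longer scales with the bandwidth-delay product.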

STCP dynamics

From 1st PFLDnet Workshop, Tom Kelly

Active Queue Management

Router Queue Management

- normally, packets dropped only when queue overflows
- "drop-tail" queueing

[Figure: packets P1 to P6 queued at a router with an FCFS scheduler, on the path between router and Internet]

The case against drop-tail queue management

- Large queues in routers are "a bad thing"
  - Delay: end-to-end latency dominated by length of queues at switches in network
- Allowing queues to overflow is "a bad thing"
  - Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  - Utilization: connections can synchronize their response to congestion

Idea: early random packet drop

When queue length exceeds threshold, drop packets with queue-length-dependent probability
- probabilistic packet drop: flows see same loss rate
- problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

Random early detection (RED) packet drop

- Use exponential average of queue length to determine when to drop
  - avoid overly penalizing short-term bursts
  - react to longer term trends

- Tie drop prob. to weighted avg. queue length
  - avoids over-reaction to mild overload conditions

[Figure: average queue length vs. time, relative to the max queue length; no drop below the min threshold, probabilistic early drop between the min and max thresholds, forced drop above the max threshold]

[Figure: drop probability vs. weighted average queue length; zero below min, rising to maxp at max, then 100%]
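The two RED ingredients, the exponentially weighted average and the piecewise drop probability, can be sketched as below. This is a minimal illustration under stated assumptions: the linear ramp between the thresholds follows the figure, while the function names and the default averaging weight are hypothetical choices, not values from the lecture.

```python
def ewma_queue(avg, sample, weight=0.002):
    """Exponentially weighted average of the instantaneous queue length."""
    return (1.0 - weight) * avg + weight * sample

def red_drop_probability(avg, min_th, max_th, max_p):
    """Drop probability as a function of the average queue length."""
    if avg < min_th:
        return 0.0                      # no drop
    if avg >= max_th:
        return 1.0                      # forced drop
    # Probabilistic early drop: ramp linearly from 0 up to max_p.
    return max_p * (avg - min_th) / (max_th - min_th)

avg = 0.0
for q in [50] * 500:                    # sustained queue of 50 packets
    avg = ewma_queue(avg, q)
print(avg, red_drop_probability(avg, min_th=10, max_th=30, max_p=0.1))
```

Because the average moves slowly, a short burst barely changes the drop probability, while sustained load pushes it up.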

RED summary: why random drop?

- Provide gentle transition from no-drop to all-drop
  - Provide "gentle" early warning
  - Avoid synchronized loss bursts among sources
- Provide same loss rate to all sessions
  - With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

- Many (5) parameters: nontrivial to tune (at least for HTTP traffic)

- Gains over drop-tail FCFS not that significant

- Still not widely deployed ...

Why is randomization important?

- Synchronization of periodic routing updates

- Periodic losses observed in end-end Internet traffic

source: Floyd, Jacobson 1994

Router update operation

[Figure: state diagram of a router's periodic update cycle]

- On timeout or link failure: prepare own routing update (time TC)
- While preparing: receive update from neighbor, process (time TC2)
- <ready>: send update (time Td to arrive at dest), then start_timer (uniform Tp +- Tr)
- wait: receive update from neighbor, process
- Time spent in each state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

- 20 (simulated) routers broadcasting updates to each other

- x-axis: time until routing update sent, relative to start of round

- By t=100000, all router rounds are of length 120

- synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

- Choose random timer component Tr large (e.g., several multiples of TC)
- Add enough randomization to avoid synchronization

[Figure: the router update state diagram again: prepare own routing update (time TC); receive update from neighbor, process (time TC2); wait; <ready> send update (time Td to arrive); start_timer (uniform Tp +- Tr)]
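The randomized timer in the state diagram, start_timer (uniform Tp +- Tr), can be sketched as a single draw. The function name and the example values are illustrative; only the uniform draw around Tp comes from the slide.

```python
import random

def next_update_delay(tp, tr, rng=random):
    """Delay until the next routing update: uniform on [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)

rng = random.Random(0)                  # seeded only to make the demo repeatable
small_jitter = [next_update_delay(120.0, 0.1, rng) for _ in range(5)]
large_jitter = [next_update_delay(120.0, 10.0, rng) for _ in range(5)]
print(small_jitter)
print(large_jitter)
```

With a small Tr the routers' timers stay nearly identical, so the weak coupling through processing time can pull them into lockstep; a larger Tr spreads the expirations out.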

Randomization

- Takeaway message
  - randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

- Data that humans are not waiting for
- Non-deadline-critical
- Unlimited demand

- Examples
  - Prefetched traffic on the Web
  - File system backup
  - Large-scale data distribution services
  - Background software updates
  - Media file sharing

Desired Properties

- Utilization of spare network capacity

- No interference with regular transfers
  - Self-interference: applications hurt their own performance
  - Cross-interference: applications hurt other applications' performance

TCP Nice

- Goal: abstraction of free infinite bandwidth
- Applications say what they want
  - OS manages resources and scheduling

- Self-tuning transport layer
  - Reduces risk of interference with foreground traffic
  - Significant utilization of spare capacity by background traffic
  - Simplifies application design

Why change TCP?

- TCP does network resource management
  - Need flow prioritization

- Alternative: router prioritization
  - (+) More responsive, simple one-bit priority
  - (-) Hard to deploy

- Question
  - Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

- Proactively detects congestion

- Uses increasing RTT as congestion signal
  - Congestion => incr. queue lengths => incr. RTT

- Aggressive responsiveness to congestion

- Only modifies sender-side congestion control
  - Receiver and network unchanged
  - TCP friendly

TCP Nice

- Basic algorithm
  1. Early detection: thresh. queue length => incr. in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)

- per-ack operation:
    if (curRTT > minRTT + threshold*(maxRTT - minRTT)) numCong++

- per-round operation:
    if (numCong > f*W) W ← W/2
    else ... AIMD congestion control
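The per-ack and per-round operations can be sketched as a small Python class. This is an illustration under assumptions, not the TCP Nice implementation: the default values of `threshold` and `f` are placeholders, minRTT and maxRTT are tracked as running extremes of the samples seen, and the `else` branch uses a plain additive increase of one packet to stand in for the elided AIMD logic (loss handling is omitted).

```python
class NiceSender:
    def __init__(self, w=10.0, threshold=0.2, f=0.5):
        self.w = w                      # congestion window (may drop below 1)
        self.threshold = threshold      # fraction of the RTT range
        self.f = f                      # fraction of the window
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0               # early-congestion signals this round

    def on_ack(self, cur_rtt):
        """Per-ack operation: count ACKs whose RTT indicates queue build-up."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round operation: back off early, else fall back to AIMD."""
        if self.num_cong > self.f * self.w:
            self.w /= 2.0               # multiplicative decrease, no loss needed
        else:
            self.w += 1.0               # stand-in for the AIMD increase
        self.num_cong = 0

s = NiceSender(w=4.0)
for rtt in (0.1, 0.2, 0.2, 0.2):        # most ACKs see inflated RTT
    s.on_ack(rtt)
s.on_round_end()
print(s.w)
```

Because the window is a float, repeated halving can take it below one packet per RTT, which corresponds to step 3 of the basic algorithm.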

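Nice: the works

- Non-interference: getting out of the way in time
- Utilization: maintaining a small queue

minRTT = τ, maxRTT = τ + B/μ

[Figure: queue occupancy (pkts) vs. time for Reno and Nice, with buffer size B, threshold t·B and service rate μ; additive (Add) and multiplicative (Mul) phases marked]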

Network Conditions

[Figure: foreground document latency (sec, log scale 0.1 to 1e3) vs. spare capacity (1 to 100); curves: Reno, Vegas, V0, Nice, Router Prio]

- Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale 0.1 to 1e3) vs. number of BG flows (1 to 100); curves: Vegas, V0, Nice, Router Prio, Reno]

- W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: BG throughput (KB, 0 to 8e4) vs. number of BG flows (1 to 100); curves: Router Prio, Vegas, V0, Reno, Nice]

- Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

- Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?

- How to model the interaction between TCP and the network?
  - Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
- How to allocate resources (e.g., bandwidth) to optimize some objective function
- Maybe not possible to obtain exact optimality, but
  - optimization framework as means to explicitly steer network towards desirable operating point
  - practical congestion control as distributed asynchronous implementations of optimization algorithm
- systematic approach towards protocol design

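Model

- Network: links l, each of capacity $c_l$
- Sources s: $(L(s), U_s(x_s))$
  - $L(s)$: links used by source s
  - $U_s(x_s)$: utility if source rate = $x_s$

[Figure: two links with capacities $c_1$ and $c_2$; source 1 (rate $x_1$) uses both links, source 2 (rate $x_2$) uses link 1, source 3 (rate $x_3$) uses link 2]

$$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$$

[Figure: example utility function $U_s(x_s)$ vs. $x_s$ for an elastic application]

Q: What are possible allocations with, say, unit-capacity links?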

Optimization Problem

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$

("system" problem; $S(l)$ is the set of sources using link l)

- maximize system utility (note: all sources "equal")
- constraint: bandwidth used less than capacity
- centralized solution to optimization impractical
  - must know all utility functions
  - impractical for large number of sources
- can we view congestion control as distributed asynchronous algorithms to solve this problem?

The user view

- User can choose amount to pay per unit time, $w_s$

- Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$

- $p_s$ could be viewed as charge per unit flow for user s

$$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

(first term: user's utility; second term: cost)

user problem

The network view

- Suppose network knows vector $\{w_s\}$ chosen by users
- Network wants to maximize logarithmic utility function

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

network problem

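Solution existence

- There exist prices $p_s$, source rates $x_s$, and amount-to-pay-per-unit-time $w_s = p_s x_s$ such that
  - $\{w_s\}$ solves the user problem
  - $\{x_s\}$ solves the network problem
  - $\{x_s\}$ is the unique solution to the system problem

User problem:

$$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

Network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

System problem:

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$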

Proportional Fairness

- Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$:

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

- Result: if $w_s = 1$, then $\{x_s\}$ solves the network problem IFF it is proportionally fair

- Similar result exists for the case that $w_s$ is not equal to 1
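As a worked example, consider the three-source, two-link network from the Model slide with unit-capacity links: source 1 crosses both links, sources 2 and 3 use one link each. The sketch below finds the proportionally fair rates by brute-force search over $x_1$ with all weights equal to 1, then checks the defining inequality against the equal-share allocation. Function names and the grid resolution are illustrative assumptions.

```python
import math

def proportional_fair_x1(c=1.0, grid=100000):
    """Rate of source 1 maximizing log x1 + log x2 + log x3 with x2 = x3 = c - x1."""
    best_x1, best_u = None, -math.inf
    for i in range(1, grid):
        x1 = c * i / grid
        x2 = x3 = c - x1                # sources 2 and 3 take what is left
        u = math.log(x1) + math.log(x2) + math.log(x3)
        if u > best_u:
            best_x1, best_u = x1, u
    return best_x1

def fairness_sum(x, y):
    """Sum of (y_s - x_s) / x_s; at most 0 for every feasible y if x is proportionally fair."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))

x1 = proportional_fair_x1()
print(x1, 1.0 - x1)
print(fairness_sum((1/3, 2/3, 2/3), (0.5, 0.5, 0.5)))
```

The search settles near $x_1 = 1/3$ and $x_2 = x_3 = 2/3$: the flow that consumes capacity on two links gets less than the single-link flows, but is not starved.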

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
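Max-min fair rates can be computed by progressive filling: raise all rates together, and freeze a source as soon as one of its links saturates. This technique is not described in the lecture text; the sketch below is an illustration that assumes every source uses at least one link, and the link and source names are hypothetical.

```python
def max_min_fair(capacity, routes):
    """capacity: {link: capacity}; routes: {source: [links used]}."""
    rates = {s: 0.0 for s in routes}
    remaining = dict(capacity)
    active = set(routes)
    while active:
        # Number of still-growing sources on each link.
        load = {l: sum(1 for s in active if l in routes[s]) for l in remaining}
        # Largest equal increment before some loaded link fills up.
        inc = min(remaining[l] / n for l, n in load.items() if n > 0)
        for s in active:
            rates[s] += inc
        for l, n in load.items():
            remaining[l] -= inc * n
        saturated = {l for l, n in load.items() if n > 0 and remaining[l] <= 1e-12}
        # Sources crossing a saturated link are frozen at their current rate.
        active = {s for s in active if not (set(routes[s]) & saturated)}
    return rates

routes = {"s1": ["l1", "l2"], "s2": ["l1"], "s3": ["l2"]}
print(max_min_fair({"l1": 1.0, "l2": 1.0}, routes))
print(max_min_fair({"l1": 1.0, "l2": 2.0}, routes))
```

With unit capacities every source gets 1/2; when the second link has capacity 2, source 3 keeps growing after link 1 saturates and ends at 3/2.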

Minimum potential delay fairness

- Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$

Solving the network problem

- Results so far: existence - solution exists with given properties
- How to compute solution?
  - Ideally: distributed solution easily embodied in protocol
  - Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

- left-hand side: change in bandwidth allocation at s
- $k\,w_s$ term: linear increase
- $k\,x_s(t) \sum_{l \in L(s)} p_l(t)$ term: multiplicative decrease

where

$$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$

is the congestion "signal": a function of the aggregate rate at link l, fed back to s

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

- Results
  - converges to solution of "relaxation" of network problem
  - $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$

- Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
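The rate-update equation can be simulated with a simple Euler discretization. This is a sketch under assumptions: the price function $g_l(y) = (y/c_l)^\beta$ is a hypothetical smooth penalty chosen for illustration (the lecture does not specify $g_l$), and the gain, step size and topology names are likewise illustrative. The topology is the three-source, two-link example with unit capacities and unit weights.

```python
def kelly_primal(routes, capacity, w, k=1.0, dt=0.01, steps=20000, beta=4):
    """Iterate dx_s/dt = k (w_s - x_s * sum of link prices on s's path)."""
    x = {s: 0.1 for s in routes}
    for _ in range(steps):
        # Aggregate rate on each link.
        y = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
        # Assumed price function g_l(y) = (y / c_l) ** beta.
        p = {l: (y[l] / capacity[l]) ** beta for l in capacity}
        for s in routes:
            q = sum(p[l] for l in routes[s])        # path price seen by s
            x[s] += dt * k * (w[s] - x[s] * q)
    return x

routes = {"s1": ["l1", "l2"], "s2": ["l1"], "s3": ["l2"]}
x = kelly_primal(routes, {"l1": 1.0, "l2": 1.0}, {"s1": 1.0, "s2": 1.0, "s3": 1.0})
print(x)
```

At the fixed point each source satisfies $x_s \sum_l p_l = w_s$, so the single-link sources end up with twice the rate of the two-link source, the same ratio as the proportionally fair allocation. Because prices are a soft penalty rather than a hard constraint, the link loads only approximately respect capacity, which is the "relaxation" mentioned in the results.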

Source Algorithm

$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$

- Source needs only its path price

- $k_r()$: nonnegative, nondecreasing function
- Above algorithm converges to unique solution for any initial condition
- $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

$$U_r(x_r) = w_r \log x_r$$

then a controller that implements it is given by

$$\dot{x}_r = k_r(x_r) \left( \frac{w_r}{x_r} - q_r \right)$$

Pricing interpretation

- Can network choose pricing scheme to achieve fair resource allocation?

- Suppose network charges price q_r (dollars per bit), where q_r is the sum of the link prices p_l

- User's strategy: spend w_r (dollars per second) to maximize

Optimal User Strategy

- equivalently

Simplified TCP-Reno

- suppose

$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$

- then

$$U(x) = -\frac{1}{T x}$$

- interpretation: minimize (weighted) delay

Is AIMD special?

- Consider a window control as follows:
  - cwnd += a*cwnd^n when no loss
  - cwnd -= b*cwnd^m when loss
  - where n < m

- Expected change in congestion window

- Expected change in rate per unit time
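The lecture's own expressions for the expected change are not preserved in this transcript. As a hedged sketch, assume each ACK independently signals a loss with probability p; then the expected per-ACK change is (1-p)·a·cwnd^n - p·b·cwnd^m, and setting it to zero gives an equilibrium window. The function names are illustrative.

```python
def expected_change(w, p, a, b, n, m):
    """Expected per-ACK window change under the per-ACK loss assumption."""
    return (1.0 - p) * a * w ** n - p * b * w ** m

def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero (requires n < m)."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))

# AIMD as a special case: +1/cwnd per ACK (n = -1), halve on loss (m = 1).
w = equilibrium_window(p=0.02, a=1.0, b=0.5, n=-1, m=1)
print(w, expected_change(w, 0.02, 1.0, 0.5, -1, 1))
```

For the AIMD parameters this gives w = sqrt(2(1-p)/p), i.e. a window that scales as 1/sqrt(p); other choices of n and m change the exponent to 1/(m-n).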

MIMD (n, m)

- Consider the controller

- where

- Then at equilibrium

- where α = m - n. For stability

Motivation

- Congestion Control: maximize user utility
  - Given routing $R_{li}$, how to adapt end rate $x_i$?

- Traffic Engineering: minimize network congestion
  - Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$

- Objective: aggregate utility
- Constraints: capacity constraints
- Users are indexed by i

[Figure: utility $U_i(x_i)$ vs. source rate $x_i$]

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l$$

- Objective: aggregate cost
- Links are indexed by l

[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]

Traffic engineering avoids bottlenecks in the network

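Model of Internet Reality

[Figure: feedback loop; congestion control passes rates $x_i$ to traffic engineering, which passes routing $R_{li}$ back]

- Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$

- Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$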

System Properties

- Convergence
- Does it achieve some objective?
- Benchmark:

$$\max_{x, R} \sum_i U_i(x_i) \quad \text{s.t.} \quad Rx \le c$$

- Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller


1 packet lost per ldquoperiodrdquo implies ploss 23

8W

asymp or lossp

W38

=

rttpackets

43utavg_thrup WB ==

rttpackets221utavg_thrup

losspB ==

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]

  With many flows loss rate and delay are not affected much by a single TCP flow  TCP behavior completely specified by loss

and delay pattern along path (bounded by bottleneck capacity)

  Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account

What is PFTK modeling

  Independent loss probability p across rounds  Loss acute triple duplicate acks  Bursty loss in a round if some packet lost

all following packets in that round also lost   Timeout if lt three duplicate acks received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP

  Evolution of loss-based TCP  Tahoe (without fast retransmit)  Reno (triple duplicate acks + fast

retransmit)  NewReno (Reno + handling multiple losses

better)  SACK (selective acknowledgment) common

today   Q what if loss not due to congestion

Delay-based TCP Vegas

  Uses delay as a signal of congestion  Idea try to keep a small constant number of

packets at bottleneck queue  Expected = WBaseRTT  Actual = WCurRTT  Diff = Expected - Actual  Try to keep Diff between fixed 1 and 3

  More recent FAST TCP based on Vegas  Delay-based TCP not widely used today

TCP-Friendliness

  Can we try MyFavNew TCP  Well is it TCP-friendly

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP

  To co-exist with TCP it must impose the same long-term load on the network  No greater long-term throughput as a function of

packet loss and delay so TCP doesnt suffer  Not significantly less long-term throughput or its

not too useful

TCP friendly rate control (TFRC)

Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm

 If transmission rate is higher than that given by the model reduce the transmission rate to the models rate

 Otherwise increase the transmission rate  Eg DCCP (Datagram Congestion Control

Protocol) for unreliable congestion control  Q how to measureuse loss rate and RTT

High speed TCP

TCP in high speed networks

  Example 1500 byte segments 100ms RTT want 10 Gbps throughput

  Requires window size W = 83333 in-flight segments   Throughput in terms of loss rate

  13 p = 210-10 or equivalently at most one drop every couple hours

  New versions of TCP for high-speed networks needed

TCPrsquos long recovery delay

  More than an hour to recover from a loss or timeout

~41000 packets

~60000 RTTs ~100 minutes

High-speed TCP

  Proposals  Scalable TCP HSTCP FAST CUBIC  General idea is to use superlinear window

increase  Particularly useful in high bandwidth-delay

product regimes

Alternate choices of response functions

Scalable TCP - S = 015p

Q Whatever happened to TCP-friendly

High speed TCP [Floyd]

  additive increase multiplicative decrease

  increments decrements depend on window size

Scalable TCP (STCP) [T Kelly]

  multiplicative increase multiplicative decrease

W larr W + a per ACK W larr W ndash b W per window with loss

STCP dynamics

From 1st PFLDnet Workshop Tom Kelly13

Active Queue Management

Router Queue Management

  normally packets dropped only when queue overflows   ldquodrop-tailrdquo queueing

router Internet

P113P213P313P413P513P613FCFS13

Scheduler13

router

The case against drop-tail queue management

  Large queues in routers are ldquoa bad thingrdquo  Delay end-to-end latency dominated by length

of queues at switches in network   Allowing queues to overflow is ldquoa bad thingrdquo

 Fairness connections transmitting at high rates can starve connections transmitting at low rates

 Utilization connections can synchronize their response to congestion

P113P213P313P413FCFS

Scheduler P513P613

Idea early random packet drop

When queue length exceeds threshold drop packets with queue length dependent probability  probabilistic packet drop flows see same loss

rate  problem bursty traffic (burst arrives when

queue is near threshold) can be over penalized

P113P213P313P413P513P613FCFS

Scheduler

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop  avoid overly penalizing short-term bursts   react to longer term trends

  Tie drop prob to weighted avg queue length  avoids over-reaction to mild overload conditions

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

Random early detection (RED) packet drop

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

10013

Drop probability

maxp13

Weighted AverageQueue Length

min13 max13

RED summary why random drop

  Provide gentle transition from no-drop to all-drop  Provide ldquogentlerdquo early warning  Avoid synchronized loss bursts among

sources   Provide same loss rate to all sessions

 With tail-drop low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed hellip

Why randomization important

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source Floyd Jacobson 1994

Router update operation

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive at dest)

start_timer (uniform Tp +- Tr)

timeout or link fail

update

time spent in state depends on msgs

received from others (weak coupling

between routers processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization   Choose random

timer component Tr large (eg several multiples of TC)

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough

randomization to avoid

synchronization

Randomization

  Takeaway message  randomization makes a system simple and

robust

Background transport TCP Nice

What are background transfers

  Data that humans are not waiting for   Non-deadline-critical   Unlimited demand

  Examples  Prefetched traffic on the Web  File system backup  Large-scale data distribution services  Background software updates  Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers  Self-interference

bull  applications hurt their own performance  Cross-interference

bull  applications hurt other applicationsrsquo performance

TCP Nice

  Goal abstraction of free infinite bandwidth   Applications say what they want

 OS manages resources and scheduling

  Self tuning transport layer  Reduces risk of interference with foreground

traffic  Significant utilization of spare capacity by

background traffic  Simplifies application design

Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal  Congestion incr queue lengths incr RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control  Receiver and network unchanged  TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

Nice the works

  Non-interference getting out of the way in time   Utilization maintaining a small queue

pkts

minRTT = τ13 maxRTT = τ+Βmicro13

B

tB Add Mul +

micro

Reno

Nice Add Add Add

Mul +

Mul +

Network Conditions

01

1

10

100

1e3

1 10 100 Fore

grou

nd D

ocum

ent L

aten

cy (s

ec)

Spare Capacity

Reno

Vegas

V0

Nice

Router Prio

  Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note all sources ldquoequalrdquo)   constraint bandwidth used less than capacity   centralized solution to optimization impractical

 must know all utility functions   impractical for large number of sources  can we view congestion control as distributed

asynchronous algorithms to solve this problem

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0 ldquosystemrdquo problem

The user view

  User can choose amount to pay per unit time ws

  Would like allocated bandwidth xs in proportion to ws

euro

max Usw s

ps

⎝ ⎜

⎠ ⎟ minus ws

subject to ws ge 0

  ps could be viewed as charge per unit flow for user s s

ss pwx =

userrsquos utility cost

user problem

The network view

  Suppose network knows vector ws chosen by users   Network wants to maximize logarithmic utility function

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

network problem

Solution existence

  There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that   Ws solves user

problem   Xs solves the

network problem   Xs is the unique

solution to the system problem

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

0 wsubject to

w Umax

s

ss

ge

minus⎟⎟⎠

⎞⎜⎜⎝

⎛s

s

wp

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0

Proportional Fairness

  Vector of rates xs proportionally fair if feasible and for any other feasible vector xs

0

leminus

sumisinSs s

ss

xxx

  Result if wr=1 then Xs solves the network problem IFF it is proportionally fair

  Similar result exists for the case that wr not equal 1

Max-min Fairness

Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

Minimum potential delay fairness

  Rates xr are minimum potential delay fair if Ur (xr) = -wrxr

Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays

Max-min Fairness

rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

What is corresponding utility function

α

α

α minus=

minus

infinrarr 1lim)(

1r

rrxxU

Solving the network problem   Results so far existence - solution exists

with given properties   How to compute solution

 Ideally distributed solution easily embodied in protocol

 Should reveal insight into existing protocol

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

congestion ldquosignalrdquo function of aggregate rate at link l fed back to s

change in bandwidth

allocation at s

linear increase

multiplicative decrease

⎟⎟⎠

⎞⎜⎜⎝

⎛= sum

isin

)()()(txgtp

sLlsllwhere

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

  Results   converges to solution of relaxation of network

problem  xs(t)Σpl(t) converges to ws

  Interpretation TCP-like algorithm to iteratively solves optimal rate allocation

Source Algorithm

  Source needs only its path price

  kr() nonnegative nondecreasing function   Above algorithm converges to unique

solution for any initial condition   qr interpreted as lossmarking probability euro

˙ x r = kr (xr )(Ur (xr ) minus qr)

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation

  Suppose network charges price qr ($bit) where qr=sum pl

  Userrsquos strategy spend wr ($sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

  then

  interpretation minimize (weighted) delay

pTpTp

x 2)1(2asymp

minus=

TxxU 1)( minus=

Is AIMD special

  Consider a window control as follows  cwnd += acwnd^n when no loss  cwnd -= bcwnd^m when loss  where nltm

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (nm)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

Congestion Control maximize user

utility

Traffic Engineering minimize network

congestion Given routing Rli how to adapt end rate xi

Given traffic xi how to perform routing Rli

Congestion Control Model

$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$

• objective: aggregate utility; constraints: capacity constraints
• source rate $x_i$, utility $U_i(x_i)$
• Users are indexed by i

Congestion control provides fair rate allocation amongst users

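Traffic Engineering Model

$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l$$

• objective: aggregate cost
• link utilization $u_l$, cost $f(u_l)$
• Links are indexed by l

[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]

Traffic engineering avoids bottlenecks in the network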

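Model of Internet Reality

[Figure: congestion control and traffic engineering coupled through the rates $x_i$ and the routing $R_{li}$]

• Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
• Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$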

System Properties

• Convergence
• Does it achieve some objective?
• Benchmark: $\max_{x, R} \sum_i U_i(x_i)$ s.t. $R x \le c$
• Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Causes/costs of congestion: scenario 3

• four senders
• multihop paths
• timeout/retransmit

Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?

[Figure: Host A and Host B send through routers with finite shared output link buffers. $\lambda_{in}$: original data; $\lambda'_{in}$: original data plus retransmitted data; $\lambda_{out}$]

Causes/costs of congestion: scenario 3

Another "cost" of congestion:
• when a packet is dropped, any "upstream" transmission capacity used for that packet was wasted

[Figure: Host A, Host B, $\lambda_{out}$]

Two broad approaches towards congestion control

End-end congestion control:
• no explicit feedback from network
• congestion inferred from end-system observed loss, delay
• approach taken by TCP

Network-assisted congestion control:
• routers provide feedback to end hosts
    - single bit indicating congestion (SNA, DECbit, ATM, TCP/IP ECN)
    - explicit rate sender should send at
• recent proposals [XCP] [RCP] revisit ATM ideas

TCP congestion control

Components of TCP congestion control

• Slow start
    - Multiplicatively increase (double) window
• Congestion avoidance
    - Additively increase (by 1 MSS) window
• Loss
    - Multiplicatively decrease (halve) window
• Timeout
    - Set cwnd to 1 MSS
    - Multiplicatively increase (double) retransmission timeout upon each further consecutive loss
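The four components listed on this slide can be sketched as a small per-RTT state machine. This is a simplified illustration and not a real TCP stack: windows are counted in MSS, updates happen once per RTT rather than per ACK, and the `ssthresh` field, its default value, and the initial `rto` are assumptions that do not appear on the slide.

```python
from dataclasses import dataclass


@dataclass
class RenoWindow:
    cwnd: float = 1.0       # congestion window, in MSS
    ssthresh: float = 64.0  # slow-start threshold in MSS (hypothetical default)
    rto: float = 1.0        # retransmission timeout in seconds (hypothetical default)

    def on_rtt_without_loss(self):
        if self.cwnd < self.ssthresh:
            # Slow start: double the window, but do not overshoot ssthresh.
            self.cwnd = min(2 * self.cwnd, self.ssthresh)
        else:
            # Congestion avoidance: add 1 MSS per RTT.
            self.cwnd += 1.0

    def on_triple_duplicate_ack(self):
        # Loss: halve the window and continue in congestion avoidance.
        self.ssthresh = max(self.cwnd / 2, 1.0)
        self.cwnd = self.ssthresh

    def on_timeout(self):
        # Timeout: restart from 1 MSS and double the retransmission timeout.
        self.ssthresh = max(self.cwnd / 2, 1.0)
        self.cwnd = 1.0
        self.rto *= 2


w = RenoWindow(ssthresh=8.0)
for _ in range(4):
    w.on_rtt_without_loss()  # cwnd goes 1, 2, 4, 8, then 9
```

Doubling the timeout on every timeout is a simplification of "each further consecutive loss"; a fuller model would also reset the timeout once new data is acknowledged.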

Retransmission timeout estimation

• Calculate EstimatedRTT using a moving average
• Calculate deviation w.r.t. the moving average
• Timeout = EstimatedRTT + 4·DevRTT

$$\text{EstimatedRTT}_i = (1 - \alpha)\,\text{EstimatedRTT}_{i-1} + \alpha\,\text{SampleRTT}_i$$

$$\text{DevRTT}_i = (1 - \beta)\,\text{DevRTT}_{i-1} + \beta\,|\text{SampleRTT}_i - \text{EstimatedRTT}_{i-1}|$$
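A minimal sketch of these two moving averages follows. The gains α = 1/8 and β = 1/4 and the initial deviation of half the first sample are the values recommended in RFC 6298; the slide does not state them, so they are assumptions here. The class name is hypothetical.

```python
class RttEstimator:
    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2  # assumed initial deviation (RFC 6298 style)

    def update(self, sample_rtt):
        # The deviation uses the previous estimate, so it is updated first.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(0.100)
est.update(0.200)  # one slow sample widens the timeout
```

After the single 200 ms sample the estimate moves only to 112.5 ms, while the deviation term pushes the timeout to about 362.5 ms.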

TCP Throughput

TCP throughput: A very, very simple model

• What's the average throughput of TCP as a function of window size and RTT, T?
    - Ignore slow start
    - Let W be the window size when loss occurs
• When window is W, throughput is W/T
• Just after loss, window drops to W/2, throughput to W/(2T)
• Average throughput: 3W/(4T)

TCP throughput: A very simple model

• But what is W when loss occurs?
• When window is w and queue has q packets, TCP is sending at rate w/(T + q/C)
• For maintaining utilization and steady state:
    - Just before loss, rate = W/(T + Q/C) = C
    - Just after loss, rate = (W/2)/T = C
• For Q = CT (a common rule of thumb to set router buffer sizes), the two conditions give W = 2CT = 2Q, and a loss occurs every (3/4 W)·Q = 3W²/8 packets

Q = queue capacity in number of packets
C = link capacity in packets/sec

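Deriving TCP throughput/loss relationship

[Figure: sawtooth of TCP window size vs. time (rtt), oscillating between W/2 and W; one climb from W/2 to W is a "period"]

packets sent per "period" =

$$\frac{W}{2} + \left(\frac{W}{2} + 1\right) + \dots + W = \sum_{n=0}^{W/2} \left(\frac{W}{2} + n\right)$$

$$= \left(\frac{W}{2} + 1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$$

$$= \left(\frac{W}{2} + 1\right)\frac{W}{2} + \frac{(W/2)(W/2 + 1)}{2}$$

$$= \frac{3}{8}W^2 + \frac{3}{4}W \approx \frac{3}{8}W^2$$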

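Deriving TCP throughput/loss relationship

[Figure: sawtooth of TCP window size vs. time (rtt), oscillating between W/2 and W; one climb from W/2 to W is a "period"]

packets sent per "period" $\approx \frac{3}{8}W^2$

1 packet lost per "period" implies

$$p_{loss} \approx \frac{8}{3W^2} \quad \text{or} \quad W = \sqrt{\frac{8}{3\,p_{loss}}}$$

$$B = \text{avg\_thruput} = \frac{3}{4}W \ \text{packets/rtt}$$

$$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}} \ \text{packets/rtt}$$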
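The derivation can be checked numerically. The sketch below counts packets in one sawtooth period directly and evaluates the resulting throughput formula; the function names are hypothetical and W is assumed even so that W/2 is an integer.

```python
import math


def packets_per_period(w):
    """Packets sent while the window climbs from W/2 to W, one step per RTT."""
    return sum(w // 2 + n for n in range(w // 2 + 1))


def window_at_loss(p_loss):
    """W implied by one loss per period: p_loss = 8 / (3 W^2)."""
    return math.sqrt(8.0 / (3.0 * p_loss))


def avg_throughput(p_loss):
    """Average throughput in packets per RTT: B = (3/4) W."""
    return 0.75 * window_at_loss(p_loss)
```

For example, `packets_per_period(100)` gives 3825, which matches 3W²/8 + 3W/4 = 3750 + 75, and `avg_throughput(0.01)` is about 12.25 packets per RTT, in line with 1.22/√p.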

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput: A better loss-rate-based "simple" model [PFTK]

• With many flows, loss rate and delay are not affected much by a single TCP flow
    - TCP behavior completely specified by loss and delay pattern along path (bounded by bottleneck capacity)
• Given loss rate p and delay T, what is TCP's throughput B (packets/sec), taking timeouts into account?

What is PFTK modeling?

• Independent loss probability p across rounds
• Loss indication: triple duplicate acks
• Bursty loss in a round: if some packet is lost, all following packets in that round are also lost
• Timeout if < three duplicate acks received

PFTK empirical validation: Low loss

PFTK empirical validation: High loss

Loss-based TCP

• Evolution of loss-based TCP
    - Tahoe (without fast retransmit)
    - Reno (triple duplicate acks + fast retransmit)
    - NewReno (Reno + handling multiple losses better)
    - SACK (selective acknowledgment): common today
• Q: what if loss is not due to congestion?

Delay-based TCP: Vegas

• Uses delay as a signal of congestion
    - Idea: try to keep a small constant number of packets at bottleneck queue
    - Expected = W/BaseRTT
    - Actual = W/CurRTT
    - Diff = Expected - Actual
    - Try to keep Diff between fixed 1 and 3
• More recent: FAST TCP, based on Vegas
• Delay-based TCP not widely used today

TCP-Friendliness

• Can we try MyFavNew TCP?
    - Well, is it TCP-friendly?
• Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
• To co-exist with TCP, it must impose the same long-term load on the network:
    - No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
    - Not significantly less long-term throughput, or it's not too useful

TCP-friendly rate control (TFRC)

• Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
    - If transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
    - Otherwise increase the transmission rate
• E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
• Q: how to measure/use loss rate and RTT?
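The rule on this slide can be sketched as follows. To stay within the lecture, the sketch plugs in the simple model B = 1.22/√p packets per RTT derived earlier; the standardized TFRC uses a more detailed throughput equation that also accounts for timeouts. The additive probe step and the cap at the model rate are assumptions, since the slide only says "increase".

```python
import math


def model_rate(p_loss, rtt):
    """TCP-model rate in packets/sec, from B = 1.22 / sqrt(p) packets per RTT."""
    return 1.22 / (rtt * math.sqrt(p_loss))


def next_rate(current_rate, p_loss, rtt, step_packets=1.0):
    """One control step of the rule: clamp down to the model, else probe upward."""
    target = model_rate(p_loss, rtt)
    if current_rate > target:
        return target
    # Assumed probe: one extra packet per RTT, never exceeding the model rate.
    return min(current_rate + step_packets / rtt, target)
```

With p = 1% and RTT = 100 ms the model rate is about 122 packets/sec, so a sender at 200 packets/sec is cut to 122, while a sender at 100 packets/sec moves up to 110.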

High speed TCP

TCP in high speed networks

• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83,333 in-flight segments
• Throughput in terms of loss rate: $B = 1.22 / \sqrt{p}$ packets/rtt
• ⇒ p = 2·10^-10, or equivalently at most one drop every couple of hours
• New versions of TCP for high-speed networks needed

TCP's long recovery delay

• More than an hour to recover from a loss or timeout

[Figure: window recovery after a loss; annotations: ~41,000 packets, ~60,000 RTTs, ~100 minutes]
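The numbers in the 10 Gbps example can be reproduced with a few lines of arithmetic. The sketch below assumes the simple model B = 1.22/√p packets per RTT and additive increase of one segment per RTT; the function names are hypothetical.

```python
def required_window(target_bps, rtt, segment_bytes):
    """Segments that must be in flight to sustain target_bps over one RTT."""
    return target_bps * rtt / (segment_bytes * 8)


def tolerable_loss_rate(window):
    """Invert B = 1.22 / sqrt(p) packets per RTT with B = window."""
    return (1.22 / window) ** 2


def seconds_between_drops(window, rtt):
    """Mean time between drops when sending window/rtt packets per second."""
    packets_per_second = window / rtt
    return 1.0 / (tolerable_loss_rate(window) * packets_per_second)


def recovery_seconds_after_halving(window, rtt):
    """Adding 1 segment per RTT takes W/2 RTTs to climb from W/2 back to W."""
    return (window / 2) * rtt


w = required_window(10e9, 0.100, 1500)
```

Here `w` is about 83,333 segments, the tolerable loss rate is about 2.1·10^-10, drops must be roughly 1.5 hours apart, and climbing back from a single halving alone takes about 69 minutes, consistent with "more than an hour".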

High-speed TCP

• Proposals
    - Scalable TCP, HSTCP, FAST, CUBIC
    - General idea is to use superlinear window increase
    - Particularly useful in high bandwidth-delay product regimes

Alternate choices of response functions

[Figure: response functions; Scalable TCP: S = 0.15/p]

Q: Whatever happened to TCP-friendly?

High speed TCP [Floyd]

• additive increase, multiplicative decrease
• increments, decrements depend on window size

Scalable TCP (STCP) [T. Kelly]

• multiplicative increase, multiplicative decrease

W ← W + a per ACK
W ← W - b·W per window with loss
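One consequence of the STCP rule is that recovery time after a loss does not depend on the window size. The sketch below illustrates this under stated assumptions: roughly one window of ACKs arrives per RTT, so W grows by a factor (1 + a) per RTT, and the values a = 0.01 and b = 0.125 are commonly quoted STCP settings that are not given on the slide.

```python
import math


def stcp_recovery_rtts(a=0.01, b=0.125):
    """RTTs for W(1 - b) to grow back to W when W is multiplied by (1 + a) per RTT."""
    return math.log(1.0 / (1.0 - b)) / math.log(1.0 + a)


def simulate_stcp_recovery(window, a=0.01, b=0.125):
    """Count whole RTTs until the window is back at its pre-loss value."""
    w = window * (1.0 - b)  # multiplicative decrease after a lossy window
    rtts = 0
    while w < window:
        w += a * w          # a per ACK, about w ACKs per RTT
        rtts += 1
    return rtts
```

Both a 100-segment and an 83,333-segment window recover in 14 RTTs here (about 13.4 by the closed form), whereas additive increase of one segment per RTT would need W/2 RTTs.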

STCP dynamics

From 1st PFLDnet Workshop, Tom Kelly

Active Queue Management

Router Queue Management

• normally, packets dropped only when queue overflows
    - "drop-tail" queueing

[Figure: router attached to the Internet; packets P1 to P6 queued at a FCFS scheduler]

The case against drop-tail queue management

• Large queues in routers are "a bad thing"
    - Delay: end-to-end latency dominated by length of queues at switches in network
• Allowing queues to overflow is "a bad thing"
    - Fairness: connections transmitting at high rates can starve connections transmitting at low rates
    - Utilization: connections can synchronize their response to congestion

[Figure: packets P1 to P6 at a FCFS scheduler]

Idea: early random packet drop

When queue length exceeds threshold, drop packets with queue-length-dependent probability
• probabilistic packet drop: flows see same loss rate
• problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

[Figure: packets P1 to P6 at a FCFS scheduler]

Random early detection (RED) packet drop

• Use exponential average of queue length to determine when to drop
    - avoid overly penalizing short-term bursts
    - react to longer-term trends
• Tie drop prob. to weighted avg. queue length
    - avoids over-reaction to mild overload conditions

[Figure: average queue length vs. time, with min threshold, max threshold and max queue length marked. Below min threshold: no drop; between the thresholds: probabilistic early drop; above max threshold: forced drop]

Random early detection (RED) packet drop

[Figure, left: average queue length vs. time, with min threshold, max threshold and max queue length marked; regions: no drop, probabilistic early drop, forced drop. Right: drop probability vs. weighted average queue length, zero below min, rising to max_p at max, then 100%]
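The drop-probability curve described on these slides can be sketched in a few lines. This is a simplified illustration, not the full RED algorithm: the ramp between the thresholds is assumed linear, the averaging weight is an illustrative value, and the class and parameter names are hypothetical.

```python
import random


class RedQueue:
    def __init__(self, min_th, max_th, max_p, weight=0.002, rng=None):
        self.min_th = min_th    # below this average: never drop
        self.max_th = max_th    # at or above this average: always drop
        self.max_p = max_p      # drop probability just below max_th
        self.weight = weight    # gain of the exponential average (assumed value)
        self.avg = 0.0
        self.rng = rng or random.Random()

    def update_average(self, queue_len):
        # Exponentially weighted average, so short bursts barely move it.
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        # Assumed linear ramp from 0 at min_th to max_p at max_th.
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def should_drop(self, queue_len):
        self.update_average(queue_len)
        return self.rng.random() < self.drop_probability()


red = RedQueue(min_th=5, max_th=15, max_p=0.1, weight=0.5)
```

Because the decision uses the averaged queue length rather than the instantaneous one, a burst that briefly fills the buffer raises the drop probability only gradually.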

RED summary: why random drop?

• Provide gentle transition from no-drop to all-drop
    - Provide "gentle" early warning
• Avoid synchronized loss bursts among sources
• Provide same loss rate to all sessions
    - With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

• Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
• Gains over drop-tail FCFS not that significant
• Still not widely deployed …

Why randomization is important

• Synchronization of periodic routing updates
• Periodic losses observed in end-end Internet traffic

Source: Floyd, Jacobson 1994

Router update operation

[Figure: state diagram of a router's periodic update cycle]
• wait: leaves on timeout or link fail; a received update from a neighbor is processed
• prepare own routing update (time T_C); while preparing, receive update from neighbor, process (time T_C2)
• ready: send update (time T_d to arrive at dest), start_timer (uniform T_p ± T_r)
• time spent in state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

• 20 (simulated) routers broadcasting updates to each other
• x-axis: time until routing update sent, relative to start of round
• By t = 100,000, all router rounds are of length 120
• synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

• Choose random timer component T_r large (e.g., several multiples of T_C)
• Add enough randomization to avoid synchronization

[Figure: router update state diagram: prepare own routing update (time T_C); receive update from neighbor, process (time T_C2); wait; ready: send update (time T_d to arrive), start_timer (uniform T_p ± T_r)]
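The randomized timer, start_timer(uniform T_p ± T_r), amounts to drawing each router's next update delay uniformly around the nominal period. A minimal sketch, with a hypothetical function name and illustrative values for T_p and T_r:

```python
import random


def next_update_delay(t_p, t_r, rng=random):
    """Delay until the next routing update: uniform on [T_p - T_r, T_p + T_r]."""
    return rng.uniform(t_p - t_r, t_p + t_r)


rng = random.Random(0)  # seeded only to make the example repeatable
delays = [next_update_delay(120.0, 10.0, rng) for _ in range(1000)]
```

The larger T_r is relative to the processing time T_C, the more the routers' timers are spread out, which is what breaks up the weak coupling that otherwise pulls them into lockstep.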

Randomization

• Takeaway message:
    - randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

• Data that humans are not waiting for
• Non-deadline-critical
• Unlimited demand
• Examples
    - Prefetched traffic on the Web
    - File system backup
    - Large-scale data distribution services
    - Background software updates
    - Media file sharing

Desired Properties

• Utilization of spare network capacity
• No interference with regular transfers
    - Self-interference: applications hurt their own performance
    - Cross-interference: applications hurt other applications' performance

TCP Nice

• Goal: abstraction of free, infinite bandwidth
    - Applications say what they want
    - OS manages resources and scheduling
• Self-tuning transport layer
    - Reduces risk of interference with foreground traffic
    - Significant utilization of spare capacity by background traffic
    - Simplifies application design

Why change TCP?

• TCP does network resource management
    - Need flow prioritization
• Alternative: router prioritization
    (+) More responsive, simple one-bit priority
    (-) Hard to deploy
• Question:
    - Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

• Proactively detects congestion
• Uses increasing RTT as congestion signal
    - Congestion ⇒ incr. queue lengths ⇒ incr. RTT
• Aggressive responsiveness to congestion
• Only modifies sender-side congestion control
    - Receiver and network unchanged
    - TCP friendly

TCP Nice

• Basic algorithm
    1. Early detection: thresh queue length ⇒ incr. in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)
• per-ack operation:
    - if (curRTT > minRTT + threshold·(maxRTT - minRTT)) numCong++
• per-round operation:
    - if (numCong > f·W) W ← W/2
    - else … AIMD congestion control

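Nice: the works

• Non-interference: getting out of the way in time
• Utilization: maintaining a small queue

[Figure: queue occupancy (pkts) over time at a bottleneck with buffer B and service rate μ, where minRTT = τ and maxRTT = τ + B/μ. Reno: additive increase (Add) until the buffer B fills, then multiplicative decrease (Mul). Nice: additive increase with multiplicative decrease triggered once the queue passes the threshold t·B]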
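The per-ack and per-round operations of TCP Nice can be sketched as a small sender-side class. This is an illustration only: the defaults `threshold=0.2` and `fraction=0.5`, the tracking of minRTT and maxRTT from observed samples, and the plain "+1 per round" stand-in for the regular AIMD branch are assumptions, and the class name is hypothetical.

```python
class NiceSender:
    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold  # how far above minRTT counts as early congestion
        self.fraction = fraction    # f: share of a round's ACKs that must look congested
        self.cwnd = cwnd            # window W, allowed to fall below 1
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        # Per-ack operation: count ACKs whose RTT signals a growing queue.
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        # Per-round operation: halve on early congestion, else normal increase.
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2
        else:
            self.cwnd += 1.0  # stand-in for the usual AIMD behaviour
        self.num_cong = 0
        return self.cwnd
```

Since the window is halved on delay alone, repeated early-congestion rounds can push `cwnd` below one packet, which is the property that lets many background flows share a link without crowding out foreground traffic.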

Network Conditions

[Figure: foreground document latency (sec, log scale from 0.1 to 1e3) vs. spare capacity (1 to 100), for Reno, Vegas, V0, Nice and Router Prio]

• Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale from 0.1 to 1e3) vs. number of background flows (1 to 100), for Vegas, V0, Nice, Router Prio and Reno]

• W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: background throughput (KB, 0 to 8e4) vs. number of background flows (1 to 100), for Router Prio, Vegas, V0, Reno and Nice]

• Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

• Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
• How to model the interaction between TCP and the network?
    - Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem:
• How to allocate resources (e.g., bandwidth) to optimize some objective function?
• Maybe not possible to obtain exact optimality, but...
    - optimization framework as means to explicitly steer network towards desirable operating point
    - practical congestion control as distributed asynchronous implementations of optimization algorithm
    - systematic approach towards protocol design

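Model

• Network: links l, each of capacity $c_l$
• Sources s: (L(s), $U_s(x_s)$)
    - L(s): links used by source s
    - $U_s(x_s)$: utility if source rate = $x_s$

[Figure: two links with capacities $c_1$ and $c_2$; source 1 (rate $x_1$) crosses both links, source 2 (rate $x_2$) uses the first, source 3 (rate $x_3$) uses the second]

$$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$$

[Figure: example utility function $U_s(x_s)$ vs. $x_s$ for an elastic application]

Q: What are possible allocations with, say, unit-capacity links?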

Optimization Problem

"system" problem:

$$\max_{x_s \ge 0} \sum_s U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$

• maximize system utility (note: all sources "equal")
• constraint: bandwidth used less than capacity
• centralized solution to optimization impractical
    - must know all utility functions
    - impractical for large number of sources
• can we view congestion control as distributed asynchronous algorithms to solve this problem?

The user view

• User can choose amount to pay per unit time, $w_s$
• Would like allocated bandwidth $x_s$ in proportion to $w_s$:

$$x_s = \frac{w_s}{p_s}$$

• $p_s$ could be viewed as charge per unit flow for user s

user problem:

$$\max_{w_s} \ U_s\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

where the first term is the user's utility and the second term is the cost.

The network view

• Suppose network knows vector $\{w_s\}$, chosen by users
• Network wants to maximize logarithmic utility function

network problem:

$$\max_{x_s \ge 0} \sum_s w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

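Solution existence

• There exist prices $p_s$, source rates $x_s$, and amount-to-pay-per-unit-time $w_s = p_s x_s$ such that
    - $\{w_s\}$ solves the user problem
    - $\{x_s\}$ solves the network problem
    - $\{x_s\}$ is the unique solution to the system problem

user problem:

$$\max_{w_s} \ U_s\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

network problem:

$$\max_{x_s \ge 0} \sum_s w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

system problem:

$$\max_{x_s \ge 0} \sum_s U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$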

Proportional Fairness

• Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$:

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

• Result: if $w_r = 1$, then $\{x_s\}$ solves the network problem iff it is proportionally fair
• A similar result exists for the case that $w_r$ is not equal to 1

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then $\exists\, p$ such that $x_p \le x_s$ and $y_p < x_p$

Minimum potential delay fairness

• Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then $\exists\, p$ such that $x_p \le x_s$ and $y_p < x_p$

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1 - \alpha}$$
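The fairness notions above can be compared on the earlier two-link example, with unit capacities, source 1 crossing both links and sources 2 and 3 using one link each. The sketch below is a worked illustration under those assumptions, with unit weights. Since utilities are increasing, both links fill, so x2 = x3 = 1 - x1; setting the derivative of U(x1) + 2U(1 - x1) to zero for U(x) = x^(1-α)/(1-α) gives x1 = 1/(1 + 2^(1/α)). The function names are hypothetical.

```python
def alpha_fair_rates(alpha):
    """Rates (x1, x2, x3) maximizing the sum of x^(1-alpha)/(1-alpha) on two unit links."""
    # Stationarity: x1^(-alpha) = 2 * (1 - x1)^(-alpha); alpha = 1 is the log-utility case.
    x1 = 1.0 / (1.0 + 2.0 ** (1.0 / alpha))
    return (x1, 1.0 - x1, 1.0 - x1)


def proportional_fairness_sum(x, other):
    """Left-hand side of the proportional fairness test: sum of (x*_s - x_s) / x_s."""
    return sum((o - r) / r for r, o in zip(x, other))


prop_fair = alpha_fair_rates(1.0)       # proportional fairness: (1/3, 2/3, 2/3)
min_delay = alpha_fair_rates(2.0)       # minimum potential delay, unit weights
near_max_min = alpha_fair_rates(200.0)  # large alpha approaches (1/2, 1/2, 1/2)
```

As α grows, the long flow's share rises from 1/3 towards the max-min value of 1/2, and the proportionally fair point passes the test above: for example, moving to (1/2, 1/2, 1/2) gives a sum of 0.5 - 0.25 - 0.25 = 0.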

Solving the network problem

• Results so far: existence - solution exists with given properties
• How to compute solution?
    - Ideally: distributed solution, easily embodied in protocol
    - Should reveal insight into existing protocol


Causescosts of congestion scenario 3

Another ldquocostrdquo of congestion   when packet dropped any ldquoupstream

transmission capacity used for that packet was wasted

Host A

Host B

λout

Two broad approaches towards congestion control

End-end congestion control

  no explicit feedback from network

  congestion inferred from end-system observed loss delay

  approach taken by TCP

Network-assisted congestion control

  routers provide feedback to endhosts   single bit indicating

congestion (SNA DECbit ATM TCPIP ECN)

  explicit rate sender should send

  recent proposals [XCP] [RCP] revisit ATM ideas

TCP congestion control

Components of TCP congestion control

  Slow start  Multiplicatively increase (double) window

  Congestion avoidance  Additively increase (by 1 MSS) window

  Loss  Multiplicatively decrease (halve) window

  Timeout  Set cwnd to 1 MSS  Multiplicatively increase (double) retransmission

timeout upon each further consecutive loss

Retransmission timeout estimation

  Calculate EstimatedRTT using moving average

  Calculate deviation wrt moving average

  Timeout = EstimatedRTT + 4DevRTT

EstimatedRTTi = (1- α)EstimatedRTTi-1 + αSampleRTTi

DevRTTi = (1-β)DevRTTi-1 + β|SampleRTTi-EstimatedRTTi-1|

TCP Throughput

TCP throughput A very very simple model

  Whatrsquos the average throughout of TCP as a function of window size and RTT T  Ignore slow start  Let W be the window size when loss occurs

  When window is W throughput is WT   Just after loss window drops to W2

throughput to W2T   Average throughput 3W4T

TCP throughput A very simple model

  But what is W when loss occurs

    When window is w and queue has q packets TCP is

sending at rate w(T+qC)   For maintaining utilization and steady state

 Just before loss rate = W(T+QC) = C  Just after loss rate = W2T = C   For Q = CT (a common thumbrule to set router buffer

sizes) a loss occurs every frac14 (34W)Q = 3W28 packets

Q = queue capacity in number of packets

C = link capacity in packetssec

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

sum=

+=++⎟⎠

⎞⎜⎝

⎛ ++2

0)

2(1

22

W

nnWWWW

sum=

+⎟⎠

⎞⎜⎝

⎛ +=2

021

2

W

nnWW

2)12(2

21

2+

+⎟⎠

⎞⎜⎝

⎛ +=WWWW

WW43

83 2 +=

packets sent per ldquoperiodrdquo =

2

83Wasymp

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

packets sent per ldquoperiodrdquo 2

83Wasymp

1 packet lost per ldquoperiodrdquo implies ploss 23

8W

asymp or lossp

W38

=

rttpackets

43utavg_thrup WB ==

rttpackets221utavg_thrup

losspB ==

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]

  With many flows loss rate and delay are not affected much by a single TCP flow  TCP behavior completely specified by loss

and delay pattern along path (bounded by bottleneck capacity)

  Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account

What is PFTK modeling

  Independent loss probability p across rounds  Loss acute triple duplicate acks  Bursty loss in a round if some packet lost

all following packets in that round also lost   Timeout if lt three duplicate acks received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP

  Evolution of loss-based TCP  Tahoe (without fast retransmit)  Reno (triple duplicate acks + fast

retransmit)  NewReno (Reno + handling multiple losses

better)  SACK (selective acknowledgment) common

today   Q what if loss not due to congestion

Delay-based TCP Vegas

  Uses delay as a signal of congestion  Idea try to keep a small constant number of

packets at bottleneck queue  Expected = WBaseRTT  Actual = WCurRTT  Diff = Expected - Actual  Try to keep Diff between fixed 1 and 3

  More recent FAST TCP based on Vegas  Delay-based TCP not widely used today

TCP-Friendliness

  Can we try MyFavNew TCP  Well is it TCP-friendly

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP

  To co-exist with TCP it must impose the same long-term load on the network  No greater long-term throughput as a function of

packet loss and delay so TCP doesnt suffer  Not significantly less long-term throughput or its

not too useful

TCP friendly rate control (TFRC)

Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm

 If transmission rate is higher than that given by the model reduce the transmission rate to the models rate

 Otherwise increase the transmission rate  Eg DCCP (Datagram Congestion Control

Protocol) for unreliable congestion control  Q how to measureuse loss rate and RTT

High speed TCP

TCP in high speed networks

  Example 1500 byte segments 100ms RTT want 10 Gbps throughput

  Requires window size W = 83333 in-flight segments   Throughput in terms of loss rate

  13 p = 210-10 or equivalently at most one drop every couple hours

  New versions of TCP for high-speed networks needed

TCPrsquos long recovery delay

  More than an hour to recover from a loss or timeout

~41000 packets

~60000 RTTs ~100 minutes

High-speed TCP

  Proposals  Scalable TCP HSTCP FAST CUBIC  General idea is to use superlinear window

increase  Particularly useful in high bandwidth-delay

product regimes

Alternate choices of response functions

Scalable TCP - S = 015p

Q Whatever happened to TCP-friendly

High speed TCP [Floyd]

  additive increase multiplicative decrease

  increments decrements depend on window size

Scalable TCP (STCP) [T Kelly]

  multiplicative increase multiplicative decrease

W larr W + a per ACK W larr W ndash b W per window with loss

STCP dynamics

From 1st PFLDnet Workshop Tom Kelly13

Active Queue Management

Router Queue Management

  normally packets dropped only when queue overflows   ldquodrop-tailrdquo queueing

router Internet

P113P213P313P413P513P613FCFS13

Scheduler13

router

The case against drop-tail queue management

  Large queues in routers are ldquoa bad thingrdquo  Delay end-to-end latency dominated by length

of queues at switches in network   Allowing queues to overflow is ldquoa bad thingrdquo

 Fairness connections transmitting at high rates can starve connections transmitting at low rates

 Utilization connections can synchronize their response to congestion

P113P213P313P413FCFS

Scheduler P513P613

Idea early random packet drop

When queue length exceeds threshold drop packets with queue length dependent probability  probabilistic packet drop flows see same loss

rate  problem bursty traffic (burst arrives when

queue is near threshold) can be over penalized

P113P213P313P413P513P613FCFS

Scheduler

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop  avoid overly penalizing short-term bursts   react to longer term trends

  Tie drop prob to weighted avg queue length  avoids over-reaction to mild overload conditions

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

Random early detection (RED) packet drop

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

10013

Drop probability

maxp13

Weighted AverageQueue Length

min13 max13

RED summary why random drop

  Provide gentle transition from no-drop to all-drop  Provide ldquogentlerdquo early warning  Avoid synchronized loss bursts among

sources   Provide same loss rate to all sessions

 With tail-drop low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed hellip

Why randomization important

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source Floyd Jacobson 1994

Router update operation

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive at dest)

start_timer (uniform Tp +- Tr)

timeout or link fail

update

time spent in state depends on msgs

received from others (weak coupling

between routers processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization   Choose random

timer component Tr large (eg several multiples of TC)

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough

randomization to avoid

synchronization

Randomization

  Takeaway message  randomization makes a system simple and

robust

Background transport TCP Nice

What are background transfers

  Data that humans are not waiting for   Non-deadline-critical   Unlimited demand

  Examples  Prefetched traffic on the Web  File system backup  Large-scale data distribution services  Background software updates  Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers  Self-interference

bull  applications hurt their own performance  Cross-interference

bull  applications hurt other applicationsrsquo performance

TCP Nice

  Goal abstraction of free infinite bandwidth   Applications say what they want

 OS manages resources and scheduling

  Self tuning transport layer  Reduces risk of interference with foreground

traffic  Significant utilization of spare capacity by

background traffic  Simplifies application design

Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal  Congestion incr queue lengths incr RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control  Receiver and network unchanged  TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

Nice the works

  Non-interference getting out of the way in time   Utilization maintaining a small queue

pkts

minRTT = τ13 maxRTT = τ+Βmicro13

B

tB Add Mul +

micro

Reno

Nice Add Add Add

Mul +

Mul +

Network Conditions

01

1

10

100

1e3

1 10 100 Fore

grou

nd D

ocum

ent L

aten

cy (s

ec)

Spare Capacity

Reno

Vegas

V0

Nice

Router Prio

  Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

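Model

  Network: links l, each of capacity $c_l$
  Sources s: $(L(s), U_s(x_s))$
    $L(s)$: links used by source s
    $U_s(x_s)$: utility if source rate = $x_s$

[Figure: two links with capacities $c_1$ and $c_2$; source 1 (rate $x_1$) crosses both links, source 2 (rate $x_2$) uses only the first link, source 3 (rate $x_3$) only the second]

  $x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$

[Figure: $U_s(x_s)$ vs. $x_s$, an example utility function for an elastic application]

  Q: What are possible allocations with, say, unit capacity links?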

Optimization Problem

  $$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$   ("system" problem)

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
  can we view congestion control as distributed asynchronous algorithms to solve this problem?

The user view

  User can choose amount to pay per unit time, $w_s$
  Would like allocated bandwidth $x_s$ in proportion to $w_s$:
    $$x_s = \frac{w_s}{p_s}$$
  $p_s$ could be viewed as charge per unit flow for user s

  $$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$   (user problem)

    first term: user's utility
    second term: cost

The network view

  Suppose network knows vector $\{w_s\}$ chosen by users
  Network wants to maximize logarithmic utility function

  $$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$   (network problem)

Solution existence

  There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that:
    $\{w_s\}$ solves the user problem
    $\{x_s\}$ solves the network problem
    $\{x_s\}$ is the unique solution to the system problem

  User problem: $\max \; U_s(w_s / p_s) - w_s$ subject to $w_s \ge 0$
  Network problem: $\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
  System problem: $\max_{x_s \ge 0} \sum_{s} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$

Proportional Fairness

  Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$:

  $$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

  Result: if $w_r = 1$, then $\{x_s\}$ solves the network problem if and only if it is proportionally fair

  A similar result exists for the case that $w_r \ne 1$

Max-min Fairness

  Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$

Minimum potential delay fairness

  Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

  Interpretation: if $w_r$ is the file size, then $w_r / x_r$ is the transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

  Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$

  What is the corresponding utility function?

  $$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
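
A max-min fair allocation can be computed by progressive filling, a standard technique that the slides do not name. The sketch below assumes the three-source, two-link topology from the model slide; the function name `max_min_fair`, the dictionary input format, and the link and source labels are hypothetical, and every source is assumed to cross at least one link.

```python
def max_min_fair(routes, capacity):
    """Progressive filling: grow all unfrozen rates equally until a link saturates.

    routes:   dict source -> set of links it uses (each source needs >= 1 link)
    capacity: dict link -> capacity
    """
    rate = {s: 0.0 for s in routes}
    residual = dict(capacity)
    active = set(routes)
    while active:
        # Number of still-growing sources on each link.
        load = {l: sum(1 for s in active if l in routes[s]) for l in residual}
        # Largest equal increment that no loaded link would exceed.
        step = min(residual[l] / n for l, n in load.items() if n > 0)
        for s in active:
            rate[s] += step
        for l, n in load.items():
            residual[l] -= step * n
        # Freeze every source that crosses a link that just saturated.
        saturated = {l for l, n in load.items() if n > 0 and residual[l] <= 1e-12}
        active = {s for s in active if not (routes[s] & saturated)}
    return rate


routes = {"s1": {"l1", "l2"}, "s2": {"l1"}, "s3": {"l2"}}
print(max_min_fair(routes, {"l1": 1.0, "l2": 1.0}))
```

With unit capacities, all three sources end up with the same rate because both links saturate at the same time; giving one link more capacity lets only the source confined to that link keep growing.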

Solving the network problem

  Results so far: existence (a solution exists with the given properties)
  How to compute the solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

  $$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  where

  $$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$

    left-hand side: change in bandwidth allocation at s
    $k\, w_s$: linear increase
    $k\, x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease
    $p_l(t)$: congestion "signal", a function of the aggregate rate at link l, fed back to s

Solving the network problem

  $$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  Results
    converges to solution of "relaxation" of network problem
    $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$

  Interpretation: TCP-like algorithm that iteratively solves the optimal rate allocation
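
The convergence claim can be checked numerically with a simple Euler integration of the rate equation. This is a minimal sketch under stated assumptions: the price function $g_l(y) = (y / c_l)^\beta$, the gain `k`, the step size `dt`, the starting rates, and the three-source, two-link topology are all choices made for illustration and are not given in the slides.

```python
def link_prices(x, routes, capacity, beta):
    """p_l = g_l(aggregate rate on link l), with the assumed g_l(y) = (y / c_l) ** beta."""
    y = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
    return {l: (y[l] / capacity[l]) ** beta for l in capacity}


def simulate_primal(routes, capacity, w, k=0.1, dt=0.1, steps=20000, beta=4):
    """Euler steps of dx_s/dt = k * (w_s - x_s * sum of p_l over the links of s)."""
    x = {s: 0.1 for s in routes}  # arbitrary small starting rates
    for _ in range(steps):
        p = link_prices(x, routes, capacity, beta)
        x = {
            s: x[s] + dt * k * (w[s] - x[s] * sum(p[l] for l in routes[s]))
            for s in routes
        }
    return x


routes = {"s1": {"l1", "l2"}, "s2": {"l1"}, "s3": {"l2"}}
capacity = {"l1": 1.0, "l2": 1.0}
w = {"s1": 1.0, "s2": 1.0, "s3": 1.0}
x = simulate_primal(routes, capacity, w)
p = link_prices(x, routes, capacity, beta=4)
for s in sorted(routes):
    print(s, round(x[s], 4), round(x[s] * sum(p[l] for l in routes[s]), 4))
```

The last printed column is $x_s \sum_l p_l$, which should settle at $w_s$. Because the prices only penalize load softly, the resulting rates solve the relaxed problem rather than the hard-constrained network problem.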

Source Algorithm

  $$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$

  Source needs only its path price $q_r$
  $k_r(\cdot)$: nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can the network choose a pricing scheme to achieve fair resource allocation?

  Suppose network charges price $q_r$ ($/bit), where $q_r = \sum_l p_l$

  User's strategy: spend $w_r$ ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose
    $$x = \frac{\sqrt{2(1-p)}}{T \sqrt{p}} \approx \frac{\sqrt{2}}{T \sqrt{p}}$$

  then
    $$U(x) = -\frac{1}{T x}$$

  interpretation: minimize (weighted) delay

Is AIMD special?

  Consider a window control as follows:
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (n, m)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

  Congestion Control: maximize user utility
    Given routing $R_{li}$, how to adapt end rate $x_i$?

  Traffic Engineering: minimize network congestion
    Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

  $$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l, \quad \text{var. } x$$

    objective: aggregate utility, with source rate $x_i$ and utility $U_i(x_i)$
    constraints: capacity constraints
    Users are indexed by i

  Congestion control provides fair rate allocation amongst users

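Traffic Engineering Model

  $$\min \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l, \quad \text{var. } R$$

    objective: aggregate cost, with link utilization $u_l$ and cost $f(u_l)$
    Links are indexed by l

[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]

  Traffic engineering avoids bottlenecks in the network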

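Model of Internet Reality

[Figure: feedback loop in which congestion control passes rates $x_i$ to traffic engineering, and traffic engineering passes routing $R_{li}$ back to congestion control]

  Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$

  Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$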

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark:
    $$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad R x \le c, \quad \text{var. } x, R$$
  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Two broad approaches towards congestion control

End-end congestion control
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP

Network-assisted congestion control
  routers provide feedback to end hosts
    single bit indicating congestion (SNA, DECbit, ATM, TCP/IP ECN)
    explicit rate sender should send
  recent proposals [XCP] [RCP] revisit ATM ideas

TCP congestion control

Components of TCP congestion control

  Slow start
    Multiplicatively increase (double) window
  Congestion avoidance
    Additively increase (by 1 MSS) window
  Loss
    Multiplicatively decrease (halve) window
  Timeout
    Set cwnd to 1 MSS
    Multiplicatively increase (double) retransmission timeout upon each further consecutive loss
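
The four components can be put together in a toy per-RTT window update. This is a minimal sketch, not a faithful TCP implementation: the function name `next_cwnd`, the event labels, and the `ssthresh` bookkeeping (a common convention that the slide does not spell out) are assumptions.

```python
def next_cwnd(cwnd, ssthresh, event):
    """Return (cwnd, ssthresh) in MSS units after one RTT that ended in `event`.

    event is "ack" (a loss-free round), "loss" (triple duplicate ACK),
    or "timeout". The ssthresh handling is an assumed convention.
    """
    if event == "ack":
        if cwnd < ssthresh:
            return min(cwnd * 2, ssthresh), ssthresh  # slow start: double per RTT
        return cwnd + 1, ssthresh                     # congestion avoidance: +1 MSS per RTT
    if event == "loss":
        half = max(cwnd / 2, 1)
        return half, half                             # multiplicative decrease: halve
    if event == "timeout":
        return 1, max(cwnd / 2, 1)                    # restart from 1 MSS
    raise ValueError(f"unknown event: {event}")


cwnd, ssthresh = 1, 8
for event in ["ack", "ack", "ack", "ack", "loss", "ack", "timeout"]:
    cwnd, ssthresh = next_cwnd(cwnd, ssthresh, event)
    print(event, cwnd, ssthresh)
```

The doubling of the retransmission timeout on consecutive losses is left out here because it concerns the timer rather than the window.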

Retransmission timeout estimation

  Calculate EstimatedRTT using moving average
    $$\text{EstimatedRTT}_i = (1 - \alpha)\,\text{EstimatedRTT}_{i-1} + \alpha\,\text{SampleRTT}_i$$

  Calculate deviation w.r.t. moving average
    $$\text{DevRTT}_i = (1 - \beta)\,\text{DevRTT}_{i-1} + \beta\,|\text{SampleRTT}_i - \text{EstimatedRTT}_{i-1}|$$

  Timeout = EstimatedRTT + 4 * DevRTT
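
The two moving averages and the timeout rule can be sketched as a small estimator. The class name `RttEstimator`, the defaults `alpha=0.125` and `beta=0.25`, and the initial deviation of half the first sample are assumptions; the slides do not give numeric values.

```python
class RttEstimator:
    """Moving-average RTT estimator following the formulas above (defaults are assumed)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2  # assumed initial deviation

    def update(self, sample_rtt):
        """Fold in one SampleRTT and return the new timeout."""
        # DevRTT uses the previous EstimatedRTT, so it is updated first.
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * abs(
            sample_rtt - self.estimated_rtt
        )
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        return self.timeout()

    def timeout(self):
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=0.100)
for sample in [0.110, 0.095, 0.250, 0.105]:
    print(round(est.update(sample), 4))
```

A single delayed sample raises the timeout mainly through the deviation term, which is why the factor of 4 on DevRTT gives a safety margin against spurious retransmissions.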

TCP Throughput

TCP throughput: A very, very simple model

  What's the average throughput of TCP as a function of window size and RTT, T?
    Ignore slow start
    Let W be the window size when loss occurs

  When window is W, throughput is W/T
  Just after loss, window drops to W/2, throughput to W/(2T)
  Average throughput: 3W/(4T)

TCP throughput: A very simple model

  But what is W when loss occurs?

  When window is w and queue has q packets, TCP is sending at rate w/(T + q/C)
  For maintaining utilization and steady state:
    Just before loss, rate = W/(T + Q/C) = C
    Just after loss, rate = W/(2T) = C
  For Q = CT (a common rule of thumb to set router buffer sizes), W = 2CT = 2Q, so each cycle lasts Q = W/2 round trips at an average window of 3W/4, and a loss occurs every (3/4 W) * Q = 3W²/8 packets

  Q = queue capacity in number of packets
  C = link capacity in packets/sec

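Deriving TCP throughput/loss relationship

[Figure: TCP window size vs. time (in RTTs), a sawtooth oscillating between W/2 and W; one sawtooth is a "period"]

  packets sent per "period" =

  $$\frac{W}{2} + \left(\frac{W}{2} + 1\right) + \dots + W = \sum_{n=0}^{W/2} \left(\frac{W}{2} + n\right)$$

  $$= \left(\frac{W}{2} + 1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$$

  $$= \left(\frac{W}{2} + 1\right)\frac{W}{2} + \frac{(W/2)(W/2 + 1)}{2}$$

  $$= \frac{3}{8} W^2 + \frac{3}{4} W \approx \frac{3}{8} W^2$$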

Deriving TCP throughput/loss relationship (continued)

  packets sent per "period" $\approx \frac{3}{8} W^2$

  1 packet lost per "period" implies $p_{loss} \approx \frac{8}{3 W^2}$, or $W = \sqrt{\frac{8}{3\, p_{loss}}}$

  $$B = \text{avg\_thruput} = \frac{3}{4} W \ \text{packets/rtt}$$

  $$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}} \ \text{packets/rtt}$$
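
The derivation can be verified numerically. The short sketch below (function names are hypothetical) counts the packets in one sawtooth period exactly, compares the count with the closed form, and evaluates the resulting throughput for a sample loss rate.

```python
import math


def packets_per_period(w):
    """Exact packet count for one sawtooth from W/2 up to W (W assumed even)."""
    half = w // 2
    return sum(half + n for n in range(half + 1))


def window_at_loss(p_loss):
    """W = sqrt(8 / (3 p)), from one loss per roughly 3 W^2 / 8 packets."""
    return math.sqrt(8.0 / (3.0 * p_loss))


def avg_throughput_pkts_per_rtt(p_loss):
    """B = 3W/4, which works out to about 1.22 / sqrt(p)."""
    return 0.75 * window_at_loss(p_loss)


w = 16
print(packets_per_period(w), 3 * w * w / 8 + 3 * w / 4)  # exact sum vs. closed form
print(avg_throughput_pkts_per_rtt(0.01), 1.22 / math.sqrt(0.01))
```

The constant 1.22 is simply $\frac{3}{4}\sqrt{8/3}$ rounded to two decimals.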

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput: A better loss-rate-based "simple" model [PFTK]

  With many flows, loss rate and delay are not affected much by a single TCP flow
    TCP behavior completely specified by loss and delay pattern along path (bounded by bottleneck capacity)

  Given loss rate p and delay T, what is TCP's throughput B (packets/sec), taking timeouts into account?

What is PFTK modeling?

  Independent loss probability p across rounds
    Loss indicated by triple duplicate acks
    Bursty loss: within a round, if some packet is lost, all following packets in that round are also lost
  Timeout if fewer than three duplicate acks are received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP

  Evolution of loss-based TCP
    Tahoe (without fast recovery)
    Reno (triple duplicate acks + fast retransmit and fast recovery)
    NewReno (Reno + handling multiple losses better)
    SACK (selective acknowledgment), common today
  Q: what if loss is not due to congestion?

Delay-based TCP: Vegas

  Uses delay as a signal of congestion
    Idea: try to keep a small constant number of packets at the bottleneck queue
    Expected = W/BaseRTT
    Actual = W/CurRTT
    Diff = Expected - Actual
    Try to keep Diff between fixed thresholds (1 and 3)

  More recent: FAST TCP, based on Vegas
  Delay-based TCP not widely used today
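
The Vegas rule can be sketched as a once-per-RTT window adjustment. This is a minimal sketch with assumptions: the function name `vegas_update` and the step of one packet are illustrative, and Diff is multiplied by BaseRTT so that the thresholds 1 and 3 are in packets queued at the bottleneck, a unit the slide leaves implicit.

```python
def vegas_update(cwnd, base_rtt, cur_rtt, low=1.0, high=3.0):
    """Return the new window after one RTT of Vegas-style delay feedback."""
    expected = cwnd / base_rtt   # rate if nothing were queued
    actual = cwnd / cur_rtt      # rate actually achieved
    # Assumed scaling: (Expected - Actual) * BaseRTT estimates packets sitting in the queue.
    backlog = (expected - actual) * base_rtt
    if backlog < low:
        return cwnd + 1          # too little queued: probe for more bandwidth
    if backlog > high:
        return cwnd - 1          # too much queued: back off before loss occurs
    return cwnd                  # within the target band: hold


print(vegas_update(10, base_rtt=0.100, cur_rtt=0.100))
print(vegas_update(10, base_rtt=0.100, cur_rtt=0.125))
print(vegas_update(10, base_rtt=0.100, cur_rtt=0.200))
```

Unlike the loss-based variants, the window here moves in both directions without any packet being dropped.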

TCP-Friendliness

  Can we try MyFavNew TCP?
    Well, is it TCP-friendly?

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP

  To coexist with TCP, it must impose the same long-term load on the network
    No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
    Not significantly less long-term throughput, or it's not too useful

TCP friendly rate control (TFRC)

  Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
    If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
    Otherwise, increase the transmission rate
  E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
  Q: how to measure/use loss rate and RTT?
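
The equation-based idea can be sketched with the simple throughput model derived earlier (1.22/sqrt(p) packets per RTT). This is an illustration only: the real TFRC specification uses a more detailed throughput equation that accounts for timeouts, and the function names, the one-packet-per-RTT increase, and the cap at the model rate are assumptions.

```python
import math


def tcp_model_rate(p_loss, rtt):
    """Simple-model TCP throughput in packets/sec: 1.22 / (RTT * sqrt(p))."""
    return 1.22 / (rtt * math.sqrt(p_loss))


def tfrc_adjust(current_rate, p_loss, rtt, increase_pkts=1.0):
    """Move the sending rate (packets/sec) toward the TCP-friendly rate."""
    target = tcp_model_rate(p_loss, rtt)
    if current_rate > target:
        return target  # above the model: drop to the model's rate
    # Otherwise increase; the size of the step is an assumption of this sketch.
    return min(current_rate + increase_pkts / rtt, target)


print(tfrc_adjust(1000.0, p_loss=0.01, rtt=0.1))
print(tfrc_adjust(50.0, p_loss=0.01, rtt=0.1))
```

The sender needs measured values for `p_loss` and `rtt`, which is exactly the open question the slide ends on.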

High speed TCP

TCP in high speed networks

  Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
  Requires window size W = 83,333 in-flight segments
  Throughput in terms of loss rate:
    ⇒ p = 2·10^-10, or equivalently at most one drop every couple of hours
  New versions of TCP for high-speed networks needed
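
The numbers in this example can be reproduced with a few lines of arithmetic. The sketch assumes the simple model from earlier in the lecture (throughput of 1.22/sqrt(p) segments per RTT); the function names are hypothetical.

```python
def required_window(throughput_bps, rtt_s, segment_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(window_segments):
    """Invert W = 1.22 / sqrt(p) to get the loss rate that allows this window."""
    return (1.22 / window_segments) ** 2


w = required_window(10e9, 0.100, 1500)
p = required_loss_rate(w)
seconds_between_drops = (1.0 / p) / (w / 0.100)  # packets per drop / packets per second
print(round(w), p, seconds_between_drops / 3600.0)
```

The last value is the average time between drops in hours at the target rate, which is what "one drop every couple of hours" refers to.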

TCP's long recovery delay

  More than an hour to recover from a loss or timeout

[Figure: window recovery after a loss: ~41,000 packets, ~60,000 RTTs, ~100 minutes]

High-speed TCP

  Proposals
    Scalable TCP, HSTCP, FAST, CUBIC
    General idea is to use superlinear window increase
    Particularly useful in high bandwidth-delay product regimes

Alternate choices of response functions

  Scalable TCP: S = 0.15/p

  Q: Whatever happened to TCP-friendly?

High speed TCP [Floyd]

  additive increase, multiplicative decrease
  increments, decrements depend on window size

Scalable TCP (STCP) [T. Kelly]

  multiplicative increase, multiplicative decrease
    W ← W + a per ACK
    W ← W - b * W per window with loss

STCP dynamics

From 1st PFLDnet Workshop, Tom Kelly

Active Queue Management

Router Queue Management

  normally, packets dropped only when queue overflows
    "drop-tail" queueing

[Figure: router output queue holding packets P1 to P6, served by an FCFS scheduler, on the path between a router and the Internet]

The case against drop-tail queue management

  Large queues in routers are "a bad thing"
    Delay: end-to-end latency dominated by length of queues at switches in network
  Allowing queues to overflow is "a bad thing"
    Fairness: connections transmitting at high rates can starve connections transmitting at low rates
    Utilization: connections can synchronize their response to congestion

[Figure: FCFS scheduler queue holding P1 to P4, with P5 and P6 arriving]

Idea: early random packet drop

  When queue length exceeds threshold, drop packets with queue-length-dependent probability
    probabilistic packet drop: flows see same loss rate
    problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

[Figure: FCFS scheduler queue holding packets P1 to P6]

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop
    avoid overly penalizing short-term bursts
    react to longer-term trends
  Tie drop probability to weighted average queue length
    avoids over-reaction to mild overload conditions

[Figure: average queue length vs. time against the min and max thresholds and the max queue length: no drop below the min threshold, probabilistic early drop between the thresholds, forced drop above the max threshold]

Random early detection (RED) packet drop

[Figure: average queue length vs. time against the min and max thresholds: no drop, probabilistic early drop, forced drop]

[Figure: drop probability vs. weighted average queue length: zero below min, rising to max_p at max, then 100% above max]
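
The drop-probability curve and the exponential average can be sketched as follows. This is a simplified illustration, not the full RED algorithm: the linear ramp between the thresholds, the averaging weight `0.002`, and all function and parameter names are assumptions, and refinements such as spacing out consecutive drops are left out.

```python
import random


def update_avg(avg, qlen, weight=0.002):
    """Exponentially weighted average of the instantaneous queue length."""
    return (1 - weight) * avg + weight * qlen


def red_drop_probability(avg, min_th, max_th, max_p):
    """Zero below min_th, a linear ramp up to max_p, forced drop at or above max_th."""
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)


def should_drop(avg, min_th, max_th, max_p, rng=random):
    """Randomized drop decision for one arriving packet."""
    return rng.random() < red_drop_probability(avg, min_th, max_th, max_p)


avg = 0.0
for qlen in [20] * 500:  # a sustained backlog slowly pulls the average up
    avg = update_avg(avg, qlen)
print(round(avg, 2), red_drop_probability(avg, min_th=5, max_th=15, max_p=0.1))
```

Because the decision depends on the slowly moving average rather than the instantaneous queue length, a short burst barely changes the drop probability, while a sustained backlog does.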

RED summary: why random drop?

  Provide gentle transition from no-drop to all-drop
    Provide "gentle" early warning
  Avoid synchronized loss bursts among sources
  Provide same loss rate to all sessions
    With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters: nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed …

Why is randomization important?

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

  source: Floyd, Jacobson 1994

Router update operation

[State diagram: periodic routing update at a router]
  prepare own routing update (time $T_C$)
  receive update from neighbor, process (time $T_{C2}$)
  wait; receive update from neighbor, process
  <ready>: send update (time $T_d$ to arrive at dest), then start_timer (uniform: $T_p \pm T_r$)
  timeout, or link fail update
  time spent in state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis: time until routing update sent, relative to start of round

  By t = 100,000, all router rounds are of length 120

  synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

  Choose random timer component $T_r$ large (e.g., several multiples of $T_C$)
  Add enough randomization to avoid synchronization

[State diagram, as on the "Router update operation" slide: prepare own routing update (time $T_C$); receive update from neighbor, process (time $T_{C2}$); wait; <ready>: send update (time $T_d$ to arrive), start_timer (uniform: $T_p \pm T_r$)]
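
The randomized timer in the state diagram amounts to drawing each update interval uniformly from $[T_p - T_r, T_p + T_r]$. A minimal sketch, with a hypothetical function name and example values for the period and jitter:

```python
import random


def next_update_delay(tp, tr, rng=random):
    """Delay until the next routing update: uniform in [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)


rng = random.Random(42)  # seeded only to make the illustration repeatable
delays = [next_update_delay(tp=120.0, tr=10.0, rng=rng) for _ in range(5)]
print([round(d, 1) for d in delays])
```

With `tr` set to zero every router uses exactly the same period, which is the situation in which the weak coupling between routers can pull their updates into lockstep.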

Randomization

  Takeaway message:
    randomization makes a system simple and robust

Background transport TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand

  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers
    Self-interference
      • applications hurt their own performance
    Cross-interference
      • applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
    Applications say what they want
    OS manages resources and scheduling

  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?

  TCP does network resource management
    Need flow prioritization

  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy

  Question:
    Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal
    Congestion ⇒ increased queue lengths ⇒ increased RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly


TCP congestion control

Components of TCP congestion control

  Slow start  Multiplicatively increase (double) window

  Congestion avoidance  Additively increase (by 1 MSS) window

  Loss  Multiplicatively decrease (halve) window

  Timeout  Set cwnd to 1 MSS  Multiplicatively increase (double) retransmission

timeout upon each further consecutive loss

Retransmission timeout estimation

  Calculate EstimatedRTT using moving average

  Calculate deviation wrt moving average

  Timeout = EstimatedRTT + 4DevRTT

EstimatedRTTi = (1- α)EstimatedRTTi-1 + αSampleRTTi

DevRTTi = (1-β)DevRTTi-1 + β|SampleRTTi-EstimatedRTTi-1|

TCP Throughput

TCP throughput A very very simple model

  Whatrsquos the average throughout of TCP as a function of window size and RTT T  Ignore slow start  Let W be the window size when loss occurs

  When window is W throughput is WT   Just after loss window drops to W2

throughput to W2T   Average throughput 3W4T

TCP throughput A very simple model

  But what is W when loss occurs

    When window is w and queue has q packets TCP is

sending at rate w(T+qC)   For maintaining utilization and steady state

 Just before loss rate = W(T+QC) = C  Just after loss rate = W2T = C   For Q = CT (a common thumbrule to set router buffer

sizes) a loss occurs every frac14 (34W)Q = 3W28 packets

Q = queue capacity in number of packets

C = link capacity in packetssec

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

sum=

+=++⎟⎠

⎞⎜⎝

⎛ ++2

0)

2(1

22

W

nnWWWW

sum=

+⎟⎠

⎞⎜⎝

⎛ +=2

021

2

W

nnWW

2)12(2

21

2+

+⎟⎠

⎞⎜⎝

⎛ +=WWWW

WW43

83 2 +=

packets sent per ldquoperiodrdquo =

2

83Wasymp

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

packets sent per ldquoperiodrdquo 2

83Wasymp

1 packet lost per ldquoperiodrdquo implies ploss 23

8W

asymp or lossp

W38

=

rttpackets

43utavg_thrup WB ==

rttpackets221utavg_thrup

losspB ==

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]

  With many flows loss rate and delay are not affected much by a single TCP flow  TCP behavior completely specified by loss

and delay pattern along path (bounded by bottleneck capacity)

  Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account

What is PFTK modeling

  Independent loss probability p across rounds  Loss acute triple duplicate acks  Bursty loss in a round if some packet lost

all following packets in that round also lost   Timeout if lt three duplicate acks received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP

  Evolution of loss-based TCP  Tahoe (without fast retransmit)  Reno (triple duplicate acks + fast

retransmit)  NewReno (Reno + handling multiple losses

better)  SACK (selective acknowledgment) common

today   Q what if loss not due to congestion

Delay-based TCP Vegas

  Uses delay as a signal of congestion  Idea try to keep a small constant number of

packets at bottleneck queue  Expected = WBaseRTT  Actual = WCurRTT  Diff = Expected - Actual  Try to keep Diff between fixed 1 and 3

  More recent FAST TCP based on Vegas  Delay-based TCP not widely used today

TCP-Friendliness

  Can we try MyFavNew TCP  Well is it TCP-friendly

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP

  To co-exist with TCP it must impose the same long-term load on the network  No greater long-term throughput as a function of

packet loss and delay so TCP doesnt suffer  Not significantly less long-term throughput or its

not too useful

TCP friendly rate control (TFRC)

Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm

 If transmission rate is higher than that given by the model reduce the transmission rate to the models rate

 Otherwise increase the transmission rate  Eg DCCP (Datagram Congestion Control

Protocol) for unreliable congestion control  Q how to measureuse loss rate and RTT

High speed TCP

TCP in high speed networks

  Example 1500 byte segments 100ms RTT want 10 Gbps throughput

  Requires window size W = 83333 in-flight segments   Throughput in terms of loss rate

  13 p = 210-10 or equivalently at most one drop every couple hours

  New versions of TCP for high-speed networks needed

TCPrsquos long recovery delay

  More than an hour to recover from a loss or timeout

~41000 packets

~60000 RTTs ~100 minutes

High-speed TCP

  Proposals  Scalable TCP HSTCP FAST CUBIC  General idea is to use superlinear window

increase  Particularly useful in high bandwidth-delay

product regimes

Alternate choices of response functions

Scalable TCP - S = 015p

Q Whatever happened to TCP-friendly

High speed TCP [Floyd]

  additive increase multiplicative decrease

  increments decrements depend on window size

Scalable TCP (STCP) [T Kelly]

  multiplicative increase multiplicative decrease

W larr W + a per ACK W larr W ndash b W per window with loss

STCP dynamics

From 1st PFLDnet Workshop Tom Kelly13

Active Queue Management

Router Queue Management

  normally packets dropped only when queue overflows   ldquodrop-tailrdquo queueing

router Internet

P113P213P313P413P513P613FCFS13

Scheduler13

router

The case against drop-tail queue management

  Large queues in routers are ldquoa bad thingrdquo  Delay end-to-end latency dominated by length

of queues at switches in network   Allowing queues to overflow is ldquoa bad thingrdquo

 Fairness connections transmitting at high rates can starve connections transmitting at low rates

 Utilization connections can synchronize their response to congestion

P113P213P313P413FCFS

Scheduler P513P613

Idea early random packet drop

When queue length exceeds threshold drop packets with queue length dependent probability  probabilistic packet drop flows see same loss

rate  problem bursty traffic (burst arrives when

queue is near threshold) can be over penalized

P113P213P313P413P513P613FCFS

Scheduler

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop  avoid overly penalizing short-term bursts   react to longer term trends

  Tie drop prob to weighted avg queue length  avoids over-reaction to mild overload conditions

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

Random early detection (RED) packet drop

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

10013

Drop probability

maxp13

Weighted AverageQueue Length

min13 max13

RED summary why random drop

  Provide gentle transition from no-drop to all-drop  Provide ldquogentlerdquo early warning  Avoid synchronized loss bursts among

sources   Provide same loss rate to all sessions

 With tail-drop low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed hellip

Why randomization important

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source Floyd Jacobson 1994

Router update operation

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive at dest)

start_timer (uniform Tp +- Tr)

timeout or link fail

update

time spent in state depends on msgs

received from others (weak coupling

between routers processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization
  Choose random timer component Tr large (e.g. several multiples of TC)
  Add enough randomization to avoid synchronization
[State diagram as in "Router update operation": prepare own routing update (time TC); receive update from neighbor: process (time TC2); wait; <ready>: send update (time Td to arrive), start_timer (uniform Tp +- Tr)]

Randomization
  Takeaway message: randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?
  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties
  Utilization of spare network capacity
  No interference with regular transfers
    Self-interference: applications hurt their own performance
    Cross-interference: applications hurt other applications' performance

TCP Nice
  Goal: abstraction of free infinite bandwidth
  Applications say what they want
    OS manages resources and scheduling
  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?
  TCP does network resource management
    Need flow prioritization
  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy
  Question
    Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice
  Proactively detects congestion
  Uses increasing RTT as congestion signal
    Congestion => increased queue lengths => increased RTT
  Aggressive responsiveness to congestion
  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice
  Basic algorithm
    1. Early detection: threshold queue length, increase in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)
  per-ack operation
    if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
  per-round operation
    if (numCong > f * W) W = W / 2
    else ... AIMD congestion control
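The per-ack and per-round operations can be put together as a small sender-side sketch. It is only an illustration under stated assumptions: the class name `NiceSender`, the default `threshold` and `fraction` values, and the plain +1 additive increase standing in for the elided AIMD branch are all hypothetical.

```python
class NiceSender:
    """Sketch of TCP Nice's early congestion detection (sender side only)."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold  # assumed default; 'threshold' in the slide
        self.fraction = fraction    # assumed default; 'f' in the slide
        self.cwnd = cwnd            # 'W' in the slide, may fall below 1.0
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack operation: count acks whose RTT signals queue build-up."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round operation: halve the window on early congestion."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0
        else:
            # Stand-in for the regular AIMD branch (loss handling omitted).
            self.cwnd += 1.0
        self.num_cong = 0
```

Repeated congested rounds keep halving `cwnd`, so it can drop below one packet per round, which is what step 3 of the basic algorithm permits.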

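Nice: the works
  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue
[Figure: queue length (pkts) over time at a bottleneck with buffer B and service rate μ, comparing Reno and Nice; threshold level tB marked; phases labeled Add and Mul; minRTT = τ, maxRTT = τ + B/μ]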

Network Conditions
[Figure: foreground document latency (sec, log scale) vs. spare capacity, for Reno, Vegas, V0, Nice, and Router Prio]
  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability
[Figure: document latency (sec, log scale) vs. number of background flows, for Vegas, V0, Nice, Router Prio, and Reno]
  W < 1 allows Nice to scale to any number of background flows

Utilization
[Figure: background throughput (KB) vs. number of background flows, for Router Prio, Vegas, V0, Reno, and Nice]
  Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?
  Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
  How to model the interaction between TCP and the network?
    Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g. bandwidth) to optimize some objective function
  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
  Systematic approach towards protocol design

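Model
  Network: links l, each of capacity $c_l$
  Sources s: $(L(s), U_s(x_s))$
    $L(s)$ - links used by source s
    $U_s(x_s)$ - utility if source rate = $x_s$
[Figure: two links with capacities c1 and c2; source rate x1 crosses both links, x2 uses the first link, x3 uses the second]
$$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$$
[Figure: $U_s(x_s)$ vs. $x_s$, example utility function for elastic application]
  Q: What are possible allocations with, say, unit capacity links?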

Optimization Problem
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L \qquad \text{("system" problem)}$$
  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
    can we view congestion control as distributed asynchronous algorithms to solve this problem?

The user view
  User can choose amount to pay per unit time, $w_s$
  Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$
  $p_s$ could be viewed as charge per unit flow for user s
$$\max_{w_s}\ U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0 \qquad \text{(user problem)}$$
  first term: user's utility; second term: cost

The network view
  Suppose network knows vector $\{w_s\}$ chosen by users
  Network wants to maximize logarithmic utility function
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \qquad \text{(network problem)}$$

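Solution existence
  There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
    $\{w_s\}$ solves the user problem
    $\{x_s\}$ solves the network problem
    $\{x_s\}$ is the unique solution to the system problem
  user problem:
$$\max_{w_s}\ U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
  network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
  system problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$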

Proportional Fairness
  Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,
$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$
  Result: if $w_r = 1$, then $\{x_s\}$ solves the network problem iff it is proportionally fair
  Similar result exists for the case that $w_r$ is not equal to 1
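As a worked example, take the two-link, three-source network from the Model slide with unit capacities and all weights equal to 1. The short derivation below is an illustrative sketch (the substitution step assumes both links are fully used at the optimum, which holds because the objective is increasing in every rate):

```latex
\begin{align*}
&\max_{x \ge 0}\ \log x_1 + \log x_2 + \log x_3
  \quad \text{s.t.} \quad x_1 + x_2 \le 1,\ \ x_1 + x_3 \le 1 \\
&\text{with both links full: } x_2 = x_3 = 1 - x_1 \\
&\frac{d}{dx_1}\bigl[\log x_1 + 2\log(1 - x_1)\bigr]
  = \frac{1}{x_1} - \frac{2}{1 - x_1} = 0
  \ \Rightarrow\ x_1 = \tfrac{1}{3},\quad x_2 = x_3 = \tfrac{2}{3}
\end{align*}
```

The two-link flow gets less than the one-link flows because it consumes capacity on both links. For comparison, the max-min fair allocation on the same network gives every source rate 1/2.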

Max-min Fairness
  Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$

Minimum potential delay fairness
  Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
  Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness
  Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
  What is the corresponding utility function?
$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$

Solving the network problem
  Results so far: existence - solution exists with given properties
  How to compute solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem
$$\frac{d}{dt} x_s(t) = k_s \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$
$$\text{where} \quad p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$
  left-hand side: change in bandwidth allocation at s
  $k_s w_s$ term: linear increase
  $k_s x_s(t) \sum_l p_l(t)$ term: multiplicative decrease
  $p_l(t)$: congestion "signal", a function of aggregate rate at link l, fed back to s

Solving the network problem
$$\frac{d}{dt} x_s(t) = k_s \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$
  Results
    converges to solution of relaxation of network problem
    $x_s(t) \sum_l p_l(t)$ converges to $w_s$
  Interpretation: TCP-like algorithm that iteratively solves the optimal rate allocation
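A small numerical sketch can make the convergence claim concrete. It Euler-integrates the rate equation above on a two-link, three-source network. Everything specific here is an assumption for illustration: the price function g_l(y) = y / c_l, the gain, the step size, the initial rates, and the function names `link_prices` and `simulate_primal`.

```python
def link_prices(rates, routes, capacities):
    """Price of each link: assumed g_l(y) = y / c_l of the aggregate rate y."""
    load = [0.0] * len(capacities)
    for rate, links in zip(rates, routes):
        for l in links:
            load[l] += rate
    return [load[l] / capacities[l] for l in range(len(capacities))]


def simulate_primal(routes, weights, capacities, gain=1.0, dt=0.01, steps=20000):
    """Euler integration of dx_s/dt = k (w_s - x_s * sum of prices on s's path)."""
    rates = [0.1] * len(routes)  # arbitrary positive starting point
    for _ in range(steps):
        price = link_prices(rates, routes, capacities)
        rates = [
            rate + dt * gain * (w - rate * sum(price[l] for l in links))
            for rate, w, links in zip(rates, weights, routes)
        ]
    return rates


# Source 0 crosses both links, sources 1 and 2 use one link each.
routes = [[0, 1], [0], [1]]
rates = simulate_primal(routes, weights=[1.0, 1.0, 1.0], capacities=[1.0, 1.0])
```

At the fixed point each source satisfies x_s times its path price equals w_s. With this smooth price function the link loads are not forced below capacity, which is why the result is a solution of a relaxation rather than of the constrained network problem.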

Source Algorithm
$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$
  Source needs only its path price $q_r$
  $k_r(\cdot)$: nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation
  Can network choose pricing scheme to achieve fair resource allocation?
  Suppose network charges price $q_r$ ($/bit), where $q_r = \sum_l p_l$
  User's strategy: spend $w_r$ ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno
  suppose
$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$
  then
$$U(x) = -\frac{1}{Tx}$$
  interpretation: minimize (weighted) delay

Is AIMD special?
  Consider a window control as follows
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m
  Expected change in congestion window
  Expected change in rate per unit time

MIMD (n, m)
  Consider the controller
  where
  Then at equilibrium
  where α = m - n. For stability

Motivation
  Congestion Control: maximize user utility
    Given routing $R_{li}$, how to adapt end rate $x_i$?
  Traffic Engineering: minimize network congestion
    Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model
$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$
  objective: aggregate utility; constraints: capacity constraints
  Users are indexed by i
  [Figure: utility $U_i(x_i)$ vs. source rate $x_i$]
  Congestion control provides fair rate allocation amongst users

Traffic Engineering Model
$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l$$
  objective: aggregate cost
  Links are indexed by l
  [Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]
  Traffic engineering avoids bottlenecks in the network

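Model of Internet Reality
[Diagram: congestion control and traffic engineering coupled in a loop through $x_i$ and $R_{li}$]
  Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
  Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$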

System Properties
  Convergence
  Does it achieve some objective?
  Benchmark: $\max \sum_i U_i(x_i)$ s.t. $Rx \le c$, variables x, R
  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Components of TCP congestion control
  Slow start
    Multiplicatively increase (double) window
  Congestion avoidance
    Additively increase (by 1 MSS) window
  Loss
    Multiplicatively decrease (halve) window
  Timeout
    Set cwnd to 1 MSS
    Multiplicatively increase (double) retransmission timeout upon each further consecutive loss

Retransmission timeout estimation
  Calculate EstimatedRTT using moving average
$$\text{EstimatedRTT}_i = (1-\alpha)\,\text{EstimatedRTT}_{i-1} + \alpha\,\text{SampleRTT}_i$$
  Calculate deviation w.r.t. moving average
$$\text{DevRTT}_i = (1-\beta)\,\text{DevRTT}_{i-1} + \beta\,\lvert \text{SampleRTT}_i - \text{EstimatedRTT}_{i-1} \rvert$$
  Timeout = EstimatedRTT + 4 * DevRTT
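The two moving averages and the timeout rule can be sketched as a small estimator. The class name `RtoEstimator`, the defaults α = 0.125 and β = 0.25, and the choice to seed the estimate with the first sample and a zero deviation are assumptions made for illustration.

```python
class RtoEstimator:
    """Retransmission timeout from smoothed RTT and RTT deviation."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha                 # assumed default gain for EstimatedRTT
        self.beta = beta                   # assumed default gain for DevRTT
        self.estimated_rtt = first_sample  # assumed initialization
        self.dev_rtt = 0.0                 # assumed initialization

    def update(self, sample_rtt):
        """Fold in one RTT sample and return the new timeout."""
        # The deviation uses the previous EstimatedRTT, so update it first.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout()

    def timeout(self):
        """Timeout = EstimatedRTT + 4 * DevRTT."""
        return self.estimated_rtt + 4 * self.dev_rtt
```

With steady samples the deviation term decays toward zero and the timeout approaches the RTT itself; a single outlier sample raises the timeout mainly through the 4 * DevRTT safety margin.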

TCP Throughput

TCP throughput: a very, very simple model
  What's the average throughput of TCP as a function of window size and RTT, T?
    Ignore slow start
    Let W be the window size when loss occurs
  When window is W, throughput is W/T
  Just after loss, window drops to W/2, throughput to W/(2T)
  Average throughput: 3W/(4T)

TCP throughput: a very simple model
  But what is W when loss occurs?
  When window is w and queue has q packets, TCP is sending at rate w/(T + q/C)
  For maintaining utilization and steady state
    Just before loss, rate = W/(T + Q/C) = C
    Just after loss, rate = (W/2)/T = C
  For Q = CT (a common rule of thumb to set router buffer sizes), a loss occurs every (3W/4) * Q = 3W^2/8 packets
  Q = queue capacity in number of packets
  C = link capacity in packets/sec

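Deriving TCP throughput/loss relationship
[Figure: TCP window size vs. time (rtt), a sawtooth between W/2 and W; one "period" is one tooth]
  packets sent per "period" =
$$\begin{aligned}
\frac{W}{2} + \left(\frac{W}{2} + 1\right) + \dots + W
  &= \sum_{n=0}^{W/2} \left(\frac{W}{2} + n\right) \\
  &= \left(\frac{W}{2} + 1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n \\
  &= \left(\frac{W}{2} + 1\right)\frac{W}{2} + \frac{\frac{W}{2}\left(\frac{W}{2} + 1\right)}{2} \\
  &= \frac{3}{8}W^2 + \frac{3}{4}W \approx \frac{3}{8}W^2
\end{aligned}$$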

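Deriving TCP throughput/loss relationship
[Figure: TCP window size vs. time (rtt), a sawtooth between W/2 and W; one "period" is one tooth]
  packets sent per "period" $\approx \frac{3}{8} W^2$
  1 packet lost per "period" implies
$$p_{loss} \approx \frac{8}{3W^2} \quad \text{or} \quad W = \sqrt{\frac{8}{3\,p_{loss}}}$$
$$B = \text{avg\_thruput} = \frac{3}{4} W \ \text{packets/rtt}$$
$$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}} \ \text{packets/rtt}$$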
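A quick numeric check of the throughput/loss relationship: the sketch below evaluates W and the average throughput for a given loss probability. The function names are hypothetical and the model is only the simple sawtooth one derived here (no timeouts, no slow start).

```python
import math


def window_at_loss(p_loss):
    """Peak window W implied by one loss per 3W^2/8 packets."""
    return math.sqrt(8.0 / (3.0 * p_loss))


def avg_throughput_pkts_per_rtt(p_loss):
    """Average throughput 3W/4, in packets per RTT."""
    return 0.75 * window_at_loss(p_loss)


# Example: 1% loss gives a peak window of about 16 packets
# and an average of about 12 packets per RTT.
w = window_at_loss(0.01)
b = avg_throughput_pkts_per_rtt(0.01)
```

Multiplying `b` by the square root of the loss probability recovers the constant 0.75 * sqrt(8/3), which is about 1.22.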

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput: a better loss-rate-based "simple" model [PFTK]
  With many flows, loss rate and delay are not affected much by a single TCP flow
    TCP behavior completely specified by loss and delay pattern along path (bounded by bottleneck capacity)
  Given loss rate p and delay T, what is TCP's throughput B in packets/sec, taking timeouts into account?

What is PFTK modeling?
  Independent loss probability p across rounds
    Loss indicated by triple duplicate acks
    Bursty loss in a round: if some packet is lost, all following packets in that round are also lost
  Timeout if fewer than three duplicate acks received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP
  Evolution of loss-based TCP
    Tahoe (without fast retransmit)
    Reno (triple duplicate acks + fast retransmit)
    NewReno (Reno + handling multiple losses better)
    SACK (selective acknowledgment), common today
  Q: what if loss is not due to congestion?

Delay-based TCP: Vegas
  Uses delay as a signal of congestion
    Idea: try to keep a small constant number of packets at bottleneck queue
    Expected = W/BaseRTT
    Actual = W/CurRTT
    Diff = Expected - Actual
    Try to keep Diff between fixed 1 and 3
  More recent: FAST TCP, based on Vegas
  Delay-based TCP not widely used today
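The Vegas rule can be sketched as a once-per-RTT window adjustment. Two things here are assumptions rather than content of the slide: Diff is multiplied by BaseRTT so the thresholds 1 and 3 are compared against an estimated number of queued packets, and the window moves by one packet per adjustment. The function name `vegas_adjust` is hypothetical.

```python
def vegas_adjust(cwnd, base_rtt, cur_rtt, low=1.0, high=3.0):
    """Return the next window given the current RTT measurement."""
    expected = cwnd / base_rtt      # rate if nothing were queued
    actual = cwnd / cur_rtt         # rate actually achieved
    diff = expected - actual
    backlog = diff * base_rtt       # assumed conversion: packets sitting in the queue
    if backlog < low:
        return cwnd + 1.0           # too little queued: probe for more bandwidth
    if backlog > high:
        return cwnd - 1.0           # too much queued: back off before loss occurs
    return cwnd                     # within the target band: hold
```

For example, a window of 10 packets with BaseRTT 100 ms and CurRTT 200 ms implies about 5 queued packets, so the window shrinks; at CurRTT 125 ms the estimate is about 2 packets and the window is held.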

TCP-Friendliness
  Can we try MyFavNew TCP?
    Well, is it TCP-friendly?
  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
  To coexist with TCP, it must impose the same long-term load on the network
    No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
    Not significantly less long-term throughput, or it's not too useful

TCP friendly rate control (TFRC)
  Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
    If transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
    Otherwise increase the transmission rate
  E.g. DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
  Q: how to measure/use loss rate and RTT?

High speed TCP

TCP in high speed networks
  Example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
  Requires window size W = 83333 in-flight segments
  Throughput in terms of loss rate
  $p = 2 \cdot 10^{-10}$, or equivalently at most one drop every couple of hours
  New versions of TCP for high-speed networks needed
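The numbers in this example can be reproduced with a few lines of arithmetic. The sketch below assumes the simple relationship B = 1.22 / sqrt(p) packets per RTT from the throughput derivation and treats the required window as the average throughput per RTT; the function names are hypothetical.

```python
def required_window(rate_bps, rtt_s, segment_bytes):
    """Segments that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(window_pkts):
    """Loss probability p for which 1.22 / sqrt(p) equals window_pkts."""
    return (1.22 / window_pkts) ** 2


def hours_between_drops(p_loss, window_pkts, rtt_s):
    """Mean time between losses when sending window_pkts per RTT."""
    packets_per_second = window_pkts / rtt_s
    return 1.0 / (p_loss * packets_per_second) / 3600.0


w = required_window(10e9, 0.1, 1500)   # about 83333 segments
p = required_loss_rate(w)              # about 2e-10
h = hours_between_drops(p, w, 0.1)     # on the order of an hour or two
```

The required loss rate is far below what real paths deliver, which is the motivation for the high-speed variants.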

TCP's long recovery delay
  More than an hour to recover from a loss or timeout
[Figure: window recovery after a loss; annotations: ~41000 packets, ~60000 RTTs, ~100 minutes]

High-speed TCP
  Proposals
    Scalable TCP, HSTCP, FAST, CUBIC
    General idea is to use superlinear window increase
    Particularly useful in high bandwidth-delay product regimes

Alternate choices of response functions
  Scalable TCP: S = 0.15/p
  Q: Whatever happened to TCP-friendly?

High speed TCP [Floyd]
  additive increase, multiplicative decrease
  increments, decrements depend on window size

Scalable TCP (STCP) [T. Kelly]
  multiplicative increase, multiplicative decrease
    W <- W + a per ACK
    W <- W - b * W per window with loss
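A minimal sketch of the two STCP update rules follows. The constants a = 0.01 and b = 0.125 are assumed illustrative values, and the function names are hypothetical.

```python
def stcp_on_ack(w, a=0.01):
    """Per-ACK increase: W <- W + a."""
    return w + a


def stcp_on_loss(w, b=0.125):
    """Per-loss-window decrease: W <- W - b * W."""
    return w - b * w


def stcp_one_rtt(w, a=0.01):
    """Apply the per-ACK rule once for each of the roughly W ACKs in one RTT."""
    for _ in range(int(w)):
        w = stcp_on_ack(w, a)
    return w
```

Since about W ACKs arrive per RTT, the window grows by roughly a * W per RTT, a fixed fraction of its current size. That is why the increase is multiplicative even though each ACK adds a constant.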

STCP dynamics
  From 1st PFLDnet Workshop, Tom Kelly

Active Queue Management

Router Queue Management
  normally, packets dropped only when queue overflows
    "drop-tail" queueing
[Figure: packets P1 to P6 queued at a router's FCFS scheduler, on the path between the Internet and another router]

The case against drop-tail queue management
  Large queues in routers are "a bad thing"
    Delay: end-to-end latency dominated by length of queues at switches in network
  Allowing queues to overflow is "a bad thing"
    Fairness: connections transmitting at high rates can starve connections transmitting at low rates
    Utilization: connections can synchronize their response to congestion
[Figure: packets P1 to P6 at an FCFS scheduler]

Idea: early random packet drop
  When queue length exceeds threshold, drop packets with queue-length-dependent probability
    probabilistic packet drop: flows see same loss rate
    problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
[Figure: packets P1 to P6 at an FCFS scheduler]

Random early detection (RED) packet drop
  Use exponential average of queue length to determine when to drop
    avoid overly penalizing short-term bursts
    react to longer-term trends
  Tie drop probability to weighted average queue length
    avoids over-reaction to mild overload conditions


Retransmission timeout estimation

  Calculate EstimatedRTT using moving average

  Calculate deviation wrt moving average

  Timeout = EstimatedRTT + 4DevRTT

EstimatedRTTi = (1- α)EstimatedRTTi-1 + αSampleRTTi

DevRTTi = (1-β)DevRTTi-1 + β|SampleRTTi-EstimatedRTTi-1|

TCP Throughput

TCP throughput A very very simple model

  Whatrsquos the average throughout of TCP as a function of window size and RTT T  Ignore slow start  Let W be the window size when loss occurs

  When window is W throughput is WT   Just after loss window drops to W2

throughput to W2T   Average throughput 3W4T

TCP throughput A very simple model

  But what is W when loss occurs

    When window is w and queue has q packets TCP is

sending at rate w(T+qC)   For maintaining utilization and steady state

 Just before loss rate = W(T+QC) = C  Just after loss rate = W2T = C   For Q = CT (a common thumbrule to set router buffer

sizes) a loss occurs every frac14 (34W)Q = 3W28 packets

Q = queue capacity in number of packets

C = link capacity in packetssec

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

sum=

+=++⎟⎠

⎞⎜⎝

⎛ ++2

0)

2(1

22

W

nnWWWW

sum=

+⎟⎠

⎞⎜⎝

⎛ +=2

021

2

W

nnWW

2)12(2

21

2+

+⎟⎠

⎞⎜⎝

⎛ +=WWWW

WW43

83 2 +=

packets sent per ldquoperiodrdquo =

2

83Wasymp

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

packets sent per ldquoperiodrdquo 2

83Wasymp

1 packet lost per ldquoperiodrdquo implies ploss 23

8W

asymp or lossp

W38

=

rttpackets

43utavg_thrup WB ==

rttpackets221utavg_thrup

losspB ==

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]

  With many flows loss rate and delay are not affected much by a single TCP flow  TCP behavior completely specified by loss

and delay pattern along path (bounded by bottleneck capacity)

  Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account

What is PFTK modeling

  Independent loss probability p across rounds  Loss acute triple duplicate acks  Bursty loss in a round if some packet lost

all following packets in that round also lost   Timeout if lt three duplicate acks received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP

  Evolution of loss-based TCP  Tahoe (without fast retransmit)  Reno (triple duplicate acks + fast

retransmit)  NewReno (Reno + handling multiple losses

better)  SACK (selective acknowledgment) common

today   Q what if loss not due to congestion

Delay-based TCP Vegas

  Uses delay as a signal of congestion  Idea try to keep a small constant number of

packets at bottleneck queue  Expected = WBaseRTT  Actual = WCurRTT  Diff = Expected - Actual  Try to keep Diff between fixed 1 and 3

  More recent FAST TCP based on Vegas  Delay-based TCP not widely used today

TCP-Friendliness

  Can we try MyFavNew TCP  Well is it TCP-friendly

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP

  To co-exist with TCP it must impose the same long-term load on the network  No greater long-term throughput as a function of

packet loss and delay so TCP doesnt suffer  Not significantly less long-term throughput or its

not too useful

TCP friendly rate control (TFRC)

Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm

 If transmission rate is higher than that given by the model reduce the transmission rate to the models rate

 Otherwise increase the transmission rate  Eg DCCP (Datagram Congestion Control

Protocol) for unreliable congestion control  Q how to measureuse loss rate and RTT

High speed TCP

TCP in high speed networks

  Example 1500 byte segments 100ms RTT want 10 Gbps throughput

  Requires window size W = 83333 in-flight segments   Throughput in terms of loss rate

  13 p = 210-10 or equivalently at most one drop every couple hours

  New versions of TCP for high-speed networks needed

TCPrsquos long recovery delay

  More than an hour to recover from a loss or timeout

~41000 packets

~60000 RTTs ~100 minutes

High-speed TCP

  Proposals  Scalable TCP HSTCP FAST CUBIC  General idea is to use superlinear window

increase  Particularly useful in high bandwidth-delay

product regimes

Alternate choices of response functions

Scalable TCP - S = 015p

Q Whatever happened to TCP-friendly

High speed TCP [Floyd]

  additive increase multiplicative decrease

  increments decrements depend on window size

Scalable TCP (STCP) [T Kelly]

  multiplicative increase multiplicative decrease

W larr W + a per ACK W larr W ndash b W per window with loss

STCP dynamics

From 1st PFLDnet Workshop Tom Kelly13

Active Queue Management

Router Queue Management

  normally packets dropped only when queue overflows   ldquodrop-tailrdquo queueing

router Internet

P113P213P313P413P513P613FCFS13

Scheduler13

router

The case against drop-tail queue management

  - Large queues in routers are "a bad thing"
      - Delay: end-to-end latency dominated by length of queues at switches in network
  - Allowing queues to overflow is "a bad thing"
      - Fairness: connections transmitting at high rates can starve connections transmitting at low rates
      - Utilization: connections can synchronize their response to congestion

[Figure: FCFS scheduler queue holding P1 to P4, with P5 and P6 arriving]

Idea: early random packet drop

When queue length exceeds a threshold, drop packets with queue-length-dependent probability

  - probabilistic packet drop: flows see same loss rate
  - problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

[Figure: FCFS scheduler queue holding P1 to P6]

Random early detection (RED) packet drop

  - Use exponential average of queue length to determine when to drop
      - avoid overly penalizing short-term bursts
      - react to longer-term trends
  - Tie drop prob. to weighted avg. queue length
      - avoids over-reaction to mild overload conditions

[Figure: average queue length vs. time against the max queue length, with regions: no drop (below min threshold), probabilistic early drop (between min and max thresholds), forced drop (above max threshold)]

Random early detection (RED) packet drop

[Figure: same queue-length regions as the previous slide: no drop, probabilistic early drop, forced drop]

[Figure: drop probability vs. weighted average queue length; zero below min, rising to max_p at max, then 100%]
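The drop-probability curve in the figure can be sketched as a small function. The names `min_th`, `max_th` and `max_p` mirror the figure labels; the averaging `weight` is an assumed parameter, and refinements found in real RED implementations (such as spacing drops by a packet count) are left out.

```python
def update_avg(avg: float, queue_len: float, weight: float = 0.002) -> float:
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - weight) * avg + weight * queue_len


def red_drop_probability(avg: float, min_th: float, max_th: float, max_p: float) -> float:
    """Drop probability as a function of the averaged queue length."""
    if avg < min_th:
        return 0.0                      # no drop
    if avg >= max_th:
        return 1.0                      # forced drop
    # probabilistic early drop: linear ramp from 0 to max_p
    return max_p * (avg - min_th) / (max_th - min_th)
```

Because the decision uses the averaged queue rather than the instantaneous one, a short burst barely moves `avg` and is not penalized.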

RED summary: why random drop?

  - Provide gentle transition from no-drop to all-drop
      - Provide "gentle" early warning
  - Avoid synchronized loss bursts among sources
  - Provide same loss rate to all sessions
      - With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

  - Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
  - Gains over drop-tail FCFS not that significant
  - Still not widely deployed …

Why is randomization important?

  - Synchronization of periodic routing updates
  - Periodic losses observed in end-end Internet traffic

source: Floyd, Jacobson 1994

Router update operation

[State diagram]
  - prepare own routing update (time TC)
      - receive update from neighbor: process (time TC2)
  - <ready>: send update (time Td to arrive at dest), start_timer (uniform Tp ± Tr)
  - wait
      - receive update from neighbor: process
      - timeout or link fail: update

Time spent in state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  - 20 (simulated) routers broadcasting updates to each other
  - x-axis: time until routing update sent, relative to start of round
  - By t = 100,000, all router rounds are of length 120
  - synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

  - Choose random timer component Tr large (e.g., several multiples of TC)
  - Add enough randomization to avoid synchronization

[State diagram, as on the "Router update operation" slide: prepare own routing update (time TC); receive update from neighbor: process (time TC2); <ready>: send update (time Td to arrive), start_timer (uniform Tp ± Tr); wait]
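The timer rule start_timer(uniform Tp ± Tr) amounts to drawing each period from a uniform interval around Tp. A minimal sketch, with a hypothetical function name and illustrative values for Tp and Tr:

```python
import random


def next_update_delay(tp: float, tr: float, rng: random.Random) -> float:
    """Draw the next routing-update timer uniformly from [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)


rng = random.Random(0)  # seeded only to make the example repeatable
delays = [next_update_delay(120.0, 10.0, rng) for _ in range(1000)]
```

The larger Tr is relative to the processing time TC, the harder it is for routers to fall into lockstep.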

Randomization

  - Takeaway message: randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  - Data that humans are not waiting for
  - Non-deadline-critical
  - Unlimited demand
  - Examples
      - Prefetched traffic on the Web
      - File system backup
      - Large-scale data distribution services
      - Background software updates
      - Media file sharing

Desired Properties

  - Utilization of spare network capacity
  - No interference with regular transfers
      - Self-interference: applications hurt their own performance
      - Cross-interference: applications hurt other applications' performance

TCP Nice

  - Goal: abstraction of free infinite bandwidth
      - Applications say what they want
      - OS manages resources and scheduling
  - Self-tuning transport layer
      - Reduces risk of interference with foreground traffic
      - Significant utilization of spare capacity by background traffic
      - Simplifies application design

Why change TCP?

  - TCP does network resource management
      - Need flow prioritization
  - Alternative: router prioritization
      + More responsive, simple one-bit priority
      - Hard to deploy
  - Question: Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  - Proactively detects congestion
  - Uses increasing RTT as congestion signal
      - Congestion → incr. queue lengths → incr. RTT
  - Aggressive responsiveness to congestion
  - Only modifies sender-side congestion control
      - Receiver and network unchanged
      - TCP friendly

TCP Nice

  - Basic algorithm
      1. Early detection: thresh. queue length → incr. in RTT
      2. Multiplicative decrease on early congestion
      3. Allow cwnd < 1.0 (despite no loss)
  - per-ack operation:
      if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++;
  - per-round operation:
      if (numCong > f * W) W ← W/2; else … AIMD congestion control
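The per-ack and per-round rules above can be put together in a small sender-side sketch. The class name, the default values threshold = 0.2 and f = 0.5, and the plain +1 additive increase standing in for "AIMD congestion control" are assumptions of this example; loss handling is omitted.

```python
class NiceSender:
    """Sketch of TCP Nice's early-congestion detection (loss handling omitted)."""

    def __init__(self, threshold: float = 0.2, fraction: float = 0.5):
        self.threshold = threshold      # how far into [minRTT, maxRTT] counts as congested
        self.fraction = fraction        # f: fraction of the window that must look congested
        self.cwnd = 1.0
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt: float) -> None:
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1          # this ACK saw an inflated RTT

    def on_round_end(self) -> None:
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0            # multiplicative decrease; may drop below 1
        else:
            self.cwnd += 1.0            # assumed additive increase
        self.num_cong = 0


sender = NiceSender()
sender.cwnd = 8.0
for rtt in [0.10, 0.20, 0.20, 0.20, 0.20, 0.20]:
    sender.on_ack(rtt)
sender.on_round_end()
cwnd_after_congested_round = sender.cwnd
for rtt in [0.10] * 4:
    sender.on_ack(rtt)
sender.on_round_end()
cwnd_after_quiet_round = sender.cwnd
```

In the first round five of the ACKs exceed the RTT limit, which is more than f·W = 4, so the window halves; the second round sees no inflated RTTs and the window grows by one.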

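Nice: the works

  - Non-interference: getting out of the way in time
  - Utilization: maintaining a small queue

[Figure: bottleneck queue occupancy (pkts, buffer size B, service rate μ) over time for Reno and Nice, with additive-increase (Add) and multiplicative-decrease (Mul) phases marked; minRTT = τ, maxRTT = τ + B/μ]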

Network Conditions

[Figure: foreground document latency (sec, log scale from 0.1 to 1e3) vs. spare capacity (1 to 100) for Reno, Vegas, V0, Nice, and Router Prio]

  - Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale from 0.1 to 1e3) vs. number of background flows (1 to 100) for Vegas, V0, Nice, Router Prio, and Reno]

  - W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: background throughput (KB, 0 to 8e4) vs. number of background flows (1 to 100) for Router Prio, Vegas, V0, Reno, and Nice]

  - Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  - Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
  - How to model the interaction between TCP and the network?
      - Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem:

  - How to allocate resources (e.g., bandwidth) to optimize some objective function
  - Maybe not possible to obtain exact optimality, but
      - optimization framework as means to explicitly steer network towards desirable operating point
      - practical congestion control as distributed asynchronous implementations of optimization algorithm
      - systematic approach towards protocol design

Model

  - Network: links l, each of capacity $c_l$
  - Sources s: $(L(s), U_s(x_s))$
      - $L(s)$: links used by source s
      - $U_s(x_s)$: utility if source rate = $x_s$

[Figure: two links with capacities c1 and c2 and three sources with rates x1, x2, x3]

$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$

[Figure: $U_s(x_s)$ vs. $x_s$, example utility function for elastic application]

Q: What are possible allocations with, say, unit-capacity links?

Optimization Problem

  - maximize system utility (note: all sources "equal")
  - constraint: bandwidth used less than capacity
  - centralized solution to optimization impractical
      - must know all utility functions
      - impractical for large number of sources
  - can we view congestion control as distributed asynchronous algorithms to solve this problem?

"system" problem:

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$

The user view

  - User can choose amount to pay per unit time, $w_s$
  - Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$
  - $p_s$ could be viewed as charge per unit flow for user s

user problem:

$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

(first term: user's utility; second term: cost)

The network view

  - Suppose network knows vector $\{w_s\}$, chosen by users
  - Network wants to maximize logarithmic utility function

network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

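Solution existence

  - There exist prices $p_s$, source rates $x_s$, and amount-to-pay-per-unit-time $w_s = p_s x_s$ such that
      - $\{w_s\}$ solves the user problem
      - $\{x_s\}$ solves the network problem
      - $\{x_s\}$ is the unique solution to the system problem

network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

user problem:

$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

system problem:

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$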

Proportional Fairness

  - Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$:

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

  - Result: if $w_r = 1$, then $\{x_s\}$ solves the network problem IFF it is proportionally fair
  - Similar result exists for the case that $w_r$ is not equal to 1
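As a worked instance, consider the two-link, three-source network from the Model slide with unit capacities, where source 1 uses both links and sources 2 and 3 use one link each. The sketch below assumes both links are saturated at the optimum (so x2 = x3 = 1 - x1) and finds the maximizer of the sum of log rates by ternary search; all function names are hypothetical.

```python
import math


def total_log_utility(x1: float, c: float = 1.0) -> float:
    """log x1 + log x2 + log x3 with x2 = x3 = c - x1 (both links full)."""
    return math.log(x1) + 2.0 * math.log(c - x1)


def proportionally_fair_rates(c: float = 1.0, iters: int = 200):
    """Ternary search for the maximizer of a strictly concave objective."""
    lo, hi = 1e-9, c - 1e-9
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if total_log_utility(m1, c) < total_log_utility(m2, c):
            lo = m1
        else:
            hi = m2
    x1 = (lo + hi) / 2.0
    return x1, c - x1, c - x1


def fairness_sum(x, x_star) -> float:
    """Left-hand side of the proportional-fairness inequality."""
    return sum((xs_star - xs) / xs for xs, xs_star in zip(x, x_star))


rates = proportionally_fair_rates()
```

The search lands on x1 = 1/3 and x2 = x3 = 2/3: the two-hop flow gets less than the one-hop flows but is not starved, and the inequality holds for other feasible vectors such as (0.2, 0.5, 0.5).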

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$

Minimum potential delay fairness

  - Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; optimization problem is to minimize sum of transfer delays

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
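A max-min fair allocation can be computed directly with progressive filling, a standard technique that is not described on the slides: raise all rates together and freeze the sources on a link as soon as it saturates. The sketch assumes every source uses at least one link; the function and argument names are hypothetical.

```python
def max_min_rates(routes: dict, capacity: dict) -> dict:
    """Progressive filling.

    routes:   source -> list of links it uses (each source uses >= 1 link)
    capacity: link -> capacity
    """
    rate = {s: 0.0 for s in routes}
    residual = dict(capacity)
    active = set(routes)
    while active:
        # Number of still-growing sources on each link.
        users = {l: sum(1 for s in active if l in routes[s]) for l in residual}
        share = {l: residual[l] / n for l, n in users.items() if n > 0}
        inc = min(share.values())           # largest equal increment that fits
        for s in active:
            rate[s] += inc
        for l in share:
            residual[l] -= inc * users[l]
        saturated = {l for l in share if residual[l] <= 1e-12}
        active = {s for s in active
                  if not any(l in saturated for l in routes[s])}
    return rate


example = max_min_rates(
    {"s1": ["a", "b"], "s2": ["a"], "s3": ["b"]},
    {"a": 1.0, "b": 2.0},
)
```

Here link a saturates first at 0.5 per source, freezing s1 and s2, after which s3 takes the remaining capacity on link b.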

Solving the network problem

  - Results so far: existence (a solution exists with the given properties)
  - How to compute the solution?
      - Ideally: distributed solution easily embodied in protocol
      - Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k\left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

where

$$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$

  - left-hand side: change in bandwidth allocation at s
  - $k\, w_s$: linear increase
  - $-k\, x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease
  - $p_l(t)$: congestion "signal", a function of aggregate rate at link l, fed back to s

Solving the network problem

$$\frac{d}{dt} x_s(t) = k\left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  - Results
      - converges to solution of "relaxation" of network problem
      - $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
  - Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
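The rate equation can be explored numerically with a forward-Euler simulation. The sketch below is only illustrative: the link price function g_l(y) = (y / c_l)^beta, the gain k, the step size and the starting rates are all assumptions, since the slides leave g_l unspecified.

```python
def simulate_primal(routes, capacity, w, k=1.0, beta=4, dt=0.01, steps=5000):
    """Euler integration of dx_s/dt = k * (w_s - x_s * sum of link prices)."""
    x = {s: 0.1 for s in routes}                     # arbitrary starting rates
    for _ in range(steps):
        load = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
        price = {l: (load[l] / capacity[l]) ** beta for l in capacity}  # assumed g_l
        for s in routes:
            path_price = sum(price[l] for l in routes[s])
            x[s] += dt * k * (w[s] - x[s] * path_price)
    return x


routes = {"s1": ["a", "b"], "s2": ["a"], "s3": ["b"]}
x = simulate_primal(routes, {"a": 1.0, "b": 1.0}, {"s1": 1.0, "s2": 1.0, "s3": 1.0})
```

At the fixed point each source satisfies x_s · (path price) = w_s, so with equal w_s the two-hop source settles at half the rate of the one-hop sources. The links end up slightly over capacity, which is the sense in which this solves a relaxation rather than the hard-constrained network problem.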

Source Algorithm

  - Source needs only its path price:

$$\dot{x}_r = k_r(x_r)\left( U_r'(x_r) - q_r \right)$$

  - $k_r(\cdot)$: nonnegative, nondecreasing function
  - Above algorithm converges to unique solution for any initial condition
  - $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  - Can network choose pricing scheme to achieve fair resource allocation?
  - Suppose network charges price $q_r$ ($/bit), where $q_r = \sum_{l} p_l$
  - User's strategy: spend $w_r$ ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  - suppose

$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$

  - then

$$U(x) = -\frac{1}{T x}$$

  - interpretation: minimize (weighted) delay

Is AIMD special?

  - Consider a window control as follows
      - cwnd += a·cwnd^n when no loss
      - cwnd -= b·cwnd^m when loss
      - where n < m
  - Expected change in congestion window
  - Expected change in rate per unit time
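The two expected-change expressions can be sketched under simple assumptions that are not stated on the slide: each ACK independently signals loss with probability p, the update is applied once per ACK, ACKs arrive at rate x = cwnd/T, and T is constant.

```latex
\begin{align*}
E[\Delta \mathrm{cwnd}] &= (1-p)\, a\, \mathrm{cwnd}^{\,n} - p\, b\, \mathrm{cwnd}^{\,m}
  && \text{(per ACK)} \\
\dot{x} &= \frac{x}{T}\left[ (1-p)\, a\, (xT)^{n} - p\, b\, (xT)^{m} \right]
  && \text{(rate per unit time, } \mathrm{cwnd} = xT) \\
\mathrm{cwnd}^{\,m-n} &= \frac{a\,(1-p)}{b\,p}
  && \text{(equilibrium, } \dot{x} = 0)
\end{align*}
```

For per-ACK AIMD (n = -1, m = 1, a = 1, b = 1/2) the equilibrium gives cwnd = sqrt(2(1-p)/p), which after dividing by T agrees with the rate assumed on the Simplified TCP-Reno slide.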

MIMD (n, m)

  - Consider the controller

  - where

  - Then at equilibrium

  - where α = m - n. For stability

Motivation

  - Congestion Control: maximize user utility
      - Given routing $R_{li}$, how to adapt end rate $x_i$?
  - Traffic Engineering: minimize network congestion
      - Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li}\, x_i \le c_l$$

  - Objective: aggregate utility; constraints: capacity constraints; variable: x
  - Users are indexed by i
  - Congestion control provides fair rate allocation amongst users

[Figure: utility $U_i(x_i)$ vs. source rate $x_i$]

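Traffic Engineering Model

$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li}\, x_i / c_l$$

  - Objective: aggregate cost; variable: R
  - Links are indexed by l
  - Traffic engineering avoids bottlenecks in the network

[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]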

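Model of Internet Reality

[Figure: feedback loop in which congestion control passes rates $x_i$ to traffic engineering and traffic engineering passes routing $R_{li}$ back]

  - Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li}\, x_i \le c_l$
  - Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li}\, x_i / c_l$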

System Properties

  - Convergence
  - Does it achieve some objective?
  - Benchmark: $\max_{x,\,R} \sum_i U_i(x_i)$ s.t. $Rx \le c$
  - Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

TCP Throughput

TCP throughput: A very, very simple model

  - What's the average throughput of TCP as a function of window size and RTT, T?
      - Ignore slow start
      - Let W be the window size when loss occurs
  - When window is W, throughput is W/T
  - Just after loss, window drops to W/2, throughput to W/(2T)
  - Average throughput: 3W/(4T)

TCP throughput: A very simple model

  - But what is W when loss occurs?
  - When window is w and queue has q packets, TCP is sending at rate w/(T + q/C)
  - For maintaining utilization and steady state
      - Just before loss, rate = W/(T + Q/C) = C
      - Just after loss, rate = W/(2T) = C
  - For Q = CT (a common rule of thumb to set router buffer sizes), a loss occurs every (3/4 W)·Q = 3W²/8 packets

Q = queue capacity in number of packets

C = link capacity in packets/sec

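Deriving TCP throughput/loss relationship

[Figure: TCP window size vs. time (rtt), a sawtooth between W/2 and W; one tooth is a "period"]

packets sent per "period" =

$$\frac{W}{2} + \left(\frac{W}{2} + 1\right) + \dots + W = \sum_{n=0}^{W/2} \left(\frac{W}{2} + n\right)$$

$$= \left(\frac{W}{2} + 1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$$

$$= \left(\frac{W}{2} + 1\right)\frac{W}{2} + \frac{(W/2)(W/2 + 1)}{2}$$

$$= \frac{3}{8} W^2 + \frac{3}{4} W \approx \frac{3}{8} W^2$$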

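Deriving TCP throughput/loss relationship

[Figure: TCP window size vs. time (rtt), a sawtooth between W/2 and W; one tooth is a "period"]

packets sent per "period" $\approx \frac{3}{8} W^2$

1 packet lost per "period" implies $p_{loss} \approx \frac{8}{3W^2}$, or $W = \sqrt{\frac{8}{3\, p_{loss}}}$

$$B = \text{avg\_thruput} = \frac{3}{4} W \ \text{packets/rtt}$$

$$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}} \ \text{packets/rtt}$$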
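The derivation can be checked numerically. The sketch below sums the sawtooth exactly for an even window W and evaluates the resulting throughput expression; the function names are hypothetical.

```python
import math


def packets_per_period(w: int) -> int:
    """Exact packet count for one sawtooth: W/2 + (W/2 + 1) + ... + W (W even)."""
    return sum(w // 2 + n for n in range(w // 2 + 1))


def loss_rate_for_window(w: float) -> float:
    """One loss per period, using the 3W^2/8 approximation."""
    return 8.0 / (3.0 * w * w)


def avg_throughput_per_rtt(p_loss: float) -> float:
    """B = (3/4) * W with W = sqrt(8 / (3 p)), in packets per RTT."""
    return 0.75 * math.sqrt(8.0 / (3.0 * p_loss))
```

For W = 100 the exact sum is 3825 packets, against 3750 from the 3W²/8 approximation, and the constant (3/4)·sqrt(8/3) evaluates to about 1.22.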

Alternate fluid model

  - Rate of change of sending rate = (term inversely proportional to current rate, with probability 1 - p) - (term proportional to current rate, with probability p)
  - In steady state

TCP throughput: A better, loss-rate-based "simple" model [PFTK]

  - With many flows, loss rate and delay are not affected much by a single TCP flow
      - TCP behavior completely specified by loss and delay pattern along path (bounded by bottleneck capacity)
  - Given loss rate p and delay T, what is TCP's throughput B packets/sec, taking timeouts into account?

What is PFTK modeling?

  - Independent loss probability p across rounds
  - Loss ⇒ triple duplicate acks
      - Bursty loss in a round: if some packet is lost, all following packets in that round are also lost
  - Timeout if < three duplicate acks received

PFTK empirical validation: Low loss

PFTK empirical validation: High loss

Loss-based TCP

  - Evolution of loss-based TCP
      - Tahoe (without fast recovery)
      - Reno (triple duplicate acks + fast retransmit and fast recovery)
      - NewReno (Reno + handling multiple losses better)
      - SACK (selective acknowledgment): common today
  - Q: what if loss is not due to congestion?

Delay-based TCP: Vegas

  - Uses delay as a signal of congestion
      - Idea: try to keep a small constant number of packets at bottleneck queue
      - Expected = W/BaseRTT
      - Actual = W/CurRTT
      - Diff = Expected - Actual
      - Try to keep Diff between fixed 1 and 3
  - More recent: FAST TCP, based on Vegas
  - Delay-based TCP not widely used today
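The Vegas rule above can be sketched as a once-per-RTT window update. Two assumptions of this example go beyond the slide: Diff is converted to packets by multiplying by BaseRTT so that it can be compared with the thresholds 1 and 3, and the window moves by one segment per RTT. The function name is hypothetical.

```python
def vegas_update(cwnd: float, base_rtt: float, cur_rtt: float,
                 alpha: float = 1.0, beta: float = 3.0) -> float:
    """Return the next congestion window under a Vegas-style rule."""
    expected = cwnd / base_rtt                    # rate with empty queues
    actual = cwnd / cur_rtt                       # rate actually achieved
    diff_pkts = (expected - actual) * base_rtt    # est. packets queued at bottleneck
    if diff_pkts < alpha:
        return cwnd + 1.0     # too little queued: speed up
    if diff_pkts > beta:
        return cwnd - 1.0     # too much queued: slow down
    return cwnd               # within [alpha, beta]: hold
```

With BaseRTT = 100 ms and a window of 10, an RTT of 100 ms leads to an increase, 125 ms leaves the window unchanged, and 200 ms leads to a decrease.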

TCP-Friendliness

  - Can we try MyFavNew TCP?
      - Well, is it TCP-friendly?
  - Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
  - To co-exist with TCP, it must impose the same long-term load on the network
      - No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
      - Not significantly less long-term throughput, or it's not too useful

TCP throughput A very very simple model

  Whatrsquos the average throughout of TCP as a function of window size and RTT T  Ignore slow start  Let W be the window size when loss occurs

  When window is W throughput is WT   Just after loss window drops to W2

throughput to W2T   Average throughput 3W4T

TCP throughput A very simple model

  But what is W when loss occurs

    When window is w and queue has q packets TCP is

sending at rate w(T+qC)   For maintaining utilization and steady state

 Just before loss rate = W(T+QC) = C  Just after loss rate = W2T = C   For Q = CT (a common thumbrule to set router buffer

sizes) a loss occurs every frac14 (34W)Q = 3W28 packets

Q = queue capacity in number of packets

C = link capacity in packetssec

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

sum=

+=++⎟⎠

⎞⎜⎝

⎛ ++2

0)

2(1

22

W

nnWWWW

sum=

+⎟⎠

⎞⎜⎝

⎛ +=2

021

2

W

nnWW

2)12(2

21

2+

+⎟⎠

⎞⎜⎝

⎛ +=WWWW

WW43

83 2 +=

packets sent per ldquoperiodrdquo =

2

83Wasymp

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

packets sent per ldquoperiodrdquo 2

83Wasymp

1 packet lost per ldquoperiodrdquo implies ploss 23

8W

asymp or lossp

W38

=

rttpackets

43utavg_thrup WB ==

rttpackets221utavg_thrup

losspB ==

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]

  With many flows loss rate and delay are not affected much by a single TCP flow  TCP behavior completely specified by loss

and delay pattern along path (bounded by bottleneck capacity)

  Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account

What is PFTK modeling

  Independent loss probability p across rounds  Loss acute triple duplicate acks  Bursty loss in a round if some packet lost

all following packets in that round also lost   Timeout if lt three duplicate acks received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP

  Evolution of loss-based TCP  Tahoe (without fast retransmit)  Reno (triple duplicate acks + fast

retransmit)  NewReno (Reno + handling multiple losses

better)  SACK (selective acknowledgment) common

today   Q what if loss not due to congestion

Delay-based TCP Vegas

  Uses delay as a signal of congestion  Idea try to keep a small constant number of

packets at bottleneck queue  Expected = WBaseRTT  Actual = WCurRTT  Diff = Expected - Actual  Try to keep Diff between fixed 1 and 3

  More recent FAST TCP based on Vegas  Delay-based TCP not widely used today

TCP-Friendliness

  Can we try MyFavNew TCP?
    Well, is it TCP-friendly?

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP

  To coexist with TCP, it must impose the same long-term load on the network
    No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
    Not significantly less long-term throughput, or it's not too useful

TCP friendly rate control (TFRC)

Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm

  If transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
  Otherwise, increase the transmission rate
  E.g., DCCP (Datagram Congestion Control Protocol) for congestion-controlled unreliable transport
  Q: how to measure/use loss rate and RTT?
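The TFRC rule above can be sketched as follows. The simple 1.22 / (RTT sqrt(p)) model stands in for whatever throughput model a real implementation uses, and the additive `step` and the cap at the model rate are illustrative choices made here.

```python
import math

def tcp_model_rate(p: float, rtt: float) -> float:
    """Simple TCP throughput model in packets/sec."""
    return 1.22 / (rtt * math.sqrt(p))

def tfrc_next_rate(current: float, p: float, rtt: float, step: float = 1.0) -> float:
    """Move the sending rate toward the rate a TCP flow would get."""
    target = tcp_model_rate(p, rtt)
    if current > target:
        return target                    # above the model: drop to the model's rate
    return min(current + step, target)   # otherwise increase, without overshooting

print(tfrc_next_rate(1000.0, p=0.01, rtt=0.1))
```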

High speed TCP

TCP in high speed networks

  Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput

  Requires window size W = 83333 in-flight segments
  Throughput in terms of loss rate: $B = \frac{1.22}{\sqrt{p}}$ packets/rtt

  so $p = 2 \cdot 10^{-10}$, or equivalently at most one drop every couple of hours

  New versions of TCP for high-speed networks needed
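The arithmetic behind this example can be reproduced in a few lines. The helper names are illustrative; the loss rate is obtained by inverting the simple model B = 1.22 / sqrt(p) packets per RTT.

```python
def required_window(target_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """Segments that must be in flight to sustain target_bps."""
    return target_bps * rtt_s / (segment_bytes * 8)

def tolerable_loss(window: float) -> float:
    """Loss rate at which the simple model yields this average window."""
    return (1.22 / window) ** 2

rtt = 0.100
w = required_window(10e9, rtt, 1500)
p = tolerable_loss(w)
seconds_between_drops = (1.0 / p) / (w / rtt)   # packets per drop / packets per second

print(w, p, seconds_between_drops / 3600.0)
```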

TCP's long recovery delay

  More than an hour to recover from a loss or timeout

[Figure: window size over time after a loss; annotations: ~41000 packets, ~60000 RTTs, ~100 minutes]

High-speed TCP

  Proposals
    Scalable TCP, HSTCP, FAST, CUBIC
    General idea is to use superlinear window increase
    Particularly useful in high bandwidth-delay product regimes

Alternate choices of response functions

Scalable TCP: S = 0.15/p

Q: Whatever happened to TCP-friendly?

High speed TCP [Floyd]

  additive increase, multiplicative decrease

  increments and decrements depend on window size

Scalable TCP (STCP) [T. Kelly]

  multiplicative increase, multiplicative decrease

W ← W + a per ACK
W ← W - b·W per window with loss
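To see why the STCP rule is called multiplicative increase: with about W ACKs per RTT, adding a per ACK grows the window by roughly a factor (1 + a) per RTT, so the time to recover from one loss does not depend on W. The sketch below compares that with Reno's W/2 RTTs. The values a = 0.01 and b = 0.125 are assumed for illustration, and the per-RTT growth factor is an approximation.

```python
import math

def stcp_recovery_rtts(a: float = 0.01, b: float = 0.125) -> float:
    """RTTs for W to climb from (1 - b) W back to W, growing by (1 + a) per RTT."""
    return math.log(1.0 / (1.0 - b)) / math.log(1.0 + a)

def reno_recovery_rtts(w: float) -> float:
    """RTTs for Reno to climb from W/2 back to W at one packet per RTT."""
    return w / 2.0

print(stcp_recovery_rtts(), reno_recovery_rtts(83333))
```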

STCP dynamics

From 1st PFLDnet Workshop, Tom Kelly

Active Queue Management

Router Queue Management

  normally packets dropped only when queue overflows
    "drop-tail" queueing

[Figure: router attached to the Internet; packets P1-P6 waiting in a FCFS scheduler queue at the router]

The case against drop-tail queue management

  Large queues in routers are "a bad thing"
    Delay: end-to-end latency dominated by length of queues at switches in network
  Allowing queues to overflow is "a bad thing"
    Fairness: connections transmitting at high rates can starve connections transmitting at low rates
    Utilization: connections can synchronize their response to congestion

[Figure: FCFS scheduler queue holding P1-P4, with P5 and P6 arriving]

Idea: early random packet drop

When queue length exceeds threshold, drop packets with queue-length-dependent probability
  probabilistic packet drop: flows see same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

[Figure: FCFS scheduler queue holding packets P1-P6]

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop
    avoid overly penalizing short-term bursts
    react to longer-term trends

  Tie drop probability to weighted average queue length
    avoids over-reaction to mild overload conditions

[Figure: average queue length vs. time, with min and max thresholds and the max queue length marked. Below the min threshold: no drop; between the thresholds: probabilistic early drop; above the max threshold: forced drop.]

[Figure: drop probability vs. weighted average queue length: zero below min, rising to maxp at max, then 100%.]
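A minimal sketch of the two RED ingredients described above: an exponentially weighted average of the queue length and a drop probability driven by that average. The linear ramp between the thresholds and the averaging weight of 0.002 are assumptions made for illustration; parameter names are not from the slides.

```python
def ewma(avg: float, sample: float, weight: float = 0.002) -> float:
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg + weight * sample

def red_drop_probability(avg_q: float, min_th: float, max_th: float, max_p: float) -> float:
    """Drop probability as a function of the averaged queue length."""
    if avg_q < min_th:
        return 0.0                       # no drop
    if avg_q >= max_th:
        return 1.0                       # forced drop
    # probabilistic early drop: ramp from 0 at min_th up to max_p at max_th
    return max_p * (avg_q - min_th) / (max_th - min_th)

avg = 0.0
for q in [20] * 50:                      # a short burst barely moves the average
    avg = ewma(avg, q)
print(avg, red_drop_probability(avg, 5, 15, 0.1))
```

Because the average reacts slowly, the 50-sample burst in the usage lines leaves it well below the min threshold, which is the behavior the first bullet asks for.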

RED summary: why random drop?

  Provide gentle transition from no-drop to all-drop
    Provide "gentle" early warning
  Avoid synchronized loss bursts among sources
  Provide same loss rate to all sessions
    With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters: nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed ...

Why randomization is important

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

Source: Floyd, Jacobson 1994

Router update operation

[Figure: router state diagram with the following labels]
  prepare own routing update (time TC)
  receive update from neighbor, process (time TC2)
  wait
  receive update from neighbor, process
  <ready>: send update (time Td to arrive at dest)
  start_timer (uniform Tp ± Tr)
  timeout, or link fail update

time spent in state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis: time until routing update sent, relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

  Choose random timer component Tr large (e.g., several multiples of TC)
  Add enough randomization to avoid synchronization

[Figure: router update state diagram: prepare own routing update (time TC); receive update from neighbor, process (time TC2); wait; <ready>: send update (time Td to arrive), start_timer (uniform Tp ± Tr)]
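The randomized timer itself is a one-liner. The sketch below draws the next update delay uniformly from [Tp - Tr, Tp + Tr]; the function name and the example values of Tp and Tr are illustrative only.

```python
import random

def next_update_delay(tp: float, tr: float, rng=random) -> float:
    """Delay until the next routing update: uniform in [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)

rng = random.Random(42)                  # seeded only to make the example repeatable
print([round(next_update_delay(120.0, 10.0, rng), 1) for _ in range(5)])
```

A larger Tr spreads the routers' timers further apart, which is what keeps the weak coupling from pulling them into lockstep.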

Randomization

  Takeaway message:
    randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand

  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers
    Self-interference: applications hurt their own performance
    Cross-interference: applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
    Applications say what they want
    OS manages resources and scheduling

  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?

  TCP does network resource management
    Need flow prioritization

  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy

  Question:
    Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal
    Congestion ⇒ increased queue lengths ⇒ increased RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice

  Basic algorithm
    1. Early detection: threshold queue length ⇒ increase in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)

  per-ack operation:
    if (curRTT > minRTT + threshold*(maxRTT - minRTT)) numCong++

  per-round operation:
    if (numCong > f*W) W ← W/2
    else ... AIMD congestion control
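The per-ack and per-round operations above translate almost directly into code. In this sketch the default values of `threshold` and `fraction` (the f in the slide) are assumptions, and the `else` branch uses a plain one-packet additive increase as a stand-in for the full AIMD logic, which also has to handle losses.

```python
class NiceState:
    """Sender-side state for a simplified TCP Nice window controller."""

    def __init__(self, window: float, min_rtt: float, max_rtt: float,
                 threshold: float = 0.2, fraction: float = 0.5):
        self.window = window
        self.min_rtt = min_rtt
        self.max_rtt = max_rtt
        self.threshold = threshold
        self.fraction = fraction
        self.num_cong = 0

    def on_ack(self, cur_rtt: float) -> None:
        # Count ACKs whose RTT indicates a queue building up.
        if cur_rtt > self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt):
            self.num_cong += 1

    def on_round_end(self) -> None:
        if self.num_cong > self.fraction * self.window:
            self.window /= 2.0           # early multiplicative decrease; may go below 1
        else:
            self.window += 1.0           # stand-in for the usual AIMD increase
        self.num_cong = 0

state = NiceState(window=4.0, min_rtt=0.1, max_rtt=0.3)
for _ in range(3):
    state.on_ack(0.2)
state.on_round_end()
print(state.window)
```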

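Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

[Figure: queue occupancy (pkts) vs. time for Reno and Nice, showing additive-increase (Add) and multiplicative-decrease (Mul) phases; buffer size B, service rate μ, minRTT = τ, maxRTT = τ + B/μ]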

Network Conditions

[Figure: foreground document latency (sec, log scale from 0.1 to 1e3) vs. spare capacity (1 to 100); curves: Reno, Vegas, V0, Nice, Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale from 0.1 to 1e3) vs. number of BG flows (1 to 100); curves: Vegas, V0, Nice, Router Prio, Reno]

  W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: BG throughput (KB, 0 to 8e4) vs. number of BG flows (1 to 100); curves: Router Prio, Vegas, V0, Reno, Nice]

  Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?

  How to model the interaction between TCP and the network?
    Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function?
  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
  systematic approach towards protocol design

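Model
  Network: links l, each of capacity $c_l$
  Sources s: $(L(s), U_s(x_s))$
    $L(s)$ - links used by source s
    $U_s(x_s)$ - utility if source rate = $x_s$

[Figure: two links with capacities $c_1$ and $c_2$; source rate $x_1$ crosses both links, $x_2$ uses the first link, $x_3$ uses the second]

$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$

[Figure: $U_s(x_s)$ vs. $x_s$, example utility function for elastic application]

Q: What are possible allocations with, say, unit capacity links?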

Optimization Problem

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
  can we view congestion control as distributed asynchronous algorithms to solve this problem?

"system" problem:

$\max_{x_s \ge 0} \sum_s U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l \quad \forall l \in L$

The user view

  User can choose amount to pay per unit time, $w_s$

  Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = \frac{w_s}{p_s}$

  $p_s$ could be viewed as charge per unit flow for user s

user problem:

$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$

(first term: user's utility; second term: cost)

The network view

  Suppose network knows vector $\{w_s\}$ chosen by users
  Network wants to maximize logarithmic utility function

network problem:

$\max_{x_s \ge 0} \sum_s w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$

Solution existence

  There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
    $\{w_s\}$ solves the user problem
    $\{x_s\}$ solves the network problem
    $\{x_s\}$ is the unique solution to the system problem

network problem: $\max_{x_s \ge 0} \sum_s w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$

user problem: $\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$

system problem: $\max_{x_s \ge 0} \sum_s U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l \quad \forall l \in L$

Proportional Fairness

  Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,

  $\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$

  Result: if $w_r = 1$, then $\{x_s\}$ solves the network problem iff it is proportionally fair

  Similar result exists for the case that $w_r \ne 1$
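The definition can be checked numerically on the two-link example from the model slide (constraints x1 + x2 ≤ c1 and x1 + x3 ≤ c2) with unit capacities. Maximizing log x1 + log x2 + log x3 there gives (1/3, 2/3, 2/3), a value worked out for this sketch rather than taken from the slides. The code samples random feasible vectors and evaluates the sum in the definition.

```python
import random

def feasible(y, c1=1.0, c2=1.0):
    """Feasibility for the two-link example: x1 crosses both links."""
    return min(y) >= 0.0 and y[0] + y[1] <= c1 and y[0] + y[2] <= c2

def pf_sum(x, y):
    """Aggregate proportional change when moving from allocation x to y."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))

x_pf = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)   # candidate proportionally fair rates
x_mm = (0.5, 0.5, 0.5)                     # equal shares, for comparison

rng = random.Random(0)
samples = [tuple(rng.random() for _ in range(3)) for _ in range(10000)]
worst = max(pf_sum(x_pf, y) for y in samples if feasible(y))
print(worst <= 1e-12, pf_sum(x_mm, x_pf))
```

No sampled feasible vector makes the sum positive for (1/3, 2/3, 2/3), while moving from the equal-share allocation to it gives a positive sum, so equal shares are not proportionally fair on this topology.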

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$, if $y_s > x_s$ then there exists p such that $x_p \le x_s$ and $y_p < x_p$

Minimum potential delay fairness

  Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; optimization problem is to minimize sum of transfer delays

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$, if $y_s > x_s$ then there exists p such that $x_p \le x_s$ and $y_p < x_p$

What is corresponding utility function?

$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
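The utility inside the limit can be evaluated for finite α to see how the family behaves. In this sketch α = 1 is conventionally taken to be log x (proportional fairness) and α = 2 gives -1/x (minimum potential delay with unit weights); the function name is illustrative.

```python
import math

def alpha_fair_utility(x: float, alpha: float) -> float:
    """x^(1 - alpha) / (1 - alpha), with log(x) used at alpha = 1."""
    if alpha == 1.0:
        return math.log(x)
    return x ** (1.0 - alpha) / (1.0 - alpha)

def total_utility(rates, alpha):
    return sum(alpha_fair_utility(r, alpha) for r in rates)

# With a large alpha the smallest rate dominates the total utility,
# so the even split beats the uneven one despite equal total rate.
print(total_utility((0.5, 0.5), 50.0) > total_utility((0.25, 0.75), 50.0))
```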

Solving the network problem

  Results so far: existence - solution exists with given properties
  How to compute solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

$\frac{d}{dt} x_s(t) = k\left(w_s - x_s(t) \sum_{l \in L(s)} p_l(t)\right)$

  left-hand side: change in bandwidth allocation at s
  $k\,w_s$: linear increase
  $k\,x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease

where $p_l(t) = g_l\!\left(\sum_{s \in S(l)} x_s(t)\right)$ is the congestion "signal": a function of the aggregate rate at link l, fed back to s

Solving the network problem

$\frac{d}{dt} x_s(t) = k\left(w_s - x_s(t) \sum_{l \in L(s)} p_l(t)\right)$

  Results
    converges to solution of "relaxation" of network problem
    $x_s(t) \sum_l p_l(t)$ converges to $w_s$

  Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
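A small Euler-step simulation illustrates the convergence claim on the two-link example (source 0 uses both links, sources 1 and 2 use one link each). The price function g_l(y) = (y / c_l)^beta, the gain k, the step size, and the initial rates are all assumptions chosen for this sketch; the slides leave g_l unspecified.

```python
def simulate_primal(w, routes, capacity, k=0.1, dt=0.01, steps=20000, beta=10.0):
    """Integrate dx_s/dt = k (w_s - x_s * sum of p_l over the links of s)."""
    x = [0.1] * len(w)
    price = [0.0] * len(capacity)
    for _ in range(steps):
        # Aggregate rate on each link.
        load = [0.0] * len(capacity)
        for s, links in enumerate(routes):
            for l in links:
                load[l] += x[s]
        # Congestion signal per link (assumed form of g_l).
        price = [(load[l] / capacity[l]) ** beta for l in range(len(capacity))]
        # Rate update at each source.
        for s, links in enumerate(routes):
            path_price = sum(price[l] for l in links)
            x[s] += dt * k * (w[s] - x[s] * path_price)
    return x, price

routes = [[0, 1], [0], [1]]
rates, prices = simulate_primal([1.0, 1.0, 1.0], routes, [1.0, 1.0])
print(rates, prices)
```

At the fixed point each source satisfies x_s times its path price equal to w_s, and the two-link source ends up with half the rate of the single-link sources. Because the price function is a soft penalty, link loads settle slightly above capacity, which is the "relaxation" the slide refers to.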

Source Algorithm

$\dot{x}_r = k_r(x_r)\left(U_r'(x_r) - q_r\right)$

  Source needs only its path price
  $k_r(\cdot)$: nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation?

  Suppose network charges price q_r ($/bit), where q_r = Σ p_l

  User's strategy: spend w_r ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

  $x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$

  then

  $U(x) = -\frac{1}{T x}$

  interpretation: minimize (weighted) delay

Is AIMD special?

  Consider a window control as follows
    cwnd += a·cwnd^n when no loss
    cwnd -= b·cwnd^m when loss
    where n < m

  Expected change in congestion window

  Expected change in rate per unit time
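The slide's own expressions did not survive extraction, so the sketch below uses one natural way to write the expectation: if each ACK independently signals loss with probability p, the expected per-ACK change is (1 - p)·a·w^n - p·b·w^m, which vanishes at w^(m-n) = a(1 - p)/(b·p). Both the independence assumption and the function names are introduced here.

```python
def expected_change(w: float, p: float, a: float, b: float, n: float, m: float) -> float:
    """Expected per-ACK change in the window under independent loss probability p."""
    return (1.0 - p) * a * w ** n - p * b * w ** m

def equilibrium_window(p: float, a: float, b: float, n: float, m: float) -> float:
    """Window at which the expected change is zero (requires n < m)."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))

# AIMD as a special case: +1/w per ACK (n = -1), halve on loss (b = 1/2, m = 1).
w_star = equilibrium_window(p=0.01, a=1.0, b=0.5, n=-1.0, m=1.0)
print(w_star, expected_change(w_star, 0.01, 1.0, 0.5, -1.0, 1.0))
```

For the AIMD parameters this gives w = sqrt(2(1 - p)/p), the same window that appears in the simplified TCP-Reno rate once divided by T.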

MIMD (n, m)

  Consider the controller

  where

  Then at equilibrium

  where α = m - n. For stability

Motivation

Congestion Control: maximize user utility
  Given routing $R_{li}$, how to adapt end rate $x_i$?

Traffic Engineering: minimize network congestion
  Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

$\max \sum_i U_i(x_i)$ (aggregate utility) s.t. $\sum_i R_{li} x_i \le c_l$ (capacity constraints), var. $x$

  Users are indexed by i
  Source rate: $x_i$
  Utility: $U_i(x_i)$

Congestion control provides fair rate allocation amongst users

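Traffic Engineering Model

$\min \sum_l f(u_l)$ (aggregate cost) s.t. $u_l = \sum_i R_{li} x_i / c_l$, var. $R$

  Links are indexed by l
  Link utilization: $u_l$
  Cost: $f(u_l)$

[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]

Traffic engineering avoids bottlenecks in the network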

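Model of Internet Reality

Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$

Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$

[Figure: feedback loop between the two: congestion control passes rates $x_i$ to traffic engineering, which passes routing $R_{li}$ back]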

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark:

  $\max \sum_i U_i(x_i)$ s.t. $Rx \le c$, var. $x$, $R$

  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

TCP throughput: A very simple model

  But what is W when loss occurs?

  When window is w and queue has q packets, TCP is sending at rate w/(T + q/C)
  For maintaining utilization and steady state
    Just before loss, rate = W/(T + Q/C) = C
    Just after loss, rate = W/(2T) = C
  For Q = CT (a common rule of thumb to set router buffer sizes), W = 2CT = 2Q, so a loss occurs every (3/4 W)·Q = 3W²/8 packets

Q = queue capacity in number of packets

C = link capacity in packets/sec

Deriving TCP throughput/loss relationship

[Figure: TCP window size vs. time (rtt): sawtooth between W/2 and W; one "period" spans one tooth]

packets sent per "period" =

$\frac{W}{2} + \left(\frac{W}{2}+1\right) + \dots + W = \sum_{n=0}^{W/2}\left(\frac{W}{2}+n\right)$

$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$

$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \frac{\frac{W}{2}\left(\frac{W}{2}+1\right)}{2}$

$= \frac{3}{8}W^2 + \frac{3}{4}W$

$\approx \frac{3}{8}W^2$
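The sum can be checked directly by adding up the window sizes over one period, assuming W is even so that W/2 is an integer. The function names are illustrative.

```python
def packets_per_period(w: int) -> int:
    """Sum of window sizes W/2, W/2 + 1, ..., W over one sawtooth period (W even)."""
    half = w // 2
    return sum(half + n for n in range(half + 1))

def closed_form(w: int) -> float:
    """(3/8) W^2 + (3/4) W."""
    return 3 * w * w / 8 + 3 * w / 4

for w in (16, 100, 1000):
    exact = packets_per_period(w)
    print(w, exact, closed_form(w), 3 * w * w / 8)
```

The exact sum matches the closed form, and the relative weight of the (3/4) W term shrinks as W grows, which is why the derivation keeps only (3/8) W².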

1 packet lost per ldquoperiodrdquo implies ploss 23

8W

asymp or lossp

W38

=

rttpackets

43utavg_thrup WB ==

rttpackets221utavg_thrup

losspB ==

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]

  With many flows loss rate and delay are not affected much by a single TCP flow  TCP behavior completely specified by loss

and delay pattern along path (bounded by bottleneck capacity)

  Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account

What is PFTK modeling

  Independent loss probability p across rounds  Loss acute triple duplicate acks  Bursty loss in a round if some packet lost

all following packets in that round also lost   Timeout if lt three duplicate acks received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP

  Evolution of loss-based TCP  Tahoe (without fast retransmit)  Reno (triple duplicate acks + fast

retransmit)  NewReno (Reno + handling multiple losses

better)  SACK (selective acknowledgment) common

today   Q what if loss not due to congestion

Delay-based TCP Vegas

  Uses delay as a signal of congestion  Idea try to keep a small constant number of

packets at bottleneck queue  Expected = WBaseRTT  Actual = WCurRTT  Diff = Expected - Actual  Try to keep Diff between fixed 1 and 3

  More recent FAST TCP based on Vegas  Delay-based TCP not widely used today

TCP-Friendliness

  Can we try MyFavNew TCP  Well is it TCP-friendly

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP

  To co-exist with TCP it must impose the same long-term load on the network  No greater long-term throughput as a function of

packet loss and delay so TCP doesnt suffer  Not significantly less long-term throughput or its

not too useful

TCP friendly rate control (TFRC)

Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm

 If transmission rate is higher than that given by the model reduce the transmission rate to the models rate

 Otherwise increase the transmission rate  Eg DCCP (Datagram Congestion Control

Protocol) for unreliable congestion control  Q how to measureuse loss rate and RTT

High speed TCP

TCP in high speed networks

  Example 1500 byte segments 100ms RTT want 10 Gbps throughput

  Requires window size W = 83333 in-flight segments   Throughput in terms of loss rate

  13 p = 210-10 or equivalently at most one drop every couple hours

  New versions of TCP for high-speed networks needed

TCPrsquos long recovery delay

  More than an hour to recover from a loss or timeout

~41000 packets

~60000 RTTs ~100 minutes

High-speed TCP

  Proposals  Scalable TCP HSTCP FAST CUBIC  General idea is to use superlinear window

increase  Particularly useful in high bandwidth-delay

product regimes

Alternate choices of response functions

Scalable TCP - S = 015p

Q Whatever happened to TCP-friendly

High speed TCP [Floyd]

  additive increase multiplicative decrease

  increments decrements depend on window size

Scalable TCP (STCP) [T Kelly]

  multiplicative increase multiplicative decrease

W larr W + a per ACK W larr W ndash b W per window with loss

STCP dynamics

From 1st PFLDnet Workshop Tom Kelly13

Active Queue Management

Router Queue Management

  normally packets dropped only when queue overflows   ldquodrop-tailrdquo queueing

router Internet

P113P213P313P413P513P613FCFS13

Scheduler13

router

The case against drop-tail queue management

  Large queues in routers are ldquoa bad thingrdquo  Delay end-to-end latency dominated by length

of queues at switches in network   Allowing queues to overflow is ldquoa bad thingrdquo

 Fairness connections transmitting at high rates can starve connections transmitting at low rates

 Utilization connections can synchronize their response to congestion

P113P213P313P413FCFS

Scheduler P513P613

Idea early random packet drop

When queue length exceeds threshold drop packets with queue length dependent probability  probabilistic packet drop flows see same loss

rate  problem bursty traffic (burst arrives when

queue is near threshold) can be over penalized

P113P213P313P413P513P613FCFS

Scheduler

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop  avoid overly penalizing short-term bursts   react to longer term trends

  Tie drop prob to weighted avg queue length  avoids over-reaction to mild overload conditions

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

Random early detection (RED) packet drop

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

10013

Drop probability

maxp13

Weighted AverageQueue Length

min13 max13

RED summary why random drop

  Provide gentle transition from no-drop to all-drop  Provide ldquogentlerdquo early warning  Avoid synchronized loss bursts among

sources   Provide same loss rate to all sessions

 With tail-drop low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed hellip

Why randomization important

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source Floyd Jacobson 1994

Router update operation

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive at dest)

start_timer (uniform Tp +- Tr)

timeout or link fail

update

time spent in state depends on msgs

received from others (weak coupling

between routers processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization   Choose random

timer component Tr large (eg several multiples of TC)

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough

randomization to avoid

synchronization

Randomization

  Takeaway message  randomization makes a system simple and

robust

Background transport TCP Nice

What are background transfers

  Data that humans are not waiting for   Non-deadline-critical   Unlimited demand

  Examples  Prefetched traffic on the Web  File system backup  Large-scale data distribution services  Background software updates  Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers  Self-interference

bull  applications hurt their own performance  Cross-interference

bull  applications hurt other applicationsrsquo performance

TCP Nice

  Goal abstraction of free infinite bandwidth   Applications say what they want

 OS manages resources and scheduling

  Self tuning transport layer  Reduces risk of interference with foreground

traffic  Significant utilization of spare capacity by

background traffic  Simplifies application design

Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal  Congestion incr queue lengths incr RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control  Receiver and network unchanged  TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

Nice the works

  Non-interference getting out of the way in time   Utilization maintaining a small queue

pkts

minRTT = τ13 maxRTT = τ+Βmicro13

B

tB Add Mul +

micro

Reno

Nice Add Add Add

Mul +

Mul +

Network Conditions

01

1

10

100

1e3

1 10 100 Fore

grou

nd D

ocum

ent L

aten

cy (s

ec)

Spare Capacity

Reno

Vegas

V0

Nice

Router Prio

  Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note all sources ldquoequalrdquo)   constraint bandwidth used less than capacity   centralized solution to optimization impractical

 must know all utility functions   impractical for large number of sources  can we view congestion control as distributed

asynchronous algorithms to solve this problem

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0 ldquosystemrdquo problem

The user view

  User can choose amount to pay per unit time ws

  Would like allocated bandwidth xs in proportion to ws

euro

max Usw s

ps

⎝ ⎜

⎠ ⎟ minus ws

subject to ws ge 0

  ps could be viewed as charge per unit flow for user s s

ss pwx =

userrsquos utility cost

user problem

The network view

  Suppose network knows vector ws chosen by users   Network wants to maximize logarithmic utility function

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

network problem

Solution existence

  There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that   Ws solves user

problem   Xs solves the

network problem   Xs is the unique

solution to the system problem

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

0 wsubject to

w Umax

s

ss

ge

minus⎟⎟⎠

⎞⎜⎜⎝

⎛s

s

wp

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0

Proportional Fairness

  Vector of rates xs proportionally fair if feasible and for any other feasible vector xs

0

leminus

sumisinSs s

ss

xxx

  Result if wr=1 then Xs solves the network problem IFF it is proportionally fair

  Similar result exists for the case that wr not equal 1

Max-min Fairness

Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

Minimum potential delay fairness

  Rates xr are minimum potential delay fair if Ur (xr) = -wrxr

Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays

Max-min Fairness

rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

What is corresponding utility function

α

α

α minus=

minus

infinrarr 1lim)(

1r

rrxxU

Solving the network problem   Results so far existence - solution exists

with given properties   How to compute solution

 Ideally distributed solution easily embodied in protocol

 Should reveal insight into existing protocol

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

congestion ldquosignalrdquo function of aggregate rate at link l fed back to s

change in bandwidth

allocation at s

linear increase

multiplicative decrease

⎟⎟⎠

⎞⎜⎜⎝

⎛= sum

isin

)()()(txgtp

sLlsllwhere

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

  Results   converges to solution of relaxation of network

problem  xs(t)Σpl(t) converges to ws

  Interpretation TCP-like algorithm to iteratively solves optimal rate allocation

Source Algorithm

  Source needs only its path price

  kr() nonnegative nondecreasing function   Above algorithm converges to unique

solution for any initial condition   qr interpreted as lossmarking probability euro

˙ x r = kr (xr )(Ur (xr ) minus qr)

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation

  Suppose network charges price qr ($bit) where qr=sum pl

  Userrsquos strategy spend wr ($sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

  then

  interpretation minimize (weighted) delay

pTpTp

x 2)1(2asymp

minus=

TxxU 1)( minus=

Is AIMD special

  Consider a window control as follows  cwnd += acwnd^n when no loss  cwnd -= bcwnd^m when loss  where nltm

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (nm)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

Congestion Control maximize user

utility

Traffic Engineering minimize network

congestion Given routing Rli how to adapt end rate xi

Given traffic xi how to perform routing Rli

Congestion Control Model

max sum i Ui(xi) st sumi Rlixi le cl var x

aggregate utility

Source rate xi

Utility Ui(xi)

capacity constraints

Users are indexed by i

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

min suml f(ul) st ul =sumi Rlixicl var R Link Utilization ul

Cost f(ul)

aggregate cost

Links are indexed by l

Traffic engineering avoids bottlenecks in the network

ul = 1

Model of Internet Reality

xi Rli

Congestion Control max sumi Ui(xi) st sumi Rlixi le cl

Traffic Engineering min suml f(ul)

st ul =sumi Rlixicl

System Properties

  Convergence   Does it achieve some objective   Benchmark

  Utility gap between the joint system and benchmark

max sumi Ui(xi) st Rx le c Var x R

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

sum=

+=++⎟⎠

⎞⎜⎝

⎛ ++2

0)

2(1

22

W

nnWWWW

sum=

+⎟⎠

⎞⎜⎝

⎛ +=2

021

2

W

nnWW

2)12(2

21

2+

+⎟⎠

⎞⎜⎝

⎛ +=WWWW

WW43

83 2 +=

packets sent per ldquoperiodrdquo =

2

83Wasymp

Deriving TCP throughputloss relationship

TCP window

size

time (rtt)

W2

W

period

packets sent per ldquoperiodrdquo 2

83Wasymp

1 packet lost per ldquoperiodrdquo implies ploss 23

8W

asymp or lossp

W38

=

rttpackets

43utavg_thrup WB ==

rttpackets221utavg_thrup

losspB ==

Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]

  With many flows loss rate and delay are not affected much by a single TCP flow  TCP behavior completely specified by loss

and delay pattern along path (bounded by bottleneck capacity)

  Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account

What is PFTK modeling

  Independent loss probability p across rounds  Loss acute triple duplicate acks  Bursty loss in a round if some packet lost

all following packets in that round also lost   Timeout if lt three duplicate acks received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP

- Evolution of loss-based TCP
  - Tahoe (without fast retransmit)
  - Reno (triple duplicate ACKs + fast retransmit)
  - NewReno (Reno + better handling of multiple losses)
  - SACK (selective acknowledgment), common today
- Q: what if loss is not due to congestion?

Delay-based TCP: Vegas

- Uses delay as a signal of congestion
  - Idea: try to keep a small constant number of packets at the bottleneck queue
  - Expected = W/BaseRTT
  - Actual = W/CurRTT
  - Diff = Expected - Actual
  - Try to keep Diff between fixed thresholds 1 and 3
- More recent: FAST TCP, based on Vegas
  - Delay-based TCP not widely used today

TCP-Friendliness

- Can we try MyFavNew TCP?
  - Well, is it TCP-friendly?
- Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
- To coexist with TCP, it must impose the same long-term load on the network
  - No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
  - Not significantly less long-term throughput, or it's not too useful

TCP friendly rate control (TFRC)

Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm

- If transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
- Otherwise increase the transmission rate
- E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
- Q: how to measure/use loss rate and RTT?
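The TFRC rule above can be sketched as a single rate-adjustment step. This sketch assumes the simple square-root throughput model as the TCP model, a fixed additive increase step, and an increase capped at the model rate; none of these choices come from the slides, and the names are hypothetical.

```python
import math


def tcp_model_rate(p, rtt):
    """Model TCP throughput in packets/sec (square-root model, an assumption)."""
    return math.sqrt(1.5 / p) / rtt


def tfrc_adjust(current_rate, p, rtt, increase_step=1.0):
    """One TFRC-style adjustment of the sending rate (packets/sec)."""
    target = tcp_model_rate(p, rtt)
    if current_rate > target:
        return target                                  # clamp down to the model's rate
    return min(current_rate + increase_step, target)   # otherwise increase, never past it
```

The measured loss rate p and RTT would normally be smoothed estimates; how to obtain them is the open question on the slide.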

High speed TCP

TCP in high speed networks

- Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
- Requires window size W = 83333 in-flight segments
- Throughput in terms of loss rate:
  - requires $p = 2 \cdot 10^{-10}$, or equivalently at most one drop every couple of hours
- New versions of TCP for high-speed networks needed
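The numbers in this example follow from the bandwidth-delay product and the square-root throughput relation. A short sketch of the arithmetic, treating W as the average window (an assumption), with illustrative function names:

```python
def required_window(rate_bps, rtt_s, segment_bytes):
    """Segments that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(window_pkts):
    """Invert W = sqrt(3/2) / sqrt(p) for the tolerable loss rate p."""
    return 1.5 / window_pkts ** 2


w = required_window(10e9, 0.1, 1500)   # about 83333 segments
p = required_loss_rate(w)              # about 2e-10
pkts_per_sec = w / 0.1
seconds_between_drops = 1.0 / (p * pkts_per_sec)
```

Under these assumptions one drop is tolerated roughly every hour and a half, consistent with "every couple of hours".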

TCP's long recovery delay

- More than an hour to recover from a loss or timeout

[Figure: window recovery after a loss; labels: ~41000 packets, ~60000 RTTs, ~100 minutes.]

High-speed TCP

- Proposals
  - Scalable TCP, HSTCP, FAST, CUBIC
  - General idea is to use superlinear window increase
  - Particularly useful in high bandwidth-delay product regimes

Alternate choices of response functions

Scalable TCP: S = 0.15/p

Q: Whatever happened to TCP-friendly?

High speed TCP [Floyd]

- additive increase, multiplicative decrease
- increments, decrements depend on window size

Scalable TCP (STCP) [T. Kelly]

- multiplicative increase, multiplicative decrease

W ← W + a per ACK
W ← W - b·W per window with loss
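The STCP update rule can be sketched as below. The constants a = 0.01 and b = 0.125 are the values usually quoted for Scalable TCP and are assumptions here, as is the simplification of one ACK per packet (so the window grows by a factor of 1 + a per RTT).

```python
import math


def stcp_on_ack(w, a=0.01):
    """Per-ACK increase: W <- W + a."""
    return w + a


def stcp_on_loss(w, b=0.125):
    """Per-loss-window decrease: W <- W - b * W."""
    return w - b * w


def stcp_recovery_rtts(a=0.01, b=0.125):
    """RTTs needed to regain the pre-loss window, assuming W ACKs per RTT."""
    return math.log(1.0 / (1.0 - b)) / math.log(1.0 + a)
```

The recovery time comes out to about 13 RTTs regardless of the window size, which is the sense in which the increase is "scalable".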

STCP dynamics

[Figure: STCP window dynamics. From 1st PFLDnet Workshop, Tom Kelly.]

Active Queue Management

Router Queue Management

- normally, packets dropped only when queue overflows
  - "drop-tail" queueing

[Figure: packets P1 to P6 queued at a router's FCFS scheduler on the path to the Internet.]

The case against drop-tail queue management

- Large queues in routers are "a bad thing"
  - Delay: end-to-end latency dominated by length of queues at switches in network
- Allowing queues to overflow is "a bad thing"
  - Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  - Utilization: connections can synchronize their response to congestion

[Figure: packets P1 to P4 queued at the FCFS scheduler; P5 and P6 arriving.]

Idea: early random packet drop

When queue length exceeds threshold, drop packets with queue-length-dependent probability

- probabilistic packet drop: flows see same loss rate
- problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

[Figure: packets P1 to P6 queued at the FCFS scheduler.]

Random early detection (RED) packet drop

- Use exponential average of queue length to determine when to drop
  - avoids overly penalizing short-term bursts
  - reacts to longer-term trends
- Tie drop probability to weighted average queue length
  - avoids over-reaction to mild overload conditions

[Figure: average queue length vs. time, below the max queue length, with three regions: no drop (below min threshold), probabilistic early drop (between min and max thresholds), forced drop (above max threshold).]

[Figure: drop probability vs. weighted average queue length: zero below min, rising to max_p at max, then 100%.]
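The RED drop curve described above can be sketched as follows. This is a simplified illustration: the EWMA weight, the class name, and the linear ramp between the thresholds are assumptions, and refinements of real RED (such as spacing drops by the count since the last drop) are left out.

```python
import random


class RedQueue:
    """Minimal RED drop decision based on a weighted average queue length."""

    def __init__(self, min_th, max_th, max_p, weight=0.002):
        self.min_th = min_th
        self.max_th = max_th
        self.max_p = max_p
        self.weight = weight
        self.avg = 0.0

    def update_avg(self, qlen):
        """Exponentially weighted moving average of the instantaneous queue."""
        self.avg = (1.0 - self.weight) * self.avg + self.weight * qlen
        return self.avg

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0                      # no drop
        if self.avg >= self.max_th:
            return 1.0                      # forced drop
        # probabilistic early drop: ramp from 0 to max_p
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def should_drop(self, qlen, rng=random):
        self.update_avg(qlen)
        return rng.random() < self.drop_probability()
```

A small weight makes the average slow to react, so a short burst arriving at a nearly empty queue is not penalized.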

RED summary: why random drop?

- Provide gentle transition from no-drop to all-drop
  - Provide "gentle" early warning
- Avoid synchronized loss bursts among sources
- Provide same loss rate to all sessions
  - With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

- Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
- Gains over drop-tail FCFS not that significant
- Still not widely deployed ...

Why randomization is important

- Synchronization of periodic routing updates
- Periodic losses observed in end-end Internet traffic

Source: Floyd, Jacobson 1994

Router update operation

[Figure: state diagram of a router's periodic update process, with these states and transitions:]

- prepare own routing update (time T_C)
- receive update from neighbor, process (time T_C2)
- wait; leave on timeout or link failure
- <ready>: send update (time T_d to arrive at destination), then start_timer (uniform in T_p ± T_r)
- time spent in a state depends on messages received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization

- Choose random timer component T_r large (e.g., several multiples of T_C)
- Add enough randomization to avoid synchronization

[Figure: same state diagram: prepare own routing update (time T_C); receive update from neighbor, process (time T_C2); wait; <ready>: send update (time T_d to arrive), then start_timer (uniform in T_p ± T_r).]

Randomization

- Takeaway message:
  - randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

- Data that humans are not waiting for
- Non-deadline-critical
- Unlimited demand
- Examples
  - Prefetched traffic on the Web
  - File system backup
  - Large-scale data distribution services
  - Background software updates
  - Media file sharing

Desired Properties

- Utilization of spare network capacity
- No interference with regular transfers
  - Self-interference: applications hurt their own performance
  - Cross-interference: applications hurt other applications' performance

TCP Nice

- Goal: abstraction of free infinite bandwidth
  - Applications say what they want
  - OS manages resources and scheduling
- Self-tuning transport layer
  - Reduces risk of interference with foreground traffic
  - Significant utilization of spare capacity by background traffic
  - Simplifies application design

Why change TCP?

- TCP does network resource management
  - Need flow prioritization
- Alternative: router prioritization
  - (+) More responsive, simple one-bit priority
  - (-) Hard to deploy
- Question:
  - Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

- Proactively detects congestion
- Uses increasing RTT as congestion signal
  - Congestion → increased queue lengths → increased RTT
- Aggressive responsiveness to congestion
- Only modifies sender-side congestion control
  - Receiver and network unchanged
  - TCP friendly

TCP Nice

- Basic algorithm
  1. Early detection: threshold on queue length, seen as an increase in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)
- per-ack operation:
  - if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
- per-round operation:
  - if (numCong > f * W) W ← W/2
  - else ... AIMD congestion control
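The per-ACK and per-round operations above can be sketched as a small sender-side class. The default values for threshold and f, the way minRTT and maxRTT are tracked, and the additive increase of one packet per round are illustrative assumptions rather than details taken from the slides.

```python
class NiceSender:
    """Sketch of TCP Nice's early-congestion detection on top of AIMD."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=10.0):
        self.threshold = threshold   # fraction of the (maxRTT - minRTT) range
        self.fraction = fraction     # f: fraction of delayed ACKs that triggers backoff
        self.cwnd = cwnd             # W, allowed to fall below 1.0
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ACK: count ACKs whose RTT exceeds the early-congestion limit."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self, loss=False):
        """Per-round: halve on early congestion or loss, else additive increase."""
        if loss or self.num_cong > self.fraction * self.cwnd:
            self.cwnd = self.cwnd / 2.0
        else:
            self.cwnd += 1.0
        self.num_cong = 0
```

Because the window is halved with no lower bound of one packet, many background flows can share a link while still backing off.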

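Nice: the works

- Non-interference: getting out of the way in time
- Utilization: maintaining a small queue

[Figure: queue occupancy (pkts) vs. time t for a buffer of size B served at rate μ, with minRTT = τ and maxRTT = τ + B/μ. Reno alternates additive increase (Add) and multiplicative decrease (Mul); Nice shows several Add phases followed by repeated Mul phases.]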

Network Conditions

[Figure: foreground document latency (sec, log scale) vs. spare capacity (log scale); curves for Reno, Vegas, V0, Nice, and Router Prio.]

- Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale) vs. number of background flows (log scale); curves for Vegas, V0, Nice, Router Prio, and Reno.]

- W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: background throughput (KB) vs. number of background flows (log scale); curves for Router Prio, Vegas, V0, Reno, and Nice.]

- Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

- Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
- How to model the interaction between TCP and the network?
  - Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem:

- How to allocate resources (e.g., bandwidth) to optimize some objective function
- Maybe not possible to obtain exact optimality, but
  - optimization framework as means to explicitly steer network towards desirable operating point
  - practical congestion control as distributed asynchronous implementations of optimization algorithm
- systematic approach towards protocol design

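Model

- Network: links l, each of capacity $c_l$
- Sources s: $(L(s), U_s(x_s))$
  - $L(s)$: links used by source s
  - $U_s(x_s)$: utility if source rate = $x_s$

[Figure: two links in series with capacities $c_1$ and $c_2$; source 1 (rate $x_1$) uses both links, source 2 (rate $x_2$) uses link 1, source 3 (rate $x_3$) uses link 2.]

$$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$$

[Figure: $U_s(x_s)$ vs. $x_s$; example utility function for elastic application.]

Q: What are possible allocations with, say, unit capacity links?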

Optimization Problem

"System" problem, with $S(l)$ the set of sources using link l:

$$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \quad \forall l \in L$$

- maximize system utility (note: all sources "equal")
- constraint: bandwidth used less than capacity
- centralized solution to optimization impractical
  - must know all utility functions
  - impractical for large number of sources
  - can we view congestion control as distributed asynchronous algorithms to solve this problem?

The user view

- User can choose amount to pay per unit time, $w_s$
- Would like allocated bandwidth $x_s$ in proportion to $w_s$:

$$x_s = \frac{w_s}{p_s}$$

- $p_s$ could be viewed as charge per unit flow for user s

User problem (first term: user's utility; second term: cost):

$$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

The network view

- Suppose network knows vector $\{w_s\}$ chosen by users
- Network wants to maximize logarithmic utility function

Network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

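Solution existence

- There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
  - $\{w_s\}$ solves the user problem
  - $\{x_s\}$ solves the network problem
  - $\{x_s\}$ is the unique solution to the system problem

User problem:

$$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

Network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

System problem:

$$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \quad \forall l \in L$$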

Proportional Fairness

- Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$:

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

- Result: if $w_s = 1$ for all s, then $\{x_s\}$ solves the network problem IFF it is proportionally fair
- Similar result exists for the case that $w_s$ is not equal to 1
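As a worked example, consider the two-link network from the model slide (one long flow across both links, one short flow per link) with unit capacities and all weights equal to 1. The sketch below finds the proportionally fair rate of the long flow by a simple grid search over the log-utility objective and evaluates the defining inequality; the function names and the grid-search approach are illustrative choices, not the lecture's method.

```python
import math


def total_log_utility(x1, c=1.0):
    """Sum of log rates when each short flow takes the capacity the long flow leaves."""
    x2 = x3 = c - x1
    return math.log(x1) + math.log(x2) + math.log(x3)


def proportionally_fair_x1(c=1.0, steps=3000):
    """Grid search for the long-flow rate that maximizes total log utility."""
    best_x, best_u = None, -math.inf
    for i in range(1, steps):
        x1 = c * i / steps
        u = total_log_utility(x1, c)
        if u > best_u:
            best_x, best_u = x1, u
    return best_x


def pf_gap(x, y):
    """Left-hand side of the proportional fairness test for alternative rates y."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))
```

The search returns about 1/3 for the long flow, so the short flows get about 2/3 each; the equal split (1/2, 1/2, 1/2) gives a gap of zero against this allocation, and strictly interior alternatives give a negative gap.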

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$

Minimum potential delay fairness

- Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r/x_r$

Interpretation: if $w_r$ is the file size, then $w_r/x_r$ is the transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
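The utility above is one member of the family x^(1-α)/(1-α). As a hedged sketch, the function below evaluates that family, using the standard convention that α = 1 is replaced by the logarithm (proportional fairness) and noting that α = 2 gives -w/x (minimum potential delay). The weight w and the function name are assumptions for illustration.

```python
import math


def alpha_fair_utility(x, alpha, w=1.0):
    """Weighted utility w * x^(1 - alpha) / (1 - alpha), with log x at alpha = 1."""
    if alpha == 1.0:
        return w * math.log(x)
    return w * x ** (1.0 - alpha) / (1.0 - alpha)


def total_utility(rates, alpha):
    return sum(alpha_fair_utility(x, alpha) for x in rates)


# Two feasible allocations for the two-link, unit-capacity example.
equal_split = (0.5, 0.5, 0.5)
prop_fair = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)
```

With α = 1 the proportionally fair allocation scores higher; with a large α the smallest rate dominates the sum, so the equal split (the max-min fair allocation here) scores higher.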

Solving the network problem

- Results so far: existence (a solution exists with the given properties)
- How to compute the solution?
  - Ideally: distributed solution easily embodied in protocol
  - Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k\left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

where

$$p_l(t) = g_l\!\left( \sum_{s \in S(l)} x_s(t) \right)$$

- left-hand side: change in bandwidth allocation at s
- $k\,w_s$ term: linear increase
- $k\,x_s(t) \sum_{l \in L(s)} p_l(t)$ term: multiplicative decrease
- $p_l(t)$: congestion "signal", a function of the aggregate rate at link l, fed back to s

Solving the network problem

$$\frac{d}{dt} x_s(t) = k\left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

- Results:
  - converges to solution of relaxation of network problem
  - $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
- Interpretation: TCP-like algorithm to iteratively solve for the optimal rate allocation
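The rate-update equation can be simulated with a forward-Euler step. The sketch below makes several assumptions that are not in the slides: the link price function is g_l(y) = max(0, y - c_l)/eps, the gain k, step size dt, and starting rates are arbitrary, and the topology is the two-link, three-source example with unit capacities.

```python
def link_prices(x, routes, capacity, eps):
    """p_l = g_l(aggregate rate on link l), with an assumed penalty g_l."""
    load = [0.0] * len(capacity)
    for rate, links in zip(x, routes):
        for l in links:
            load[l] += rate
    return [max(0.0, y - c) / eps for y, c in zip(load, capacity)]


def simulate_primal(routes, capacity, w, k=1.0, eps=0.05, dt=0.001, steps=30000):
    """Euler integration of dx_s/dt = k * (w_s - x_s * sum of prices on s's path)."""
    x = [0.1] * len(routes)
    for _ in range(steps):
        p = link_prices(x, routes, capacity, eps)
        x = [
            rate + dt * k * (w_s - rate * sum(p[l] for l in links))
            for rate, w_s, links in zip(x, w, routes)
        ]
    return x, link_prices(x, routes, capacity, eps)


# Source 0 uses both links; sources 1 and 2 use one link each.
routes = [[0, 1], [0], [1]]
x, p = simulate_primal(routes, [1.0, 1.0], [1.0, 1.0, 1.0])
```

At the fixed point each source's rate times its path price equals its weight, and because the penalty function only approximates the hard capacity constraint, the rates land near (but not exactly at) the proportionally fair point (1/3, 2/3, 2/3).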

Source Algorithm

$$\dot{x}_r = k_r(x_r)\,\bigl(U_r'(x_r) - q_r\bigr)$$

- Source needs only its path price $q_r$
- $k_r(\cdot)$: nonnegative, nondecreasing function
- Above algorithm converges to unique solution for any initial condition
- $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If the utility function is $U_r(x_r) = w_r \log x_r$,

then a controller that implements it is given by $\dot{x}_r = k_r(x_r)\left(\frac{w_r}{x_r} - q_r\right)$

Pricing interpretation

- Can the network choose a pricing scheme to achieve fair resource allocation?
- Suppose the network charges price $q_r$ ($/bit), where $q_r = \sum_{l \in r} p_l$
- User's strategy: spend $w_r$ ($/sec) to maximize $U_r\!\left(\frac{w_r}{q_r}\right) - w_r$

Optimal User Strategy

  equivalently

Simplified TCP-Reno

- suppose

$$x = \frac{1}{T}\sqrt{\frac{2(1-p)}{p}} \approx \frac{1}{T}\sqrt{\frac{2}{p}}$$

- then

$$U(x) = -\frac{1}{Tx}$$

- interpretation: minimize (weighted) delay

Is AIMD special?

- Consider a window control as follows:
  - cwnd += a·cwnd^n when no loss
  - cwnd -= b·cwnd^m when loss
  - where n < m
- Expected change in congestion window
- Expected change in rate per unit time
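The generalized window rule can be sketched as a single update function. The sketch assumes the "no loss" update is applied once per ACK; with that reading, TCP-style AIMD corresponds to a = 1, n = -1, b = 0.5, m = 1, and Scalable TCP to n = 0, m = 1. The function name and defaults are illustrative.

```python
def window_update(cwnd, loss, a=1.0, b=0.5, n=-1.0, m=1.0):
    """Apply cwnd += a * cwnd^n on no loss, or cwnd -= b * cwnd^m on loss."""
    if loss:
        return cwnd - b * cwnd ** m
    return cwnd + a * cwnd ** n


# AIMD: +1/cwnd per ACK (about one packet per RTT), halve on loss.
aimd_after_ack = window_update(10.0, loss=False)
aimd_after_loss = window_update(10.0, loss=True)

# Scalable TCP as the (n, m) = (0, 1) member, with assumed a = 0.01, b = 0.125.
stcp_after_ack = window_update(100.0, loss=False, a=0.01, b=0.125, n=0.0, m=1.0)
stcp_after_loss = window_update(100.0, loss=True, a=0.01, b=0.125, n=0.0, m=1.0)
```

Varying n and m while keeping n < m moves between these familiar schemes, which is what makes the question of whether AIMD is special well posed.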

MIMD (n, m)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

- Congestion Control: maximize user utility
  - Given routing $R_{li}$, how to adapt end rate $x_i$?
- Traffic Engineering: minimize network congestion
  - Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

$$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l, \qquad \text{var. } x$$

- objective: aggregate utility, with source rate $x_i$ and utility $U_i(x_i)$
- constraints: capacity constraints
- Users are indexed by i

Congestion control provides fair rate allocation amongst users

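Traffic Engineering Model

$$\min \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \frac{\sum_i R_{li} x_i}{c_l}, \qquad \text{var. } R$$

- objective: aggregate cost, with link utilization $u_l$ and cost $f(u_l)$
- Links are indexed by l

[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked.]

Traffic engineering avoids bottlenecks in the network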

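Model of Internet Reality

[Figure: feedback loop in which congestion control sets rates $x_i$ and traffic engineering sets routing $R_{li}$.]

Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$

Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$

System Properties

- Convergence
- Does it achieve some objective?
- Benchmark:

$$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad Rx \le c, \qquad \text{var. } x, R$$

- Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

- Multipath TCP controller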


Alternate fluid model

  Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p

  In steady state

TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]

  With many flows loss rate and delay are not affected much by a single TCP flow  TCP behavior completely specified by loss

and delay pattern along path (bounded by bottleneck capacity)

  Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account

What is PFTK modeling

  Independent loss probability p across rounds  Loss acute triple duplicate acks  Bursty loss in a round if some packet lost

all following packets in that round also lost   Timeout if lt three duplicate acks received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP

  Evolution of loss-based TCP  Tahoe (without fast retransmit)  Reno (triple duplicate acks + fast

retransmit)  NewReno (Reno + handling multiple losses

better)  SACK (selective acknowledgment) common

today   Q what if loss not due to congestion

Delay-based TCP Vegas

  Uses delay as a signal of congestion  Idea try to keep a small constant number of

packets at bottleneck queue  Expected = WBaseRTT  Actual = WCurRTT  Diff = Expected - Actual  Try to keep Diff between fixed 1 and 3

  More recent FAST TCP based on Vegas  Delay-based TCP not widely used today

TCP-Friendliness

  Can we try MyFavNew TCP?
    Well, is it TCP-friendly?

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP

  To coexist with TCP, it must impose the same long-term load on the network
    No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
    Not significantly less long-term throughput, or it's not too useful

TCP friendly rate control (TFRC)

Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm

  If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
  Otherwise, increase the transmission rate
  E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
  Q: how to measure/use loss rate and RTT?
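A minimal sketch of the rule above, assuming the simple model rate sqrt(3/(2p))/RTT as the target and an increase of one packet per RTT per adjustment. Both choices, and the function names, are illustrative assumptions; the standardized TFRC uses a fuller throughput equation and a smoothed loss estimate.

```python
import math

def tcp_model_rate(p, rtt):
    """Simple TCP throughput model in packets/sec: sqrt(3/(2p)) / rtt."""
    return math.sqrt(3.0 / (2.0 * p)) / rtt

def tfrc_adjust(rate, p, rtt, increase=1.0):
    """Move the sending rate toward the model rate for measured (p, rtt)."""
    target = tcp_model_rate(p, rtt)
    if rate > target:
        return target                          # above the model: cut to the model rate
    return min(rate + increase / rtt, target)  # otherwise increase, capped at the model rate
```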

High speed TCP

TCP in high speed networks

  Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput

  Requires window size W = 83333 in-flight segments
  Throughput in terms of loss rate:
    $p = 2 \times 10^{-10}$, or equivalently at most one drop every couple of hours

  New versions of TCP for high-speed networks needed
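The numbers in this example can be checked with a few lines of arithmetic. The sketch assumes the familiar relation W ≈ 1.22/sqrt(p) between average window and loss rate; the slide's own formula is missing from the transcript, so that relation and the function names are assumptions.

```python
def required_window(rate_bps, rtt_s, segment_bytes):
    """Segments that must be in flight to sustain rate_bps over one RTT."""
    return rate_bps * rtt_s / (8 * segment_bytes)

def required_loss_rate(window, c=1.22):
    """Invert the assumed relation W = c / sqrt(p) to get the tolerable loss rate."""
    return (c / window) ** 2

w = required_window(10e9, 0.100, 1500)          # about 83333 segments
p = required_loss_rate(w)                       # about 2e-10
packets_per_second = w / 0.100
hours_between_drops = (1.0 / p) / packets_per_second / 3600.0
```

Under these assumptions one loss is tolerated roughly every 1.5 hours, which matches the slide's "every couple of hours".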

TCP's long recovery delay

  More than an hour to recover from a loss or timeout

[Figure: window recovery after a loss, annotated ~41000 packets, ~60000 RTTs, ~100 minutes]

High-speed TCP

  Proposals
    Scalable TCP, HSTCP, FAST, CUBIC
    General idea is to use superlinear window increase
    Particularly useful in high bandwidth-delay product regimes

Alternate choices of response functions

Scalable TCP: S = 0.15/p

Q: Whatever happened to TCP-friendly?

High speed TCP [Floyd]

  additive increase, multiplicative decrease

  increments and decrements depend on window size

Scalable TCP (STCP) [T. Kelly]

  multiplicative increase, multiplicative decrease

W ← W + a per ACK
W ← W - bW per window with loss
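The STCP rule can be sketched as follows. The constants a = 0.01 and b = 0.125 are assumed here as illustrative values, and the helper names are hypothetical. Since roughly W ACKs arrive per RTT, adding a per ACK multiplies the window by about (1 + a) each RTT.

```python
def stcp_on_ack(w, a=0.01):
    """Per-ACK increase: W <- W + a."""
    return w + a

def stcp_on_loss(w, b=0.125):
    """Per-loss-window decrease: W <- W - b*W."""
    return w - b * w

def rtts_to_recover(w_before_loss, a=0.01, b=0.125):
    """Count RTTs until the window regains its pre-loss value."""
    w = stcp_on_loss(w_before_loss, b)
    rtts = 0
    while w < w_before_loss:
        w *= (1.0 + a)   # about w ACKs per RTT, each adding a
        rtts += 1
    return rtts
```

Because both the increase and the decrease are proportional to W, the recovery time in RTTs does not depend on the window size, in contrast to the linear recovery of standard TCP.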

STCP dynamics

From 1st PFLDnet Workshop, Tom Kelly

Active Queue Management

Router Queue Management

  normally, packets are dropped only when the queue overflows
    "drop-tail" queueing

[Figure: a router connected to the Internet, with packets P1 to P6 queued at its FCFS scheduler]

The case against drop-tail queue management

  Large queues in routers are "a bad thing"
    Delay: end-to-end latency dominated by length of queues at switches in network
  Allowing queues to overflow is "a bad thing"
    Fairness: connections transmitting at high rates can starve connections transmitting at low rates
    Utilization: connections can synchronize their response to congestion

[Figure: packets P1 to P6 at a router's FCFS scheduler]

Idea: early random packet drop

When queue length exceeds a threshold, drop packets with queue-length-dependent probability
  probabilistic packet drop: flows see same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

[Figure: packets P1 to P6 at a router's FCFS scheduler]

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop
    avoid overly penalizing short-term bursts
    react to longer-term trends

  Tie drop probability to weighted average queue length
    avoids over-reaction to mild overload conditions

[Figure: average queue length over time, with a min threshold and a max threshold separating three regions: no drop, probabilistic early drop, forced drop]

[Figure: drop probability versus weighted average queue length; the probability rises from 0 at the min threshold to max_p at the max threshold, then jumps to 100%]
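A minimal sketch of the RED drop decision described on these slides, assuming a linear ramp from 0 at the min threshold to max_p at the max threshold and an exponentially weighted average with an illustrative weight of 0.002. The class and parameter names are hypothetical, and refinements of full RED (such as spacing drops by a packet count) are omitted.

```python
import random

class RedQueue:
    def __init__(self, min_th, max_th, max_p, weight=0.002, rng=None):
        self.min_th, self.max_th, self.max_p = min_th, max_th, max_p
        self.weight = weight
        self.avg = 0.0                        # weighted average queue length
        self.rng = rng or random.Random()

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0                        # no-drop region
        if self.avg >= self.max_th:
            return 1.0                        # forced-drop region
        # probabilistic early drop: linear in the average queue length
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def on_arrival(self, queue_len):
        """Update the average with the instantaneous length; return True to drop."""
        self.avg = (1.0 - self.weight) * self.avg + self.weight * queue_len
        return self.rng.random() < self.drop_probability()
```

Because the decision uses the slow-moving average and not the instantaneous queue length, a short burst barely changes the drop probability.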

RED summary: why random drop?

  Provide gentle transition from no-drop to all-drop
    Provide "gentle" early warning
  Avoid synchronized loss bursts among sources
  Provide same loss rate to all sessions
    With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed ...

Why is randomization important?

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

Source: Floyd, Jacobson 1994

Router update operation

[State diagram, reconstructed from its labels]
  prepare own routing update (time TC)
    receive update from neighbor: process (time TC2)
  <ready>: send update (time Td to arrive at dest), then start_timer (uniform Tp +- Tr)
  wait
    receive update from neighbor: process
    timeout or link fail: update
  time spent in a state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization

  Choose random timer component Tr large (e.g., several multiples of TC)
  Add enough randomization to avoid synchronization

[State diagram: the same router update operation, with start_timer (uniform Tp +- Tr) after <ready> send update (time Td to arrive)]
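The timer rule in the diagram, start_timer (uniform Tp +- Tr), can be sketched in a few lines. The function name is hypothetical, and the values in the usage note are illustrative, not taken from the slides.

```python
import random

def next_update_delay(tp, tr, rng=random):
    """Draw the next routing-update interval uniformly from [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)
```

For instance, next_update_delay(120.0, 10.0) returns an interval somewhere between 110 and 130 time units; a larger tr spreads the routers' timers further apart.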

Randomization

  Takeaway message
    randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand

  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers
    Self-interference: applications hurt their own performance
    Cross-interference: applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
  Applications say what they want
    OS manages resources and scheduling

  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?

  TCP does network resource management
    Need flow prioritization

  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy

  Question
    Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal
    Congestion leads to increasing queue lengths, which lead to increasing RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice

  Basic algorithm
    1. Early detection: a threshold queue length is detected through the increase in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)

  per-ack operation
    if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++

  per-round operation
    if (numCong > f * W) W ← W/2
    else ... AIMD congestion control
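The per-ack and per-round operations above can be sketched as a small sender-side class. The class name, the defaults for threshold and f, and the +1 window increase standing in for the regular AIMD path are all assumptions made for illustration.

```python
class NiceSender:
    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold      # fraction of the RTT range counted as congestion
        self.fraction = fraction        # f: share of the window that must signal congestion
        self.cwnd = cwnd
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack: track the RTT range and count early-congestion signals."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round: halve the window on early congestion, else grow it."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0            # may fall below 1.0, as the slide allows
        else:
            self.cwnd += 1.0            # placeholder for the regular AIMD path
        self.num_cong = 0
```

The halving happens without any packet loss, which is what lets a background flow back off before foreground traffic sees congestion.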

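Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

[Figure: bottleneck queue length (pkts) over time for Reno and Nice, with buffer size B, threshold t·B, and service rate μ; phases labeled Add (additive increase) and Mul (multiplicative decrease); minRTT = τ, maxRTT = τ + B/μ]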

Network Conditions

[Figure: foreground document latency (sec, log scale) versus spare capacity, for Reno, Vegas, V0, Nice, and Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale) versus number of background flows, for Vegas, V0, Nice, Router Prio, and Reno]

  W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: background throughput (KB) versus number of background flows, for Router Prio, Vegas, V0, Reno, and Nice]

  Nice utilizes 50-80% of spare capacity without stealing any bandwidth from foreground traffic

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?

  How to model the interaction between TCP and the network?
    Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function?
  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
  systematic approach towards protocol design

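Model
  Network: links l, each of capacity $c_l$
  Sources s: $(L(s), U_s(x_s))$
    $L(s)$: links used by source s
    $U_s(x_s)$: utility if source rate = $x_s$

[Figure: two links with capacities $c_1$ and $c_2$; source 1 (rate $x_1$) uses both links, source 2 (rate $x_2$) uses the first, source 3 (rate $x_3$) uses the second]

$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$

[Figure: $U_s(x_s)$ versus $x_s$, example utility function for elastic application]

Q: What are possible allocations with, say, unit capacity links?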

Optimization Problem

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
    can we view congestion control as distributed asynchronous algorithms to solve this problem?

"System" problem:

$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$

The user view

  User can choose amount to pay per unit time, $w_s$

  Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$

  $p_s$ could be viewed as charge per unit flow for user s

User problem:

$\max \ U_s\!\left(\frac{w_s}{p_s}\right) - w_s$ subject to $w_s \ge 0$

  (first term: user's utility; second term: cost)

The network view

  Suppose network knows vector $\{w_s\}$ chosen by users
  Network wants to maximize logarithmic utility function

Network problem:

$\max_{x_s \ge 0} \sum_s w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$

Solution existence

  There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
    $\{w_s\}$ solves the user problem
    $\{x_s\}$ solves the network problem
    $\{x_s\}$ is the unique solution to the system problem

User problem: $\max \ U_s\!\left(\frac{w_s}{p_s}\right) - w_s$ subject to $w_s \ge 0$

Network problem: $\max_{x_s \ge 0} \sum_s w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$

System problem: $\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$

Proportional Fairness

  Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$:

$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$

  Result: if $w_s = 1$, then $\{x_s\}$ solves the network problem if and only if it is proportionally fair

  A similar result exists for the case that $w_s$ is not equal to 1

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$

Minimum potential delay fairness

  Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is the file size, then $w_r / x_r$ is the transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$

What is the corresponding utility function?

$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
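The max-min definition can be made concrete with progressive filling, a standard way to compute a max-min fair allocation that is not taken from the slides. In this sketch, routes[s] lists the link indices used by source s; the function name is hypothetical, and every source is assumed to use at least one link.

```python
def max_min_rates(routes, capacity, tol=1e-12):
    """Raise all unfrozen rates equally; freeze sources on saturated links."""
    n = len(routes)
    links = range(len(capacity))
    rates = [0.0] * n
    frozen = [False] * n
    remaining = list(capacity)
    while not all(frozen):
        # number of still-growing sources crossing each link
        active = [sum(1 for s in range(n) if not frozen[s] and l in routes[s])
                  for l in links]
        # largest equal increment that no link's spare capacity forbids
        inc = min(remaining[l] / active[l] for l in links if active[l] > 0)
        for s in range(n):
            if not frozen[s]:
                rates[s] += inc
        for l in links:
            remaining[l] -= inc * active[l]
            if active[l] > 0 and remaining[l] <= tol:
                for s in range(n):
                    if l in routes[s]:
                        frozen[s] = True   # this source is bottlenecked at link l
    return rates
```

On the three-source, two-link example with unit capacities, max_min_rates([[0, 1], [0], [1]], [1.0, 1.0]) gives every source a rate of 0.5.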

Solving the network problem
  Results so far: existence (a solution exists with the given properties)
  How to compute the solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$

  left-hand side: change in bandwidth allocation at s
  $k\, w_s$: linear increase
  $k\, x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease

where $p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$ is the congestion "signal": a function of the aggregate rate at link l, fed back to s

Solving the network problem

$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$

  Results
    converges to solution of relaxation of network problem
    $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$

  Interpretation: TCP-like algorithm that iteratively solves optimal rate allocation
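A minimal simulation sketch of this rate update on the three-source, two-link example from the model slide. It assumes a penalty-style price g_l(y) = max(0, y - c_l)/eps, which is one possible relaxation and not necessarily the one intended in the lecture; the function name, eps, and the step sizes are illustrative.

```python
def kelly_primal(routes, capacity, w, k=1.0, eps=0.01, dt=1e-3, steps=20_000):
    """Euler-integrate dx_s/dt = k * (w_s - x_s * sum of link prices on s's path)."""
    n = len(routes)
    x = [0.1] * n
    for _ in range(steps):
        # aggregate rate on each link
        load = [0.0] * len(capacity)
        for s in range(n):
            for l in routes[s]:
                load[l] += x[s]
        # assumed price function: proportional to the excess over capacity
        price = [max(0.0, load[l] - capacity[l]) / eps for l in range(len(capacity))]
        x = [x[s] + dt * k * (w[s] - x[s] * sum(price[l] for l in routes[s]))
             for s in range(n)]
    return x

# Source 0 uses both links; sources 1 and 2 use one link each.
rates = kelly_primal([[0, 1], [0], [1]], [1.0, 1.0], [1.0, 1.0, 1.0])
```

With unit weights and unit capacities the rates settle near 1/3 for the two-link source and 2/3 for the single-link sources, the proportionally fair allocation; the small excess over those values comes from the relaxation.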

Source Algorithm

  Source needs only its path price:

$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$

  $k_r(\cdot)$: nonnegative nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by
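The formulas on this slide did not survive extraction. Assuming the weighted logarithmic utility of the network problem and a source algorithm of the form $\dot{x}_r = k_r(x_r)(U_r'(x_r) - q_r)$, one consistent reading is:

```latex
U_r(x_r) = w_r \log x_r
\;\Rightarrow\;
U_r'(x_r) = \frac{w_r}{x_r},
\qquad
\dot{x}_r = k_r(x_r)\left(\frac{w_r}{x_r} - q_r\right)
```

With the choice $k_r(x_r) = k\, x_r$ (an assumption), this becomes $\dot{x}_r = k\,(w_r - x_r q_r)$, the same linear-increase, multiplicative-decrease form as the rate update used to solve the network problem.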

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation

  Suppose network charges price $q_r$ ($/bit), where $q_r = \sum_l p_l$

  User's strategy: spend $w_r$ ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose $x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$

  then $U(x) = -\frac{1}{Tx}$

  interpretation: minimize (weighted) delay

Is AIMD special?

  Consider a window control as follows
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (n, m)

  Consider the controller

  where

  Then at equilibrium

  where $\alpha = m - n$. For stability

Motivation

  Congestion control: maximize user utility
    Given routing $R_{li}$, how to adapt end rate $x_i$?

  Traffic engineering: minimize network congestion
    Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

$\max_x \sum_i U_i(x_i)$ subject to $\sum_i R_{li} x_i \le c_l$ (variable: $x$)

  Objective: aggregate utility; constraints: capacity constraints
  Source rate: $x_i$
  Utility: $U_i(x_i)$
  Users are indexed by $i$

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

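$\min_R \sum_l f(u_l)$ subject to $u_l = \sum_i R_{li} x_i / c_l$ (variable: routing $R$)

  Link utilization: $u_l$
  Cost: $f(u_l)$; the objective is the aggregate cost
  Links are indexed by $l$

Traffic engineering avoids bottlenecks in the network

[Figure: cost $f(u_l)$ versus link utilization $u_l$, with $u_l = 1$ marked]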

Model of Internet Reality

[Figure labels: $x_i$, $R_{li}$]

Congestion control: $\max_x \sum_i U_i(x_i)$ subject to $\sum_i R_{li} x_i \le c_l$

Traffic engineering: $\min_R \sum_l f(u_l)$ subject to $u_l = \sum_i R_{li} x_i / c_l$

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark
  Utility gap between the joint system and benchmark

$\max_{x, R} \sum_i U_i(x_i)$ subject to $Rx \le c$ (variables: $x$, $R$)

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller


What is PFTK modeling

  Independent loss probability p across rounds  Loss acute triple duplicate acks  Bursty loss in a round if some packet lost

all following packets in that round also lost   Timeout if lt three duplicate acks received

PFTK empirical validation Low loss

PFTK empirical validation High loss

Loss-based TCP

  Evolution of loss-based TCP  Tahoe (without fast retransmit)  Reno (triple duplicate acks + fast

retransmit)  NewReno (Reno + handling multiple losses

better)  SACK (selective acknowledgment) common

today   Q what if loss not due to congestion

Delay-based TCP Vegas

  Uses delay as a signal of congestion  Idea try to keep a small constant number of

packets at bottleneck queue  Expected = WBaseRTT  Actual = WCurRTT  Diff = Expected - Actual  Try to keep Diff between fixed 1 and 3

  More recent FAST TCP based on Vegas  Delay-based TCP not widely used today

TCP-Friendliness

  Can we try MyFavNew TCP  Well is it TCP-friendly

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP

  To co-exist with TCP it must impose the same long-term load on the network  No greater long-term throughput as a function of

packet loss and delay so TCP doesnt suffer  Not significantly less long-term throughput or its

not too useful

TCP friendly rate control (TFRC)

Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm

 If transmission rate is higher than that given by the model reduce the transmission rate to the models rate

 Otherwise increase the transmission rate  Eg DCCP (Datagram Congestion Control

Protocol) for unreliable congestion control  Q how to measureuse loss rate and RTT

High speed TCP

TCP in high speed networks

  Example 1500 byte segments 100ms RTT want 10 Gbps throughput

  Requires window size W = 83333 in-flight segments   Throughput in terms of loss rate

  13 p = 210-10 or equivalently at most one drop every couple hours

  New versions of TCP for high-speed networks needed

TCPrsquos long recovery delay

  More than an hour to recover from a loss or timeout

~41000 packets

~60000 RTTs ~100 minutes

High-speed TCP

  Proposals
    Scalable TCP, HSTCP, FAST, CUBIC
  General idea is to use superlinear window increase
  Particularly useful in high bandwidth-delay product regimes

Alternate choices of response functions

[Figure label: Scalable TCP, S = 0.15/p]

Q: Whatever happened to TCP-friendly?

High speed TCP [Floyd]

  additive increase, multiplicative decrease
  increments and decrements depend on window size

Scalable TCP (STCP) [T. Kelly]

  multiplicative increase, multiplicative decrease
    W ← W + a per ACK
    W ← W − b·W per window with loss

STCP dynamics

[Figure: STCP dynamics. From 1st PFLDnet Workshop, Tom Kelly]
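
A small sketch can show why the STCP rule is called "scalable": the number of RTTs needed to climb back after a loss does not depend on the window size. The values a = 0.01 and b = 0.125 are the ones usually quoted for Scalable TCP and are treated as assumptions here; the per-RTT approximation (W ACKs per RTT, each adding a) and the function names are also illustrative.

```python
import math

def stcp_recovery_rtts(w_before_loss, a=0.01, b=0.125):
    """RTTs for STCP to regain its pre-loss window after one loss."""
    w = (1.0 - b) * w_before_loss   # W <- W - b*W on loss
    rtts = 0
    while w < w_before_loss:
        w += a * w                  # about W ACKs per RTT, each adding a
        rtts += 1
    return rtts

def reno_recovery_rtts(w_before_loss):
    """RTTs for Reno-style AIMD: halve, then add one packet per RTT."""
    return math.ceil(w_before_loss / 2)
```

With these parameters STCP needs the same handful of RTTs at a window of 100 or 83,333, while the Reno-style count grows linearly with the window.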

Active Queue Management

Router Queue Management

  normally, packets dropped only when queue overflows
    "drop-tail" queueing

[Figure: router with a FCFS queue holding packets P1-P6, a scheduler, and a link to the Internet]

The case against drop-tail queue management

  Large queues in routers are "a bad thing"
    Delay: end-to-end latency dominated by length of queues at switches in network
  Allowing queues to overflow is "a bad thing"
    Fairness: connections transmitting at high rates can starve connections transmitting at low rates
    Utilization: connections can synchronize their response to congestion

[Figure: FCFS queue with packets P1-P6 and scheduler]

Idea: early random packet drop

When queue length exceeds threshold, drop packets with queue-length-dependent probability
    probabilistic packet drop: flows see same loss rate
    problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

[Figure: FCFS queue with packets P1-P6 and scheduler]

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop
    avoids overly penalizing short-term bursts
    reacts to longer-term trends
  Tie drop probability to weighted average queue length
    avoids over-reaction to mild overload conditions

[Figure: average queue length vs. time, up to the max queue length, with min and max thresholds; regions: no drop, probabilistic early drop, forced drop]

Random early detection (RED) packet drop

[Figure: average queue length vs. time with min and max thresholds (no drop, probabilistic early drop, forced drop); drop probability vs. weighted average queue length, with axis marks min, max, maxp, and 100%]
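
The RED drop rule described on these slides can be sketched as follows. This is an illustrative reading of the figure (no drop below min, a linear ramp up to maxp between min and max, forced drop above max), not a router implementation; the class name, the averaging weight, and the injectable random source are assumptions.

```python
import random

class RedQueue:
    """Minimal RED drop decision based on a weighted average queue length."""

    def __init__(self, min_th, max_th, max_p, weight=0.002):
        self.min_th = min_th
        self.max_th = max_th
        self.max_p = max_p
        self.weight = weight    # EWMA weight; 0.002 is an assumed default
        self.avg = 0.0

    def update_average(self, queue_len):
        # Exponential average: short bursts barely move it.
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0          # no drop
        if self.avg >= self.max_th:
            return 1.0          # forced drop
        # Probabilistic early drop: linear ramp from 0 to max_p.
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def should_drop(self, queue_len, rng=random):
        self.update_average(queue_len)
        return rng.random() < self.drop_probability()
```

Because the decision depends on the average rather than the instantaneous queue length, a burst that arrives while the average is still below the min threshold is not dropped at all.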

RED summary: why random drop?

  Provide gentle transition from no-drop to all-drop
    Provide "gentle" early warning
  Avoid synchronized loss bursts among sources
  Provide same loss rate to all sessions
    With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
  Gains over drop-tail FCFS not that significant
  Still not widely deployed ...

Why randomization is important

  Synchronization of periodic routing updates
  Periodic losses observed in end-to-end Internet traffic

Source: Floyd, Jacobson 1994

Router update operation

[State diagram]
  prepare own routing update (time TC)
  on receiving an update from a neighbor: process it (time TC2)
  <ready>: send update (time Td to arrive at dest), then start_timer (uniform Tp +- Tr)
  wait; on receiving an update from a neighbor: process it
  on timeout or link fail: update
  time spent in state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other
  x-axis: time until routing update sent, relative to start of round
  By t = 100000, all router rounds are of length 120
  synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

  Choose random timer component Tr large (e.g., several multiples of TC)
  Add enough randomization to avoid synchronization

[State diagram: prepare own routing update (time TC); receive update from neighbor, process (time TC2); wait; <ready> send update (time Td to arrive); start_timer (uniform Tp +- Tr)]
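
The timer part of this fix is easy to sketch. The code below only illustrates drawing the interval uniformly from Tp +- Tr and how that spreads out routers that start in lockstep; it does not model the weak coupling through update processing that causes synchronization in the first place. All names and the seeded generator are assumptions.

```python
import random

def next_timer(tp, tr, rng):
    """Draw the next update interval uniformly from [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)

def send_times(num_routers, rounds, tp, tr, seed=0):
    """Send time of each router after `rounds` timer periods.

    All routers start at time 0, i.e. fully synchronized.
    """
    rng = random.Random(seed)
    times = [0.0] * num_routers
    for _ in range(rounds):
        times = [t + next_timer(tp, tr, rng) for t in times]
    return times

def spread(times):
    """Gap between the earliest and latest sender."""
    return max(times) - min(times)
```

With Tr = 0 the routers stay in lockstep forever; any positive Tr makes their send times drift apart.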

Randomization

  Takeaway message
    randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties

  Utilization of spare network capacity
  No interference with regular transfers
    Self-interference
      applications hurt their own performance
    Cross-interference
      applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free, infinite bandwidth
    Applications say what they want
    OS manages resources and scheduling
  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?

  TCP does network resource management
    Need flow prioritization
  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy
  Question
    Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion
  Uses increasing RTT as congestion signal
    Congestion => increased queue lengths => increased RTT
  Aggressive responsiveness to congestion
  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP-friendly

TCP Nice

  Basic algorithm
    1. Early detection: threshold queue length => increase in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)
  per-ack operation
    if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
  per-round operation
    if (numCong > f * W) W ← W/2
    else ... AIMD congestion control
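
The per-ack and per-round operations above can be put together in a small sender-side sketch. The parameter values (threshold 0.2, fraction 0.5, a floor of 1/48 on the window) and the class and method names are illustrative assumptions; loss handling and the actual pacing needed when the window is below one packet are left out.

```python
class NiceSender:
    """Minimal sketch of the TCP Nice early-congestion logic."""

    def __init__(self, cwnd=2.0, threshold=0.2, fraction=0.5, min_cwnd=1.0 / 48):
        self.cwnd = cwnd
        self.threshold = threshold   # "threshold" in the per-ack test (assumed value)
        self.fraction = fraction     # "f" in the per-round test (assumed value)
        self.min_cwnd = min_cwnd     # assumed floor; the window may go below 1
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        # Track the RTT range seen so far, then count "early congestion" ACKs.
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        if self.num_cong > self.fraction * self.cwnd:
            # Multiplicative decrease on early congestion, even without loss.
            self.cwnd = max(self.cwnd / 2.0, self.min_cwnd)
        else:
            self.cwnd += 1.0         # stand-in for the normal AIMD increase
        self.num_cong = 0
        return self.cwnd
```

Calling `on_ack` for every acknowledged packet and `on_round_end` once per RTT reproduces the two operations on the slide; repeated congested rounds drive the window below one packet.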

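Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

[Figure: queue occupancy (pkts) over time for Reno and Nice, showing additive (Add) and multiplicative (Mul) phases, buffer size B, threshold level t·B, and service rate μ; minRTT = τ, maxRTT = τ + B/μ]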

Network Conditions

[Figure: foreground document latency (sec) vs. spare capacity, log-log axes; curves: Reno, Vegas, V0, Nice, Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec) vs. number of background flows, log-log axes; curves: Vegas, V0, Nice, Router Prio, Reno]

  W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: background throughput (KB) vs. number of background flows; curves: Router Prio, Vegas, V0, Reno, Nice]

  Nice utilizes 50-80% of spare capacity without stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
  How to model the interaction between TCP and the network?
    Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function
  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
    systematic approach towards protocol design

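Model
  Network: links l, each of capacity $c_l$
  Sources s: $(L(s), U_s(x_s))$
    $L(s)$ - links used by source s
    $U_s(x_s)$ - utility if source rate = $x_s$

[Figure: two links with capacities $c_1$ and $c_2$ carrying three sources with rates $x_1$, $x_2$, $x_3$]

  $x_1 + x_2 \le c_1$
  $x_1 + x_3 \le c_2$

[Figure: $U_s(x_s)$ vs. $x_s$, example utility function for elastic application]

Q: What are possible allocations with, say, unit-capacity links?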
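
One way to explore this question is to compare a few objectives on the two-link example with constraints x1 + x2 <= c1 and x1 + x3 <= c2. The sketch below is only an illustration: it searches x1 on a grid and gives sources 2 and 3 whatever capacity is left on their link, which is safe for objectives that never decrease when a rate grows. The objective names and grid size are assumptions.

```python
import math

C1, C2 = 1.0, 1.0   # unit-capacity links, as in the question

def feasible(x, eps=1e-12):
    x1, x2, x3 = x
    return min(x) >= 0 and x1 + x2 <= C1 + eps and x1 + x3 <= C2 + eps

def best_on_grid(objective, steps=300):
    """Return the allocation (x1, x2, x3) maximizing `objective` on the grid."""
    best = None
    for k in range(steps + 1):
        x1 = k / steps
        alloc = (x1, C1 - x1, C2 - x1)   # sources 2 and 3 fill their links
        value = objective(alloc)
        if best is None or value > best[0]:
            best = (value, alloc)
    return best[1]

def total_rate(x):
    return sum(x)

def min_rate(x):
    return min(x)

def sum_log(x):
    return sum(math.log(v) if v > 0 else float("-inf") for v in x)

max_throughput = best_on_grid(total_rate)   # starves the two-link source
max_min = best_on_grid(min_rate)            # equal rates
prop_fair = best_on_grid(sum_log)           # in between
```

The three objectives give three different answers, which is the point of the question: maximizing total rate shuts out the source that crosses both links, equalizing rates gives everyone one half, and the logarithmic objective lands in between.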

Optimization Problem

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
    can we view congestion control as distributed asynchronous algorithms to solve this problem?

$$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$

("system" problem)

The user view

  User can choose amount to pay per unit time, $w_s$
  Would like allocated bandwidth $x_s$ in proportion to $w_s$:

$$x_s = \frac{w_s}{p_s}$$

  $p_s$ could be viewed as charge per unit flow for user s

$$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$$

  first term: user's utility; second term: cost

(user problem)

The network view

  Suppose network knows vector $w_s$, chosen by users
  Network wants to maximize logarithmic utility function

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

(network problem)

Solution existence

  There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
    $w_s$ solves the user problem
    $x_s$ solves the network problem
    $x_s$ is the unique solution to the system problem

Network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

User problem:

$$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$$

System problem:

$$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$

Proportional Fairness

  Vector of rates $x_s$ is proportionally fair if it is feasible and, for any other feasible vector $x_s^*$,

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

  Result: if $w_r = 1$, then $x_s$ solves the network problem iff it is proportionally fair
  Similar result exists for the case that $w_r$ is not equal to 1
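
The definition can be checked numerically on the two-link, three-source network from the Model slide (x1 + x2 <= c1, x1 + x3 <= c2) with unit capacities. The sketch below uses exact fractions and a coarse grid of alternative allocations; the candidate vectors (1/3, 2/3, 2/3) and (1/2, 1/2, 1/2) and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

def feasible(y, c1=1, c2=1):
    """Capacity constraints of the two-link example."""
    return y[0] + y[1] <= c1 and y[0] + y[2] <= c2

def aggregate_change(x, y):
    """Sum of proportional rate changes when moving from x to y."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))

def largest_change(x, steps=10):
    """Largest aggregate proportional change over a grid of feasible y."""
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    return max(aggregate_change(x, y)
               for y in product(grid, repeat=3) if feasible(y))

prop_fair = (Fraction(1, 3), Fraction(2, 3), Fraction(2, 3))
equal_rates = (Fraction(1, 2), Fraction(1, 2), Fraction(1, 2))
```

For the first vector no feasible alternative on the grid yields a positive aggregate change, as the definition requires; for the equal-rate vector some alternatives do, so it is not proportionally fair on this network.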

Max-min Fairness

Rates $x_r$ are max-min fair if, for any other feasible rates $y_r$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
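
A max-min fair allocation can be computed with the standard bottleneck-by-bottleneck ("progressive filling") procedure, which is not described on the slide and is swapped in here for illustration. The sketch assumes every source uses at least one link; the function name and input format are hypothetical.

```python
from fractions import Fraction

def max_min_rates(capacity, routes):
    """capacity: {link: c_l}; routes: {source: set of links it uses}.

    Returns {source: rate}. Assumes every source uses at least one link.
    """
    remaining = {l: Fraction(c) for l, c in capacity.items()}
    rates = {}
    active = set(routes)
    while active:
        # Equal share each link could still give its not-yet-fixed sources.
        share = {}
        for l, cap in remaining.items():
            users = [s for s in active if l in routes[s]]
            if users:
                share[l] = cap / len(users)
        bottleneck = min(share, key=share.get)
        level = share[bottleneck]
        # Fix every source crossing the tightest link at that share.
        for s in [s for s in active if bottleneck in routes[s]]:
            rates[s] = level
            for l in routes[s]:
                remaining[l] -= level
            active.remove(s)
    return rates
```

On the two-link example with capacities 1 and 2, the source crossing both links and the source on the first link are fixed at 1/2 by the first link, and the remaining source picks up the 3/2 left on the second link.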

Minimum potential delay fairness

  Rates $x_r$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates $x_r$ are max-min fair if, for any other feasible rates $y_r$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$

Solving the network problem

  Results so far: existence - solution exists with given properties
  How to compute solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

where

$$p_l(t) = g_l\!\left( \sum_{s : l \in L(s)} x_s(t) \right)$$

  left-hand side: change in bandwidth allocation at s
  $k w_s$: linear increase
  $k x_s(t) \sum_l p_l(t)$: multiplicative decrease
  $p_l(t)$: congestion "signal", a function of aggregate rate at link l, fed back to s

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  Results
    converges to solution of "relaxation" of network problem
    $x_s(t) \sum_l p_l(t)$ converges to $w_s$
  Interpretation: TCP-like algorithm that iteratively solves the optimal rate allocation
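
The convergence claim can be illustrated by integrating the rate equation numerically for sources that share a single link. The link price function g used below, (aggregate rate / capacity) squared, is a hypothetical choice made only so the example runs; the step size, gain, and function names are likewise assumptions.

```python
def link_price(aggregate_rate, capacity):
    """Hypothetical congestion signal g_l: grows with the load on the link."""
    return (aggregate_rate / capacity) ** 2

def simulate_primal(weights, capacity, k=1.0, dt=0.01, steps=20000):
    """Euler integration of dx_s/dt = k (w_s - x_s p) on one shared link."""
    rates = [1.0 for _ in weights]
    for _ in range(steps):
        price = link_price(sum(rates), capacity)
        rates = [x + dt * k * (w - x * price) for x, w in zip(rates, weights)]
    return rates, link_price(sum(rates), capacity)

rates, price = simulate_primal([1.0, 2.0], capacity=10.0)
```

At the end of the run each source's rate times the price is close to its weight, so the source paying twice as much receives twice the rate.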

Source Algorithm

$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$

  Source needs only its path price $q_r$
  $k_r(\cdot)$: nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation?
  Suppose network charges price $q_r$ ($/bit), where $q_r = \sum_l p_l$
  User's strategy: spend $w_r$ ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$

  then

$$U(x) = -\frac{1}{Tx}$$

  interpretation: minimize (weighted) delay

Is AIMD special?

  Consider a window control as follows
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m
  Expected change in congestion window
  Expected change in rate per unit time
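
The slide's own expressions for the expected change did not survive extraction. As a sketch under a stated assumption, suppose the two updates are applied per ACK and each ACK independently signals loss with probability p. The expected per-ACK change is then (1 - p) a W^n - p b W^m, which is zero when W^(m-n) = a (1 - p) / (b p). The function names below are hypothetical.

```python
import math

def expected_change(w, p, a, b, n, m):
    """Expected per-ACK window change, assuming independent loss signals."""
    return (1 - p) * a * w ** n - p * b * w ** m

def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero (requires n < m)."""
    return (a * (1 - p) / (b * p)) ** (1.0 / (m - n))

# Reno-like AIMD per ACK: +1/W when no loss (n = -1), -W/2 on loss (m = 1).
p = 0.01
w_star = equilibrium_window(p, a=1.0, b=0.5, n=-1, m=1)
```

With the Reno-like parameters the equilibrium window equals sqrt(2 (1 - p) / p), which is the "Simplified TCP-Reno" rate expression multiplied by the RTT.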

MIMD (n, m)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

  Congestion Control: maximize user utility
    Given routing $R_{li}$, how to adapt end rate $x_i$?
  Traffic Engineering: minimize network congestion
    Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$

  objective: aggregate utility; constraints: capacity constraints
  Source rate $x_i$; utility $U_i(x_i)$
  Users are indexed by i

Congestion control provides fair rate allocation amongst users

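Traffic Engineering Model

$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \frac{\sum_i R_{li} x_i}{c_l}$$

  objective: aggregate cost
  Link utilization $u_l$; cost $f(u_l)$
  Links are indexed by l

[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]

Traffic engineering avoids bottlenecks in the network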
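
The quantities in the traffic engineering model are straightforward to compute for a given routing. The sketch below evaluates the link utilizations u_l and the aggregate cost; the slides do not specify the cost function f, so the quadratic used here is purely a placeholder, and the function names are hypothetical.

```python
def link_utilization(routing, rates, capacity):
    """u_l = (sum_i R_li * x_i) / c_l for every link l.

    routing[l][i] is the fraction of source i's traffic carried on link l.
    """
    return [sum(r * x for r, x in zip(row, rates)) / c
            for row, c in zip(routing, capacity)]

def aggregate_cost(utilization, cost=lambda u: u * u):
    """Sum of per-link costs f(u_l); the quadratic f is a placeholder."""
    return sum(cost(u) for u in utilization)

# Two links, three sources: source 0 uses both links, 1 and 2 use one each.
routing = [[1, 1, 0],
           [1, 0, 1]]
u = link_utilization(routing, rates=[0.25, 0.5, 0.25], capacity=[1.0, 1.0])
total = aggregate_cost(u)
```

Traffic engineering would search over the routing matrix to lower this cost for fixed rates, while congestion control adjusts the rates for a fixed routing.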

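Model of Internet Reality

[Figure: congestion control and traffic engineering coupled in a loop through rates $x_i$ and routing $R_{li}$]

  Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
  Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$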

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark:

$$\max_{x, R} \sum_i U_i(x_i) \quad \text{s.t.} \quad Rx \le c$$

  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

PFTK empirical validation Low loss

PFTK empirical validation High loss


PFTK empirical validation High loss

Loss-based TCP

  Evolution of loss-based TCP  Tahoe (without fast retransmit)  Reno (triple duplicate acks + fast

retransmit)  NewReno (Reno + handling multiple losses

better)  SACK (selective acknowledgment) common

today   Q what if loss not due to congestion

Delay-based TCP Vegas

  Uses delay as a signal of congestion  Idea try to keep a small constant number of

packets at bottleneck queue  Expected = WBaseRTT  Actual = WCurRTT  Diff = Expected - Actual  Try to keep Diff between fixed 1 and 3

  More recent FAST TCP based on Vegas  Delay-based TCP not widely used today

TCP-Friendliness

  Can we try MyFavNew TCP  Well is it TCP-friendly

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP

  To co-exist with TCP it must impose the same long-term load on the network  No greater long-term throughput as a function of

packet loss and delay so TCP doesnt suffer  Not significantly less long-term throughput or its

not too useful

TCP friendly rate control (TFRC)

Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm

 If transmission rate is higher than that given by the model reduce the transmission rate to the models rate

 Otherwise increase the transmission rate  Eg DCCP (Datagram Congestion Control

Protocol) for unreliable congestion control  Q how to measureuse loss rate and RTT

High speed TCP

TCP in high speed networks

  Example 1500 byte segments 100ms RTT want 10 Gbps throughput

  Requires window size W = 83333 in-flight segments   Throughput in terms of loss rate

  13 p = 210-10 or equivalently at most one drop every couple hours

  New versions of TCP for high-speed networks needed

TCPrsquos long recovery delay

  More than an hour to recover from a loss or timeout

~41000 packets

~60000 RTTs ~100 minutes

High-speed TCP

  Proposals  Scalable TCP HSTCP FAST CUBIC  General idea is to use superlinear window

increase  Particularly useful in high bandwidth-delay

product regimes

Alternate choices of response functions

Scalable TCP - S = 015p

Q Whatever happened to TCP-friendly

High speed TCP [Floyd]

  additive increase multiplicative decrease

  increments decrements depend on window size

Scalable TCP (STCP) [T Kelly]

  multiplicative increase multiplicative decrease

W larr W + a per ACK W larr W ndash b W per window with loss

STCP dynamics

From 1st PFLDnet Workshop Tom Kelly13

Active Queue Management

Router Queue Management

  normally packets dropped only when queue overflows   ldquodrop-tailrdquo queueing

router Internet

P113P213P313P413P513P613FCFS13

Scheduler13

router

The case against drop-tail queue management

  Large queues in routers are ldquoa bad thingrdquo  Delay end-to-end latency dominated by length

of queues at switches in network   Allowing queues to overflow is ldquoa bad thingrdquo

 Fairness connections transmitting at high rates can starve connections transmitting at low rates

 Utilization connections can synchronize their response to congestion

P113P213P313P413FCFS

Scheduler P513P613

Idea early random packet drop

When queue length exceeds threshold drop packets with queue length dependent probability  probabilistic packet drop flows see same loss

rate  problem bursty traffic (burst arrives when

queue is near threshold) can be over penalized

P113P213P313P413P513P613FCFS

Scheduler

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop  avoid overly penalizing short-term bursts   react to longer term trends

  Tie drop prob to weighted avg queue length  avoids over-reaction to mild overload conditions

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

Random early detection (RED) packet drop

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

10013

Drop probability

maxp13

Weighted AverageQueue Length

min13 max13

RED summary why random drop

  Provide gentle transition from no-drop to all-drop  Provide ldquogentlerdquo early warning  Avoid synchronized loss bursts among

sources   Provide same loss rate to all sessions

 With tail-drop low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters: nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed …

Why is randomization important?

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

Source: Floyd, Jacobson 1994

Router update operation

[State diagram]
  prepare own routing update (time TC)
      receive update from neighbor: process (time TC2)
  <ready>: send update (time Td to arrive at dest), start_timer (uniform Tp +- Tr)
  wait
      receive update from neighbor: process
      timeout or link-fail update: prepare own routing update
  time spent in a state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis: time until routing update sent, relative to start of round

  By t=100000, all router rounds are of length 120

  synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

  Choose random timer component Tr large (e.g., several multiples of TC)
  Add enough randomization to avoid synchronization

[State diagram: router update operation as on the earlier slide, ending with <ready>: send update (time Td to arrive), start_timer (uniform Tp +- Tr)]
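A minimal sketch of the randomized timer, assuming the slide's "uniform Tp +- Tr" means a delay drawn uniformly from [Tp − Tr, Tp + Tr]. The function names and the default multiple of 10 are hypothetical; the slide only says Tr should be several multiples of TC.

```python
import random

def next_update_delay(tp, tr, rng=random):
    """Delay until the next routing update, uniform on [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)

def pick_random_component(tc, multiple=10.0):
    """Choose Tr as a multiple of the per-update processing time TC.

    The multiple is an illustrative choice, not a value from the slides.
    """
    return multiple * tc
```

With Tr = 0 every router restarts its timer with the same period, so the weak coupling through processing delays can pull the routers into lockstep; a larger Tr spreads the restart times apart again.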

Randomization

  Takeaway message
      randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand

  Examples
      Prefetched traffic on the Web
      File system backup
      Large-scale data distribution services
      Background software updates
      Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers
      Self-interference: applications hurt their own performance
      Cross-interference: applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
      Applications say what they want
      OS manages resources and scheduling

  Self-tuning transport layer
      Reduces risk of interference with foreground traffic
      Significant utilization of spare capacity by background traffic
      Simplifies application design

Why change TCP?

  TCP does network resource management
      Need flow prioritization

  Alternative: router prioritization
      + More responsive, simple one-bit priority
      - Hard to deploy

  Question
      Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal
      Congestion ⇒ increased queue lengths ⇒ increased RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control
      Receiver and network unchanged
      TCP friendly

TCP Nice

  Basic algorithm
      1. Early detection: threshold queue length ⇒ increase in RTT
      2. Multiplicative decrease on early congestion
      3. Allow cwnd < 1.0 (despite no loss)

  per-ack operation
      if (curRTT > minRTT + threshold * (maxRTT − minRTT)) numCong++

  per-round operation
      if (numCong > f * W) W ← W/2
      else … AIMD congestion control
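The per-ack and per-round operations above can be sketched as a small sender-side state machine. This is an illustration under stated assumptions, not the TCP Nice implementation: the class name, the default values for `threshold` and `fraction`, tracking minRTT/maxRTT from observed samples, and the simplified "add one segment per round" branch (which stands in for the elided AIMD logic and ignores packet loss) are all choices made for this example.

```python
class NiceWindow:
    """Simplified TCP Nice window logic (loss handling omitted)."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold    # "threshold" in the per-ack test (assumed value)
        self.fraction = fraction      # "f" in the per-round test (assumed value)
        self.cwnd = cwnd              # W, allowed to fall below 1.0
        self.min_rtt = float("inf")   # assumption: learned from RTT samples
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack operation: count ACKs whose RTT signals early congestion."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round operation: halve W on early congestion, else grow."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0          # multiplicative decrease, may go below 1.0
        else:
            self.cwnd += 1.0          # stand-in for the normal AIMD increase
        self.num_cong = 0
```

Repeated congested rounds keep halving the window with no floor at one segment, which is what lets a background flow back off to less than one packet per round trip.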

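Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

[Figure: bottleneck queue length (pkts) over time for Reno and Nice, with buffer size B, threshold t·B, and service rate μ; minRTT = τ, maxRTT = τ + B/μ; additive-increase (Add) and multiplicative-decrease (Mul) phases marked for each]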

Network Conditions

[Figure: foreground document latency (sec, log scale from 0.1 to 1e3) vs. spare capacity (1 to 100) for Reno, Vegas, V0, Nice, and Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale from 0.1 to 1e3) vs. number of background flows (1 to 100) for Vegas, V0, Nice, Router Prio, and Reno]

  W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: background throughput (KB, 0 to 8e4) vs. number of background flows (1 to 100) for Router Prio, Vegas, V0, Reno, and Nice]

  Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?

  How to model the interaction between TCP and the network?
      Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function
  Maybe not possible to obtain exact optimality, but
      optimization framework as means to explicitly steer network towards desirable operating point
      practical congestion control as distributed asynchronous implementations of optimization algorithm
  systematic approach towards protocol design

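Model
  Network: links l, each of capacity $c_l$
  Sources s: $(L(s), U_s(x_s))$
      $L(s)$ - links used by source s
      $U_s(x_s)$ - utility if source rate = $x_s$

[Figure: two links with capacities c1 and c2; source 1 (rate x1) crosses both links, source 2 (rate x2) uses the first link, source 3 (rate x3) uses the second]

$$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$$

[Figure: $U_s(x_s)$ vs. $x_s$, example utility function for elastic application]

Q: What are possible allocations with, say, unit-capacity links?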

Optimization Problem

$$\max_{x_s \ge 0} \; \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \quad \forall l \in L$$
("system" problem; $S(l)$ is the set of sources using link l)

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
      must know all utility functions
      impractical for large number of sources
      can we view congestion control as distributed asynchronous algorithms to solve this problem?

The user view

  User can choose amount to pay per unit time, $w_s$

  Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$
      $p_s$ could be viewed as charge per unit flow for user s

$$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$$
(user problem; first term is the user's utility, second term is the cost)

The network view

  Suppose network knows vector $w_s$ chosen by users
  Network wants to maximize logarithmic utility function

$$\max_{x_s \ge 0} \; \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
(network problem)

Solution existence

  There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
      $w_s$ solves the user problem
      $x_s$ solves the network problem
      $x_s$ is the unique solution to the system problem

  User problem:     $\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$
  Network problem:  $\max_{x_s \ge 0} \; \sum_s w_s \log x_s$  subject to  $\sum_{s \in S(l)} x_s \le c_l$
  System problem:   $\max_{x_s \ge 0} \; \sum_s U_s(x_s)$  subject to  $\sum_{s \in S(l)} x_s \le c_l, \; \forall l \in L$

Proportional Fairness

  Vector of rates $x_s$ is proportionally fair if it is feasible and, for any other feasible vector $x_s^*$,

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

  Result: if $w_s = 1$, then $x_s$ solves the network problem iff it is proportionally fair

  Similar result exists for the case that $w_s \ne 1$
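As a concrete check of the definition, the sketch below uses the three-source, two-link network from the Model slide with unit capacities (x1 + x2 ≤ 1, x1 + x3 ≤ 1). The candidate allocation (1/3, 2/3, 2/3) comes from maximizing log x1 + log x2 + log x3 by hand; the grid search and the function names are illustrative assumptions, and a grid check is evidence rather than a proof.

```python
import itertools

def is_feasible(y, tol=1e-12):
    """Capacity constraints of the example network with unit-capacity links."""
    y1, y2, y3 = y
    return min(y) >= 0 and y1 + y2 <= 1 + tol and y1 + y3 <= 1 + tol

def pf_gap(x, y):
    """Sum of proportional changes when moving from allocation x to y."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))

# Candidate proportionally fair allocation for the example network.
x_pf = (1 / 3, 2 / 3, 2 / 3)

# Largest aggregate proportional change over a grid of feasible alternatives.
grid = [i / 20 for i in range(21)]
worst = max(
    pf_gap(x_pf, y)
    for y in itertools.product(grid, repeat=3)
    if is_feasible(y)
)
```

Here `worst` never rises above zero (up to rounding), as the definition requires, whereas starting from the equal split (1/2, 1/2, 1/2) the move to `x_pf` has a positive gap, so the equal split is not proportionally fair.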

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$ for some s, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
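One common way to compute a max-min fair allocation is progressive filling: raise all rates together and freeze the sources on a link once it saturates. The sketch below is an illustration of that idea, not something taken from the slides; the function name and data layout are hypothetical, and it assumes every source crosses at least one link.

```python
def max_min_fair(capacity, routes, tol=1e-12):
    """Progressive filling.

    capacity: {link: c_l}
    routes:   {source: set of links used by that source}
    Returns {source: rate}.
    """
    rate = {s: 0.0 for s in routes}
    remaining = dict(capacity)
    active = set(routes)
    while active:
        # Number of still-growing sources on each link.
        counts = {l: sum(1 for s in active if l in routes[s]) for l in remaining}
        # Equal share each link could still give to its active sources.
        share = {l: remaining[l] / n for l, n in counts.items() if n > 0}
        inc = min(share.values())  # the tightest link limits the common increase
        for s in active:
            rate[s] += inc
            for l in routes[s]:
                remaining[l] -= inc
        # Sources crossing a saturated link stop growing.
        saturated = {l for l in share if remaining[l] <= tol}
        active = {s for s in active if not (routes[s] & saturated)}
    return rate
```

For the three-source example with unit capacities this returns 1/2 for every source, which differs from the proportionally fair allocation (1/3, 2/3, 2/3): max-min fairness protects the two-link flow more strongly.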

Minimum potential delay fairness

  Rates $x_r$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is the file size, then $w_r / x_r$ is the transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates $x_r$ are max-min fair as defined above.

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$

Solving the network problem

  Results so far: existence - a solution exists with the given properties
  How to compute the solution?
      Ideally: distributed solution easily embodied in protocol
      Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  left-hand side: change in bandwidth allocation at s
  $k\,w_s$: linear increase
  $k\,x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease

where

$$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$

  $p_l(t)$: congestion "signal", a function of the aggregate rate at link l, fed back to s

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  Results
      converges to solution of relaxation of network problem
      $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$

  Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
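A small numerical experiment can make the convergence claim concrete. The sketch below integrates the rate equation with forward-Euler steps on the three-source, two-link example. The price function g_l(y) = (y/c_l)^4, the gain k, the step size, and the starting rates are all assumptions chosen for illustration; the slides leave g_l unspecified.

```python
def link_price(load, capacity, steepness=4):
    """Hypothetical congestion signal g_l: grows quickly as load nears capacity."""
    return (load / capacity) ** steepness

def simulate_primal(routes, capacity, w, k=1.0, dt=0.01, steps=20000):
    """Euler integration of dx_s/dt = k * (w_s - x_s * sum of link prices on s's path)."""
    x = {s: 0.1 for s in routes}
    q = {s: 0.0 for s in routes}
    for _ in range(steps):
        # Aggregate rate on each link, then the price each link feeds back.
        load = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
        price = {l: link_price(load[l], capacity[l]) for l in capacity}
        # Path price seen by each source.
        q = {s: sum(price[l] for l in routes[s]) for s in routes}
        x = {s: x[s] + dt * k * (w[s] - x[s] * q[s]) for s in routes}
    return x, q

routes = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}}
capacity = {"a": 1.0, "b": 1.0}
weights = {1: 1.0, 2: 1.0, 3: 1.0}
rates, path_prices = simulate_primal(routes, capacity, weights)
```

At the fixed point each source's rate times its path price equals its weight, matching the second result above. Because the prices come from a smooth penalty rather than a hard capacity limit, link loads can settle slightly above capacity, which is the sense in which this solves a relaxation of the network problem.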

Source Algorithm

$$\dot{x}_r = k_r(x_r)\left( U_r'(x_r) - q_r \right)$$

  Source needs only its path price

  $k_r(\cdot)$: nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is $U_r(x_r) = w_r \log x_r$

then a controller that implements it is given by $\dot{x}_r = k_r(x_r)\left( \frac{w_r}{x_r} - q_r \right)$

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation?

  Suppose network charges price $q_r$ ($/bit), where $q_r = \sum_{l \in L(r)} p_l$

  User's strategy: spend $w_r$ ($/sec) to maximize $U_r(w_r / q_r) - w_r$

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$

  then

$$U(x) = -\frac{1}{T x}$$

  interpretation: minimize (weighted) delay

Is AIMD special?

  Consider a window control as follows
      cwnd += a·cwnd^n when no loss
      cwnd −= b·cwnd^m when loss
      where n < m

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (n, m)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability
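The generalized window rule from the "Is AIMD special" slide can be explored numerically. The sketch below assumes each packet is lost independently with probability p, so the expected per-packet change is (1 − p)·a·W^n − p·b·W^m; setting this to zero gives W^(m−n) = a(1 − p)/(b·p). The slide's own expressions did not survive extraction, so this balance equation and the function names are assumptions of the example.

```python
def equilibrium_window(a, b, n, m, p):
    """Window at which expected increase and decrease balance (requires n < m)."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))

def iterate_expected_window(a, b, n, m, p, w0=1.0, steps=5000):
    """Apply the expected per-packet change repeatedly, starting from w0."""
    w = w0
    for _ in range(steps):
        w += (1.0 - p) * a * w ** n - p * b * w ** m
    return w

# Per-ACK AIMD corresponds to n = -1 (cwnd += a / cwnd) and m = 1 (cwnd -= b * cwnd).
w_closed = equilibrium_window(a=1.0, b=0.5, n=-1, m=1, p=0.01)
w_iterated = iterate_expected_window(a=1.0, b=0.5, n=-1, m=1, p=0.01)
```

With a = 1 and b = 1/2 the closed form reduces to sqrt(2(1 − p)/p), the same window that appears in the simplified TCP-Reno rate formula; other (n, m) pairs change the exponent 1/(m − n) that relates window to loss rate.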

Motivation

  Congestion Control: maximize user utility
      Given routing $R_{li}$, how to adapt end rate $x_i$?
  Traffic Engineering: minimize network congestion
      Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

$$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l, \qquad \text{var. } x$$

  objective: aggregate utility, with source rate $x_i$ and utility $U_i(x_i)$
  constraints: capacity constraints
  Users are indexed by i

Congestion control provides fair rate allocation amongst users

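Traffic Engineering Model

$$\min \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l, \qquad \text{var. } R$$

  objective: aggregate cost, with link utilization $u_l$ and cost $f(u_l)$
  Links are indexed by l

Traffic engineering avoids bottlenecks in the network

[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]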

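Model of Internet Reality

[Figure: congestion control passes rates $x_i$ to traffic engineering, which passes routing $R_{li}$ back]

  Congestion Control: $\max \sum_i U_i(x_i)$  s.t.  $\sum_i R_{li} x_i \le c_l$
  Traffic Engineering: $\min \sum_l f(u_l)$  s.t.  $u_l = \sum_i R_{li} x_i / c_l$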
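Both models work on the same two objects, the routing matrix R and the rate vector x, so a few lines of code are enough to evaluate either side. The sketch below computes link loads, utilizations, and the capacity check; the cost function f(u) = 1/(1 − u) is a hypothetical stand-in chosen only because it blows up as utilization approaches 1, since the slides do not specify f.

```python
def link_loads(R, x):
    """Load on each link: sum_i R[l][i] * x[i]."""
    return [sum(r_li * x_i for r_li, x_i in zip(row, x)) for row in R]

def utilizations(R, x, c):
    """Link utilization u_l = (sum_i R_li x_i) / c_l."""
    return [load / cap for load, cap in zip(link_loads(R, x), c)]

def is_feasible(R, x, c):
    """Congestion-control constraint: every link load within capacity."""
    return all(load <= cap for load, cap in zip(link_loads(R, x), c))

def congestion_cost(u):
    """Hypothetical convex cost f(u) = 1 / (1 - u), infinite at or above u = 1."""
    return sum(1.0 / (1.0 - ul) if ul < 1.0 else float("inf") for ul in u)

# Rows are links, columns are users: user 0 crosses both links.
R = [[1, 1, 0],
     [1, 0, 1]]
x = [0.25, 0.5, 0.25]
c = [1.0, 1.0]
u = utilizations(R, x, c)
cost = congestion_cost(u)
```

Congestion control would adjust `x` with `R` held fixed, while traffic engineering would adjust `R` with `x` held fixed; the joint behavior depends on how these two loops interact.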

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark: $\max \sum_i U_i(x_i)$  s.t.  $Rx \le c$,  var. $x, R$

  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Loss-based TCP

  Evolution of loss-based TCP
      Tahoe (fast retransmit, but without fast recovery)
      Reno (triple duplicate acks: fast retransmit + fast recovery)
      NewReno (Reno + handling multiple losses better)
      SACK (selective acknowledgment), common today
  Q: what if loss is not due to congestion?

Delay-based TCP: Vegas

  Uses delay as a signal of congestion
      Idea: try to keep a small constant number of packets at the bottleneck queue
      Expected = W/BaseRTT
      Actual = W/CurRTT
      Diff = Expected − Actual
      Try to keep Diff between fixed 1 and 3

  More recent: FAST TCP, based on Vegas
  Delay-based TCP not widely used today
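The Vegas rule above can be sketched as a once-per-RTT window update. One detail is an assumption of this example: Diff is multiplied by BaseRTT so that it is measured in packets queued at the bottleneck before being compared with the bounds 1 and 3. The function name and the one-segment step are also illustrative choices.

```python
def vegas_update(cwnd, base_rtt, cur_rtt, alpha=1.0, beta=3.0):
    """One Vegas-style adjustment of the congestion window (in segments)."""
    expected = cwnd / base_rtt               # rate if nothing were queued
    actual = cwnd / cur_rtt                  # rate actually achieved
    diff = (expected - actual) * base_rtt    # estimated packets sitting in the queue
    if diff < alpha:
        return cwnd + 1.0                    # too little queued: speed up
    if diff > beta:
        return cwnd - 1.0                    # too much queued: slow down
    return cwnd                              # within the target band: hold
```

Unlike a loss-based sender, this controller reacts before the queue overflows, which is why it keeps only a small standing queue at the bottleneck.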

TCP-Friendliness

  Can we try MyFavNew TCP?
      Well, is it TCP-friendly?

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP

  To co-exist with TCP, it must impose the same long-term load on the network
      No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
      Not significantly less long-term throughput, or it's not too useful

TCP friendly rate control (TFRC)

Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
    If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
    Otherwise, increase the transmission rate
    E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
    Q: how to measure/use loss rate and RTT?
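A minimal sketch of the equation-based idea, assuming the simplified square-root model x ≈ sqrt(2/p)/T from the "Simplified TCP-Reno" slide as the throughput model; deployed TFRC uses a more detailed throughput equation. The function names and the fixed `increase` step are hypothetical.

```python
import math

def tcp_model_rate(rtt, loss_rate):
    """Simplified TCP throughput model in packets per second: sqrt(2/p) / RTT."""
    return math.sqrt(2.0 / loss_rate) / rtt

def tfrc_update(rate, rtt, loss_rate, increase):
    """Clamp to the model's rate if above it; otherwise increase by a fixed step."""
    target = tcp_model_rate(rtt, loss_rate)
    if rate > target:
        return target
    return rate + increase
```

The sender therefore never exceeds what a TCP flow would get under the same measured loss rate and RTT, but it changes its rate more smoothly than TCP's halving, which suits streaming traffic. How the loss rate and RTT are measured is the open question the slide raises.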

High speed TCP

TCP in high speed networks

  Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput

  Requires window size W = 83,333 in-flight segments
  Throughput in terms of loss rate: throughput = 1.22·MSS / (RTT·√p)

  ⇒ p = 2·10^-10, or equivalently at most one drop every couple of hours

  New versions of TCP for high-speed networks needed
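The numbers on this slide can be reproduced with a few lines of arithmetic. The sketch assumes the square-root throughput model, throughput = 1.22·MSS/(RTT·√p), which gives W = 1.22/√p for the window in segments; the function names are illustrative.

```python
def required_window(rate_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_s / (mss_bytes * 8)

def required_loss_rate(window_segments):
    """Loss rate at which the square-root model yields this window: p = (1.22 / W)^2."""
    return (1.22 / window_segments) ** 2

def seconds_between_drops(window_segments, rtt_s, loss_rate):
    """Average time between losses when sending window/RTT segments per second."""
    segments_per_second = window_segments / rtt_s
    return 1.0 / (loss_rate * segments_per_second)

window = required_window(10e9, 0.1, 1500)
loss = required_loss_rate(window)
gap_hours = seconds_between_drops(window, 0.1, loss) / 3600.0
```

The window comes out at roughly 83,333 segments and the tolerable loss rate at about 2·10^-10, which at this sending rate corresponds to one drop every hour or two.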

TCP's long recovery delay

  More than an hour to recover from a loss or timeout

[Figure: congestion window sawtooth, annotated with ~41,000 packets and a recovery time of ~60,000 RTTs (~100 minutes)]

High-speed TCP

  Proposals
      Scalable TCP, HSTCP, FAST, CUBIC
      General idea is to use superlinear window increase
      Particularly useful in high bandwidth-delay product regimes

Alternate choices of response functions

Scalable TCP - S = 0.15/p

Q: Whatever happened to TCP-friendly?

High speed TCP [Floyd]

  additive increase, multiplicative decrease

  increments, decrements depend on window size

Scalable TCP (STCP) [T. Kelly]

  multiplicative increase, multiplicative decrease

W ← W + a per ACK;  W ← W − b·W per window with loss
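The effect of the Scalable TCP rule on recovery time can be sketched as follows. The constants a = 0.01 and b = 0.125 are the values usually quoted for Scalable TCP and are an assumption here, since the slide leaves a and b symbolic; the model also assumes one ACK per in-flight segment, so a full round trip grows the window by a·W.

```python
def stcp_rtts_to_recover(window, a=0.01, b=0.125):
    """Round trips for STCP to regain `window` after one loss event."""
    target = window
    w = window - b * window      # W <- W - b*W on a window with loss
    rtts = 0
    while w < target:
        w += a * w               # W ACKs per RTT, each adding a
        rtts += 1
    return rtts

def reno_rtts_to_recover(window):
    """Standard AIMD for comparison: from W/2 back to W at one segment per RTT."""
    return window / 2.0
```

Because both the increase and the decrease are proportional to W, the STCP recovery time is the same small number of round trips at any window size, whereas the additive-increase recovery time grows linearly with the window.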

STCP dynamics

From 1st PFLDnet Workshop, Tom Kelly

Active Queue Management

Router Queue Management

  normally packets dropped only when queue overflows   ldquodrop-tailrdquo queueing

router Internet

P113P213P313P413P513P613FCFS13

Scheduler13

router

The case against drop-tail queue management

  Large queues in routers are ldquoa bad thingrdquo  Delay end-to-end latency dominated by length

of queues at switches in network   Allowing queues to overflow is ldquoa bad thingrdquo

 Fairness connections transmitting at high rates can starve connections transmitting at low rates

 Utilization connections can synchronize their response to congestion

P113P213P313P413FCFS

Scheduler P513P613

Idea early random packet drop

When queue length exceeds threshold drop packets with queue length dependent probability  probabilistic packet drop flows see same loss

rate  problem bursty traffic (burst arrives when

queue is near threshold) can be over penalized

P113P213P313P413P513P613FCFS

Scheduler

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop  avoid overly penalizing short-term bursts   react to longer term trends

  Tie drop prob to weighted avg queue length  avoids over-reaction to mild overload conditions

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

Random early detection (RED) packet drop

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

10013

Drop probability

maxp13

Weighted AverageQueue Length

min13 max13

RED summary why random drop

  Provide gentle transition from no-drop to all-drop  Provide ldquogentlerdquo early warning  Avoid synchronized loss bursts among

sources   Provide same loss rate to all sessions

 With tail-drop low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed hellip

Why randomization important

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source Floyd Jacobson 1994

Router update operation

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive at dest)

start_timer (uniform Tp +- Tr)

timeout or link fail

update

time spent in state depends on msgs

received from others (weak coupling

between routers processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization   Choose random

timer component Tr large (eg several multiples of TC)

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough

randomization to avoid

synchronization

Randomization

  Takeaway message  randomization makes a system simple and

robust

Background transport TCP Nice

What are background transfers

  Data that humans are not waiting for   Non-deadline-critical   Unlimited demand

  Examples  Prefetched traffic on the Web  File system backup  Large-scale data distribution services  Background software updates  Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers  Self-interference

bull  applications hurt their own performance  Cross-interference

bull  applications hurt other applicationsrsquo performance

TCP Nice

  Goal abstraction of free infinite bandwidth   Applications say what they want

 OS manages resources and scheduling

  Self tuning transport layer  Reduces risk of interference with foreground

traffic  Significant utilization of spare capacity by

background traffic  Simplifies application design

Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal
    Congestion => increased queue lengths => increased RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice

  Basic algorithm
    1. Early detection: threshold on queue length, observed as an increase in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)

  Per-ack operation
    if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++

  Per-round operation
    if (numCong > f * W) W ← W/2
    else ... AIMD congestion control
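The per-ack and per-round rules can be sketched as follows. This is a minimal illustration, not the TCP Nice implementation: the class and function names, the default values of threshold and f, and the plain "+1 per round" stand-in for the elided AIMD branch are all assumptions, and loss handling is left out.

```python
from dataclasses import dataclass

@dataclass
class NiceState:
    cwnd: float              # congestion window W in packets (may fall below 1.0)
    min_rtt: float           # smallest RTT seen so far (seconds)
    max_rtt: float           # largest RTT seen so far (seconds)
    threshold: float = 0.2   # assumed value, not from the slides
    fraction: float = 0.5    # f, assumed value, not from the slides
    num_cong: int = 0        # acks in this round that signalled early congestion

def on_ack(s: NiceState, cur_rtt: float) -> None:
    """Per-ack operation: count acks whose RTT exceeds the early-detection level."""
    if cur_rtt > s.min_rtt + s.threshold * (s.max_rtt - s.min_rtt):
        s.num_cong += 1

def on_round_end(s: NiceState) -> None:
    """Per-round operation: halve W on early congestion, else grow additively."""
    if s.num_cong > s.fraction * s.cwnd:
        s.cwnd /= 2
    else:
        s.cwnd += 1.0  # stand-in for the "... AIMD congestion control" branch
    s.num_cong = 0

state = NiceState(cwnd=8.0, min_rtt=0.100, max_rtt=0.200)
for _ in range(8):          # a round in which every ack sees an inflated RTT
    on_ack(state, 0.150)
on_round_end(state)
after_congested_round = state.cwnd

for _ in range(4):          # a round in which RTTs are close to minRTT
    on_ack(state, 0.105)
on_round_end(state)
after_quiet_round = state.cwnd
```

Because cwnd is a float that is halved without a floor of one packet, repeated congested rounds drive it below 1.0, which is rule 3 of the basic algorithm.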

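Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

[Figure: queue length (pkts) over time at a bottleneck with buffer B and service rate µ, with the early-detection level t*B marked. Reno trace labelled Add, Mul; Nice trace labelled Add, Add, Add, Mul, Mul. minRTT = τ, maxRTT = τ + B/µ.]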

Network Conditions

[Figure: foreground document latency (sec, log scale) vs. spare capacity, for Reno, Vegas, V0, Nice, and Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale) vs. number of background flows, for Vegas, V0, Nice, Router Prio, and Reno]

  W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: background throughput (KB) vs. number of background flows, for Router Prio, Vegas, V0, Reno, and Nice]

  Nice utilizes 50-80% of spare capacity without stealing any bandwidth from foreground (FG) traffic

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?

  How to model the interaction between TCP and the network?
    Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem

  How to allocate resources (e.g., bandwidth) to optimize some objective function?
  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
    systematic approach towards protocol design

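Model

  Network: links l, each of capacity c_l
  Sources s: (L(s), U_s(x_s))
    L(s) - links used by source s
    U_s(x_s) - utility if source rate = x_s

[Figure: two links with capacities c_1 and c_2; source 1 (rate x_1) crosses both links, source 2 (rate x_2) uses only link 1, source 3 (rate x_3) uses only link 2]

$$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$$

[Figure: example utility function U_s(x_s) vs. x_s for an elastic application]

Q: What are possible allocations with, say, unit capacity links?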
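One way to explore the question is to compare a few candidate allocations for the constraints x1 + x2 <= 1 and x1 + x3 <= 1. The sketch below is illustrative only: the helper names are hypothetical, and the choice of the three objectives (maximum total throughput, equal shares, and maximum sum of logs) is an assumption about which allocations are worth comparing.

```python
import math

def feasible(x, c1=1.0, c2=1.0, eps=1e-9):
    """Check x >= 0, x1 + x2 <= c1 and x1 + x3 <= c2."""
    x1, x2, x3 = x
    return min(x) >= 0 and x1 + x2 <= c1 + eps and x1 + x3 <= c2 + eps

def proportional_fair_x1(steps=100_000):
    """Grid-search x1 in (0, 1) maximizing log x1 + log x2 + log x3.

    log is increasing, so both links are full at the optimum: x2 = x3 = 1 - x1.
    """
    best_x1, best_val = None, -math.inf
    for i in range(1, steps):
        x1 = i / steps
        val = math.log(x1) + 2 * math.log(1 - x1)
        if val > best_val:
            best_x1, best_val = x1, val
    return best_x1

x1 = proportional_fair_x1()
allocations = {
    "max total throughput": (0.0, 1.0, 1.0),   # starves the two-link source
    "equal shares": (0.5, 0.5, 0.5),
    "max sum of logs": (x1, 1 - x1, 1 - x1),   # x1 comes out close to 1/3
}
for name, x in allocations.items():
    print(name, [round(v, 3) for v in x], "total =", round(sum(x), 3))
```

The three rows show the tension between total throughput and fairness to the source that crosses both links.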

Optimization Problem

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
  can we view congestion control as distributed asynchronous algorithms to solve this problem?

$$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$

"system" problem

The user view

  User can choose amount to pay per unit time, w_s

  Would like allocated bandwidth x_s in proportion to w_s

$$x_s = \frac{w_s}{p_s}$$

  p_s could be viewed as charge per unit flow for user s

$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

  first term: user's utility; second term: cost

user problem

The network view

  Suppose network knows vector {w_s} chosen by users
  Network wants to maximize logarithmic utility function

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

network problem

Solution existence

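  There exist prices p_s, source rates x_s, and amounts to pay per unit time w_s = p_s x_s such that
    {w_s} solves the user problem
    {x_s} solves the network problem
    {x_s} is the unique solution to the system problem

Network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

User problem:

$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

System problem:

$$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$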

Proportional Fairness

  Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x*_s},

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

  Result: if w_r = 1, then {x_s} solves the network problem if and only if it is proportionally fair

  A similar result exists for the case that w_r is not equal to 1
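The definition can be checked numerically on a small example. The sketch below assumes the three-source, two-link network with unit capacities (x1 + x2 <= 1, x1 + x3 <= 1) and tests the candidate (1/3, 2/3, 2/3) against random feasible vectors; the helper names and the sampling scheme are hypothetical. Random sampling illustrates the inequality but does not prove it.

```python
import random

def aggregate_proportional_change(x, x_star):
    """Sum over sources of (x*_s - x_s) / x_s."""
    return sum((xs_star - xs) / xs for xs, xs_star in zip(x, x_star))

def random_feasible(rng):
    """Random point with x >= 0, x1 + x2 <= 1 and x1 + x3 <= 1."""
    x1 = rng.random()
    return (x1, rng.uniform(0, 1 - x1), rng.uniform(0, 1 - x1))

candidate = (1 / 3, 2 / 3, 2 / 3)
rng = random.Random(1)
worst = max(
    aggregate_proportional_change(candidate, random_feasible(rng))
    for _ in range(10_000)
)

# Equal shares fail the test: moving to `candidate` has positive aggregate change.
equal_shares = (0.5, 0.5, 0.5)
gain_from_equal = aggregate_proportional_change(equal_shares, candidate)
print(worst, gain_from_equal)
```

For this network the sum equals 3 x1* + 1.5 x2* + 1.5 x3* - 3, which the two capacity constraints bound above by zero, so `worst` never becomes positive beyond rounding error.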

Max-min Fairness

Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p <= x_s and y_p < x_p
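A standard way to compute a max-min fair allocation is progressive filling. That technique is not described in the slides and is brought in here only as an illustration. All names below are hypothetical, and the sketch assumes every source uses at least one link.

```python
def max_min_fair(capacity, routes):
    """Progressive filling: raise all unfrozen rates equally until a link fills."""
    rates = {s: 0.0 for s in routes}
    remaining = dict(capacity)
    active = set(routes)
    while active:
        # Number of still-growing sources on each link.
        load = {l: sum(1 for s in active if l in routes[s]) for l in remaining}
        # Largest equal increment that no link's remaining capacity forbids.
        inc = min(remaining[l] / n for l, n in load.items() if n > 0)
        for s in active:
            rates[s] += inc
        for l, n in load.items():
            remaining[l] -= inc * n
        # Freeze every source that crosses a link that just filled up.
        saturated = {l for l, n in load.items() if n > 0 and remaining[l] <= 1e-12}
        active = {s for s in active if not (routes[s] & saturated)}
    return rates

routes = {1: {"l1", "l2"}, 2: {"l1"}, 3: {"l2"}}
unit = max_min_fair({"l1": 1.0, "l2": 1.0}, routes)
uneven = max_min_fair({"l1": 1.0, "l2": 2.0}, routes)
print(unit, uneven)
```

With unit capacities every source ends at the same rate; when the second link is larger, only the source that is not constrained by the first link keeps growing.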

Minimum potential delay fairness

  Rates {x_r} are minimum potential delay fair if U_r(x_r) = -w_r / x_r

  Interpretation: if w_r is file size, then w_r / x_r is transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

  Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p <= x_s and y_p < x_p

  What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$

Solving the network problem

  Results so far: existence - a solution exists with given properties
  How to compute the solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  left-hand side: change in bandwidth allocation at s
  term k w_s: linear increase
  term k x_s(t) times the sum of p_l(t): multiplicative decrease

where

$$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$

  p_l(t): congestion "signal", a function of the aggregate rate at link l, fed back to s
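The rate dynamics dx_s/dt = k (w_s - x_s * sum of p_l over l in L(s)) can be simulated with a simple Euler step. The sketch below is an illustration under stated assumptions: the price function g_l(y) = (y / c_l)^beta, the gain k, the step size, the starting rates and the three-source, two-link topology are all choices made here, not taken from the slides.

```python
def simulate_primal(routes, capacity, w, k=0.5, dt=0.01, steps=20_000, beta=8):
    """Euler-integrate dx_s/dt = k * (w_s - x_s * sum_{l in L(s)} p_l)."""
    x = {s: 0.1 for s in routes}  # arbitrary small starting rates
    q = {s: 0.0 for s in routes}
    for _ in range(steps):
        # Aggregate rate on each link, then the link price p_l = g_l(y_l).
        y = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
        p = {l: (y[l] / capacity[l]) ** beta for l in capacity}
        # Path price seen by each source.
        q = {s: sum(p[l] for l in routes[s]) for s in routes}
        x = {s: x[s] + dt * k * (w[s] - x[s] * q[s]) for s in routes}
    return x, q

routes = {1: {"l1", "l2"}, 2: {"l1"}, 3: {"l2"}}
capacity = {"l1": 1.0, "l2": 1.0}
w = {1: 1.0, 2: 1.0, 3: 1.0}
rates, path_price = simulate_primal(routes, capacity, w)
print(rates)
```

At the fixed point x_s times the path price equals w_s for every source. With a smooth penalty such as this one the links end up slightly above capacity, so the result is a solution of a relaxation rather than of the hard-constrained network problem.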

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  Results
    converges to solution of a relaxation of the network problem
    x_s(t) times the sum of p_l(t) over l in L(s) converges to w_s

  Interpretation: a TCP-like algorithm that iteratively solves for the optimal rate allocation

Source Algorithm

  Source needs only its path price:

$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$

  k_r() is a nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  q_r interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by
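The two formulas on this slide did not survive extraction. A plausible reading, offered only as a sketch, is that the utility is the weighted logarithmic utility of the network problem and that the source algorithm has the form $\dot{x}_r = k_r(x_r)(U_r'(x_r) - q_r)$. The gain choice $k_r(x_r) = k x_r$ below is an assumption made to recover the rate equation of the network-problem algorithm.

```latex
\begin{align*}
U_r(x_r) &= w_r \log x_r
  &&\Rightarrow\quad U_r'(x_r) = \frac{w_r}{x_r} \\
\dot{x}_r &= k_r(x_r)\left(\frac{w_r}{x_r} - q_r\right)
  &&\text{(source algorithm with this utility)} \\
\dot{x}_r &= k\,\bigl(w_r - x_r q_r\bigr)
  &&\text{(taking } k_r(x_r) = k\,x_r \text{, with } q_r = \textstyle\sum_{l \in L(r)} p_l\text{)}
\end{align*}
```

Under these assumptions the controller coincides with the linear-increase, multiplicative-decrease rate equation used to solve the network problem.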

Pricing interpretation

  Can the network choose a pricing scheme to achieve fair resource allocation?

  Suppose network charges price q_r ($/bit), where q_r = sum of p_l over the links on route r

  User's strategy: spend w_r ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$

  then

$$U(x) = -\frac{1}{Tx}$$

  interpretation: minimize (weighted) delay

Is AIMD special

  Consider a window control as follows:
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m

  Expected change in congestion window

  Expected change in rate per unit time
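The expressions for the expected changes are missing from this transcript. As a hedged sketch, suppose each window update independently sees a loss with probability p (an assumption, not stated on the slide). The expected change per update is then (1 - p) * a * cwnd^n - p * b * cwnd^m, and setting it to zero gives an equilibrium window. Function names are hypothetical.

```python
import math

def expected_window_change(w, p, a, b, n, m):
    """Expected change per update: +a*w^n with prob. 1-p, -b*w^m with prob. p."""
    return (1 - p) * a * w ** n - p * b * w ** m

def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero (needs n < m)."""
    if not n < m:
        raise ValueError("need n < m")
    return (a * (1 - p) / (b * p)) ** (1 / (m - n))

# AIMD as a special case: +1/cwnd per ack (n = -1), -cwnd/2 per loss (m = 1).
p = 0.01
w_star = equilibrium_window(p, a=1.0, b=0.5, n=-1, m=1)
residual = expected_window_change(w_star, p, a=1.0, b=0.5, n=-1, m=1)
print(w_star, residual)
```

For the AIMD parameters the equilibrium is sqrt(2(1 - p)/p), the window form of the square-root throughput law.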

MIMD (nm)

  Consider the controller

  where

  Then at equilibrium

  where α = m - n

  For stability

Motivation

  Congestion control: maximize user utility
    Given routing R_li, how to adapt end rate x_i?

  Traffic engineering: minimize network congestion
    Given traffic x_i, how to perform routing R_li?

Congestion Control Model

$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li}\, x_i \le c_l$$

  Objective: aggregate utility, with source rate x_i and utility U_i(x_i)
  Constraints: capacity constraints
  Users are indexed by i

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

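$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \frac{\sum_i R_{li}\, x_i}{c_l}$$

  Objective: aggregate cost, with link utilization u_l and cost f(u_l)
  Links are indexed by l

Traffic engineering avoids bottlenecks in the network

[Figure: cost f(u_l) vs. link utilization u_l, with u_l = 1 marked]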

Model of Internet Reality

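[Figure: congestion control and traffic engineering interact through the rates x_i and the routing R_li]

  Congestion control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li}\, x_i \le c_l$

  Traffic engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li}\, x_i / c_l$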

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark:

$$\max_{x,\, R} \sum_i U_i(x_i) \quad \text{s.t.} \quad Rx \le c$$

  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

TCP-Friendliness

  Can we try MyFavNew TCP?
    Well, is it TCP-friendly?

  Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP

  To coexist with TCP, it must impose the same long-term load on the network
    No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
    Not significantly less long-term throughput, or it's not too useful

TCP friendly rate control (TFRC)

Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
  If transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
  Otherwise increase the transmission rate
  E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
  Q: how to measure/use loss rate and RTT?
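The rate rule can be sketched as below. This is not the TFRC specification: the throughput model used is the simplified square-root approximation x ≈ sqrt(2) / (T sqrt(p)), in packets per second, rather than the fuller model a real implementation would use, and the function names and the fixed additive increase are assumptions.

```python
import math

def model_rate(loss_rate: float, rtt: float) -> float:
    """Simplified TCP model: packets/sec ~ sqrt(2) / (rtt * sqrt(loss_rate))."""
    return math.sqrt(2.0) / (rtt * math.sqrt(loss_rate))

def tfrc_step(rate: float, loss_rate: float, rtt: float, increase: float = 1.0) -> float:
    """One control step: cap at the model's rate, otherwise probe upward."""
    target = model_rate(loss_rate, rtt)
    if rate > target:
        return target          # reduce to the model's rate
    return rate + increase     # otherwise increase the transmission rate

target = model_rate(0.01, 0.1)            # about 141 packets/sec
too_fast = tfrc_step(200.0, 0.01, 0.1)    # pulled down to the model's rate
too_slow = tfrc_step(100.0, 0.01, 0.1)    # allowed to increase
print(target, too_fast, too_slow)
```

How loss rate and RTT are measured and smoothed, the question the slide ends on, is left open here; the sketch takes both as given inputs.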

High speed TCP

TCP in high speed networks

  Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput

  Requires window size W = 83,333 in-flight segments
  Throughput in terms of loss rate:

  p = 2 x 10^-10, or equivalently at most one drop every couple of hours

  New versions of TCP for high-speed networks needed
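The numbers in this example can be reproduced with a few lines of arithmetic. The throughput formula itself is missing from the transcript; the sketch assumes the widely used approximation throughput ≈ 1.22 * MSS / (RTT * sqrt(p)), which gives a loss rate close to the quoted 2 x 10^-10.

```python
MSS_BITS = 1500 * 8      # segment size in bits
RTT = 0.100              # seconds
TARGET_BPS = 10e9        # desired throughput, bits/sec

# Window needed to keep the pipe full: bandwidth-delay product in segments.
window = TARGET_BPS * RTT / MSS_BITS

# Invert W = 1.22 / sqrt(p) (assumed model) for the tolerable loss rate.
loss_rate = (1.22 / window) ** 2

# Average time between drops at that loss rate.
packets_per_sec = window / RTT
seconds_between_drops = 1 / (loss_rate * packets_per_sec)

print(round(window), loss_rate, seconds_between_drops / 3600)
```

The last value comes out between one and two hours, in line with "at most one drop every couple of hours".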

TCP's long recovery delay

  More than an hour to recover from a loss or timeout

[Figure: window recovery after a loss: ~41,000 packets, ~60,000 RTTs, ~100 minutes]

High-speed TCP

  Proposals
    Scalable TCP, HSTCP, FAST, CUBIC
    General idea is to use superlinear window increase
    Particularly useful in high bandwidth-delay product regimes

Alternate choices of response functions

[Figure: response functions; Scalable TCP: S = 0.15/p]

Q: Whatever happened to TCP-friendly?

High speed TCP [Floyd]

  additive increase, multiplicative decrease
  increments, decrements depend on window size

Scalable TCP (STCP) [T. Kelly]

  multiplicative increase, multiplicative decrease

  W ← W + a per ACK
  W ← W - b W per window with loss
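The STCP update rules can be sketched as follows. The constants a = 0.01 and b = 0.125 are values commonly quoted for Scalable TCP and are an assumption here, since the slide gives no numbers; the function names are hypothetical.

```python
def stcp_on_ack(w: float, a: float = 0.01) -> float:
    """W <- W + a for every ACK received."""
    return w + a

def stcp_on_loss_window(w: float, b: float = 0.125) -> float:
    """W <- W - b * W once per window in which a loss was detected."""
    return w - b * w

w = 100.0
for _ in range(100):        # roughly one window's worth of ACKs
    w = stcp_on_ack(w)
after_rtt = w               # about 1% larger than before

after_loss = stcp_on_loss_window(after_rtt)
print(after_rtt, after_loss)
```

Because about W ACKs arrive per RTT, the per-ACK constant a turns into growth by a factor of roughly (1 + a) per RTT, which is why the scheme is described as multiplicative increase.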

STCP dynamics

From 1st PFLDnet Workshop, Tom Kelly

Active Queue Management

Router Queue Management

  normally, packets dropped only when queue overflows
    "drop-tail" queueing

[Figure: packets P1-P6 queued at a router's FCFS scheduler, on the path between a router and the Internet]

The case against drop-tail queue management

  Large queues in routers are "a bad thing"
    Delay: end-to-end latency dominated by length of queues at switches in network
  Allowing queues to overflow is "a bad thing"
    Fairness: connections transmitting at high rates can starve connections transmitting at low rates
    Utilization: connections can synchronize their response to congestion

[Figure: FCFS scheduler queue with packets P1-P6]

Idea early random packet drop

When queue length exceeds threshold, drop packets with queue-length-dependent probability
  probabilistic packet drop: flows see same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

[Figure: FCFS scheduler queue with packets P1-P6]

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop
    avoid overly penalizing short-term bursts
    react to longer-term trends

  Tie drop probability to weighted average queue length
    avoids over-reaction to mild overload conditions

[Figure: average queue length over time, up to the max queue length, with min and max thresholds. Regions: no drop (below min threshold), probabilistic early drop (between thresholds), forced drop (above max threshold).]

Random early detection (RED) packet drop

[Figure: average queue length over time, up to the max queue length, with min and max thresholds. Regions: no drop, probabilistic early drop, forced drop.]

[Figure: drop probability vs. weighted average queue length: zero below min, rising to maxp at max, then 100%]
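The drop-probability curve in the figure can be written down directly. This is a sketch of the basic shape only: the averaging weight of 0.002 and the threshold values are illustrative assumptions, and refinements such as spacing drops by a packet count are left out.

```python
def update_average(avg: float, queue_len: float, weight: float = 0.002) -> float:
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1 - weight) * avg + weight * queue_len

def drop_probability(avg: float, min_th: float, max_th: float, max_p: float) -> float:
    """0 below min_th, linear up to max_p at max_th, forced drop above max_th."""
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)

# Illustrative thresholds (assumed): min 5 packets, max 15 packets, maxp 0.1.
below = drop_probability(4.0, 5.0, 15.0, 0.1)
middle = drop_probability(10.0, 5.0, 15.0, 0.1)
above = drop_probability(20.0, 5.0, 15.0, 0.1)

# A short burst of 100 packets barely moves the average when the weight is small.
avg_after_burst = update_average(0.0, 100.0)
print(below, middle, above, avg_after_burst)
```

The small averaging weight is what lets short bursts through while sustained overload still pushes the average past the thresholds.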

RED summary why random drop

  Provide gentle transition from no-drop to all-drop
    Provide "gentle" early warning
  Avoid synchronized loss bursts among sources
  Provide same loss rate to all sessions
    With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters: nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed ...

Why is randomization important?

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source: Floyd, Jacobson 1994

Router update operation

[Figure: router update state machine]
  prepare own routing update (time TC)
  <ready>: send update (time Td to arrive at dest), then start_timer (uniform Tp +- Tr)
  wait; on receiving an update from a neighbor, process it (time TC2)
  on timeout or link failure: update
  time spent in state depends on msgs received from others (weak coupling between routers' processing)


Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization   Choose random

timer component Tr large (eg several multiples of TC)

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough

randomization to avoid

synchronization

Randomization

  Takeaway message  randomization makes a system simple and

robust

Background transport TCP Nice

What are background transfers

  Data that humans are not waiting for   Non-deadline-critical   Unlimited demand

  Examples  Prefetched traffic on the Web  File system backup  Large-scale data distribution services  Background software updates  Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers  Self-interference

bull  applications hurt their own performance  Cross-interference

bull  applications hurt other applicationsrsquo performance

TCP Nice

  Goal abstraction of free infinite bandwidth   Applications say what they want

 OS manages resources and scheduling

  Self tuning transport layer  Reduces risk of interference with foreground

traffic  Significant utilization of spare capacity by

background traffic  Simplifies application design

Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal  Congestion incr queue lengths incr RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control  Receiver and network unchanged  TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

Nice the works

  Non-interference getting out of the way in time   Utilization maintaining a small queue

pkts

minRTT = τ13 maxRTT = τ+Βmicro13

B

tB Add Mul +

micro

Reno

Nice Add Add Add

Mul +

Mul +

Network Conditions

01

1

10

100

1e3

1 10 100 Fore

grou

nd D

ocum

ent L

aten

cy (s

ec)

Spare Capacity

Reno

Vegas

V0

Nice

Router Prio

  Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note all sources ldquoequalrdquo)   constraint bandwidth used less than capacity   centralized solution to optimization impractical

 must know all utility functions   impractical for large number of sources  can we view congestion control as distributed

asynchronous algorithms to solve this problem

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0 ldquosystemrdquo problem

The user view

  User can choose amount to pay per unit time ws

  Would like allocated bandwidth xs in proportion to ws

euro

max Usw s

ps

⎝ ⎜

⎠ ⎟ minus ws

subject to ws ge 0

  ps could be viewed as charge per unit flow for user s s

ss pwx =

userrsquos utility cost

user problem

The network view

  Suppose network knows vector ws chosen by users   Network wants to maximize logarithmic utility function

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

network problem

Solution existence

  There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that   Ws solves user

problem   Xs solves the

network problem   Xs is the unique

solution to the system problem

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

0 wsubject to

w Umax

s

ss

ge

minus⎟⎟⎠

⎞⎜⎜⎝

⎛s

s

wp

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0

Proportional Fairness

  Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s^*},

    \sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0

  Result: if w_r = 1, then {x_s} solves the network problem IFF it is proportionally fair
  Similar result exists for the case that w_r is not equal to 1
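The definition can be checked numerically on the two-link, unit-capacity example from the Model slide. With all weights equal to 1, maximizing log x_1 + log x_2 + log x_3 subject to x_1 + x_2 <= 1 and x_1 + x_3 <= 1 gives x = (1/3, 2/3, 2/3). The sketch below samples random feasible vectors and evaluates the sum in the definition; the helper names and the sampling scheme are assumptions made for illustration.

```python
import random

def prop_fair_sum(x, y):
    """Sum over sources of (y_s - x_s) / x_s, the quantity in the definition."""
    return sum((yi - xi) / xi for xi, yi in zip(x, y))

def random_feasible(rng):
    """Random rate vector satisfying y1 + y2 <= 1 and y1 + y3 <= 1."""
    y1 = rng.uniform(0.0, 1.0)
    return (y1, rng.uniform(0.0, 1.0 - y1), rng.uniform(0.0, 1.0 - y1))

x_pf = (1 / 3, 2 / 3, 2 / 3)   # candidate proportionally fair allocation
x_eq = (0.5, 0.5, 0.5)         # equal shares, for comparison

rng = random.Random(0)
samples = [random_feasible(rng) for _ in range(10000)]

# Largest value of the sum over all sampled alternatives.
worst_pf = max(prop_fair_sum(x_pf, y) for y in samples)
worst_eq = max(prop_fair_sum(x_eq, y) for y in samples)
print(worst_pf <= 1e-9, worst_eq > 0)
```

For x_pf the sum never exceeds zero, as the definition requires, while the equal-share vector is beaten by alternatives such as (0, 1, 1), so it is not proportionally fair in this topology.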

Max-min Fairness

  Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p \le x_s and y_p < x_p
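A common way to compute a max-min fair allocation is progressive filling: raise all rates together, and freeze every source that crosses a link as soon as that link saturates. The slides do not prescribe this procedure; the sketch below is one minimal version, assuming every source uses at least one link. The function name max_min_fair and the example topology are illustrative.

```python
def max_min_fair(routes, capacity):
    """Progressive filling over routes {s: links} and capacity {l: c_l}."""
    rates = {s: 0.0 for s in routes}
    active = set(routes)
    residual = dict(capacity)
    while active:
        # Number of still-active sources on each link.
        counts = {l: sum(1 for s in active if l in routes[s]) for l in residual}
        # Largest equal increment before some link fills up.
        step = min(residual[l] / n for l, n in counts.items() if n > 0)
        for s in active:
            rates[s] += step
        for l, n in counts.items():
            residual[l] -= step * n
        # Freeze every source that crosses a saturated link.
        saturated = {l for l, n in counts.items() if n > 0 and residual[l] <= 1e-12}
        active = {s for s in active if not any(l in saturated for l in routes[s])}
    return rates

routes = {1: ("l1", "l2"), 2: ("l1",), 3: ("l2",)}
unit = max_min_fair(routes, {"l1": 1.0, "l2": 1.0})
uneven = max_min_fair(routes, {"l1": 1.0, "l2": 2.0})
print(unit, uneven)
```

With unit capacities every source gets 0.5. With capacities 1 and 2, link l1 saturates first, sources 1 and 2 freeze at 0.5, and source 3 takes the remaining 1.5 on l2.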

Minimum potential delay fairness

  Rates {x_r} are minimum potential delay fair if U_r(x_r) = -w_r / x_r

  Interpretation: if w_r is file size, then w_r / x_r is transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

  Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p \le x_s and y_p < x_p

  What is the corresponding utility function?

    U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}

Solving the network problem

  Results so far: existence - a solution exists with the given properties
  How to compute the solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

  \frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)

  where  p_l(t) = g_l\left( \sum_{s : l \in L(s)} x_s(t) \right)

  left-hand side: change in bandwidth allocation at s
  k w_s: linear increase
  k x_s(t) \sum_{l \in L(s)} p_l(t): multiplicative decrease
  p_l(t): congestion "signal", a function of the aggregate rate at link l, fed back to s

Solving the network problem

  \frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)

  Results
    converges to the solution of a relaxation of the network problem
    x_s(t) \sum_{l \in L(s)} p_l(t) converges to w_s
  Interpretation: TCP-like algorithm that iteratively solves for the optimal rate allocation
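The rate update d/dt x_s = k (w_s - x_s * sum of p_l over the links of s) can be tried out with a simple Euler discretization. The sketch below uses the two-link, unit-capacity example with w_s = 1 for all sources. The price function g_l(y) = (y / c_l)^4, the gain k, the step size and the function names are all assumptions chosen for illustration; the slides only require g_l to be a function of the aggregate rate at link l.

```python
ROUTES = {1: ("l1", "l2"), 2: ("l1",), 3: ("l2",)}   # L(s)
CAPACITY = {"l1": 1.0, "l2": 1.0}
W = {1: 1.0, 2: 1.0, 3: 1.0}                         # willingness to pay w_s

def link_price(load, cap, beta=4.0):
    # Assumed congestion signal g_l: smooth and increasing in the aggregate rate.
    return (load / cap) ** beta

def path_prices(x):
    """Sum of link prices along each source's path."""
    load = {l: sum(x[s] for s in ROUTES if l in ROUTES[s]) for l in CAPACITY}
    p = {l: link_price(load[l], CAPACITY[l]) for l in CAPACITY}
    return {s: sum(p[l] for l in ROUTES[s]) for s in ROUTES}

def simulate(k=0.5, dt=0.01, steps=20000):
    """Euler steps of dx_s/dt = k * (w_s - x_s * path price)."""
    x = {s: 0.1 for s in ROUTES}
    for _ in range(steps):
        q = path_prices(x)
        x = {s: x[s] + dt * k * (W[s] - x[s] * q[s]) for s in ROUTES}
    return x

x = simulate()
q = path_prices(x)
for s in ROUTES:
    # At equilibrium, rate times path price should equal w_s.
    print(s, round(x[s], 4), round(x[s] * q[s], 6))
```

At the fixed point each source spends x_s times its path price, which equals w_s, and the two-link source ends up with half the rate of the single-link sources. Because the price function is a soft penalty rather than a hard constraint, the link loads settle slightly above capacity, which is what "relaxation of the network problem" refers to.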

Source Algorithm

  \dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)

  Source needs only its path price q_r
  k_r(\cdot): nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  q_r interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

  U_r(x_r) = w_r \log x_r

then a controller that implements it is given by

  \dot{x}_r = k_r(x_r) \left( \frac{w_r}{x_r} - q_r \right)

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation?
  Suppose network charges price q_r ($/bit), where q_r = \sum_{l} p_l, summed over the links l used by r
  User's strategy: spend w_r ($/sec) to maximize

    U_r(w_r / q_r) - w_r

Optimal User Strategy

  equivalently
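A short derivation of the user's best response, assuming U_r is differentiable and strictly concave and that the user treats the path price q_r as fixed. The objective is the one from the user view (utility of the rate w_r / q_r minus the amount paid); the steps are ordinary first-order conditions, written out here as a sketch.

```latex
\max_{w_r \ge 0} \; U_r\!\left(\frac{w_r}{q_r}\right) - w_r
\quad\Longrightarrow\quad
\frac{1}{q_r}\, U_r'\!\left(\frac{w_r}{q_r}\right) - 1 = 0
\quad\Longleftrightarrow\quad
U_r'(x_r) = q_r, \qquad x_r = \frac{w_r}{q_r}.
```

For instance, with U_r(x) = a \log x for some constant a > 0, the condition gives x_r = a / q_r, so the user's optimal spending rate is w_r = a regardless of the price.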

Simplified TCP-Reno

  suppose

  then

  interpretation minimize (weighted) delay

pTpTp

x 2)1(2asymp

minus=

TxxU 1)( minus=

Is AIMD special

  Consider a window control as follows
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m

  Expected change in congestion window

  Expected change in rate per unit time
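The expected change can be made concrete under a simple assumption that is not stated on the slide: the update is applied once per packet, and each packet is lost independently with probability p. The expected change per packet is then (1 - p) * a * cwnd^n - p * b * cwnd^m, and setting it to zero gives the equilibrium window. Function names below are illustrative.

```python
import math

def expected_change(w, p, a, b, n, m):
    """Expected window change per packet with independent loss probability p."""
    return (1.0 - p) * a * w ** n - p * b * w ** m

def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero (requires n < m)."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))

# AIMD as a special case: +1/cwnd per ACK (n = -1), -cwnd/2 per loss (m = 1).
w_aimd = equilibrium_window(p=0.01, a=1.0, b=0.5, n=-1, m=1)
print(round(w_aimd, 3), round(math.sqrt(2 * 0.99 / 0.01), 3))
```

For the AIMD parameters the result reduces to sqrt(2 (1 - p) / p), the familiar inverse-square-root dependence of the window on the loss rate.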

MIMD (nm)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

  Congestion Control: maximize user utility
    Given routing R_li, how to adapt end rate x_i?
  Traffic Engineering: minimize network congestion
    Given traffic x_i, how to perform routing R_li?

Congestion Control Model

  \max \sum_i U_i(x_i)   s.t.   \sum_i R_{li} x_i \le c_l,   var. x

  \sum_i U_i(x_i): aggregate utility
  x_i: source rate; U_i(x_i): utility
  \sum_i R_{li} x_i \le c_l: capacity constraints
  Users are indexed by i

Congestion control provides fair rate allocation amongst users

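Traffic Engineering Model

  \min \sum_l f(u_l)   s.t.   u_l = \sum_i R_{li} x_i / c_l,   var. R

  \sum_l f(u_l): aggregate cost
  u_l: link utilization; f(u_l): cost
  Links are indexed by l

Traffic engineering avoids bottlenecks in the network

[Figure: cost f(u_l) versus link utilization u_l, with u_l = 1 marked]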
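The traffic engineering objective can be evaluated directly once a routing matrix is fixed. The sketch below computes u_l = sum_i R_li x_i / c_l and compares two routings of the same traffic. The cost function f(u) = u / (1 - u) is an assumed convex penalty that blows up as u approaches 1; the slides do not specify f, and the matrices and names are illustrative.

```python
def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for each link l."""
    return [sum(r_li * x_i for r_li, x_i in zip(row, x)) / c_l
            for row, c_l in zip(R, c)]

def cost(u):
    # Assumed convex penalty, growing sharply as utilization approaches 1.
    return float("inf") if u >= 1.0 else u / (1.0 - u)

def total_cost(R, x, c):
    """Aggregate cost: sum of f(u_l) over all links."""
    return sum(cost(u) for u in link_utilization(R, x, c))

# R[l][i] is the fraction of user i's traffic routed over link l.
R_single = [[1.0, 1.0], [0.0, 0.0]]   # both users on link 0
R_split = [[1.0, 0.0], [0.0, 1.0]]    # one user per link
x = [0.4, 0.4]                        # source rates
c = [1.0, 1.0]                        # link capacities

print(link_utilization(R_single, x, c), total_cost(R_single, x, c))
print(link_utilization(R_split, x, c), total_cost(R_split, x, c))
```

Spreading the two users over both links lowers the aggregate cost, which is the sense in which traffic engineering avoids bottlenecks.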

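Model of Internet Reality

  Congestion Control:   \max \sum_i U_i(x_i)   s.t.   \sum_i R_{li} x_i \le c_l
  Traffic Engineering:  \min \sum_l f(u_l)   s.t.   u_l = \sum_i R_{li} x_i / c_l

[Figure: the two problems coupled through the rates x_i and the routing R_li]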

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark:

    \max \sum_i U_i(x_i)   s.t.   Rx \le c,   var. x, R

  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

High speed TCP

TCP in high speed networks

  Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
  Requires window size W = 83,333 in-flight segments
  Throughput in terms of loss rate p:

    throughput = 1.22 * MSS / (RTT * \sqrt{p})

  => p = 2 * 10^{-10}, or equivalently at most one drop every couple of hours
  New versions of TCP for high-speed networks needed
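The numbers in this example can be reproduced with a few lines of arithmetic. The sketch assumes the widely used approximation throughput = 1.22 * MSS / (RTT * sqrt(p)) for the relation between throughput and loss rate; variable names are illustrative.

```python
MSS_BITS = 1500 * 8      # segment size in bits
RTT = 0.100              # round-trip time in seconds
TARGET_BPS = 10e9        # desired throughput in bits per second

# Window needed to keep the pipe full: bandwidth-delay product in segments.
window_segments = TARGET_BPS * RTT / MSS_BITS

# Invert throughput = 1.22 * MSS / (RTT * sqrt(p)) to get the tolerable loss rate.
loss_rate = (1.22 * MSS_BITS / (RTT * TARGET_BPS)) ** 2

# Average time between losses at the target sending rate.
segments_per_second = TARGET_BPS / MSS_BITS
hours_between_drops = 1.0 / (loss_rate * segments_per_second) / 3600.0

print(round(window_segments), loss_rate, round(hours_between_drops, 2))
```

The window comes out at about 83,333 segments, the loss rate at roughly 2 * 10^-10, and the spacing between drops at a bit over an hour and a half.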

TCP's long recovery delay

  More than an hour to recover from a loss or timeout

[Figure: window size over time; annotations: ~41,000 packets, ~60,000 RTTs, ~100 minutes]

High-speed TCP

  Proposals: Scalable TCP, HSTCP, FAST, CUBIC
  General idea is to use superlinear window increase
  Particularly useful in high bandwidth-delay product regimes

Alternate choices of response functions

  Scalable TCP: S = 0.15/p

  Q: Whatever happened to TCP-friendly?

High speed TCP [Floyd]

  additive increase, multiplicative decrease
  increments, decrements depend on window size

Scalable TCP (STCP) [T. Kelly]

  multiplicative increase, multiplicative decrease

  W <- W + a      per ACK
  W <- W - b W    per window with loss
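The two STCP rules explain why its recovery time does not grow with the window. About W ACKs arrive per round trip, each adding a, so the window grows by a factor of roughly (1 + a) per RTT, and undoing a loss takes the same number of RTTs at any window size. The sketch below assumes a = 0.01 and b = 0.125, the values commonly quoted for Scalable TCP; the slide leaves a and b unspecified, and the function names are illustrative.

```python
A = 0.01     # per-ACK increment (assumed value)
B = 0.125    # fractional decrease per window with loss (assumed value)

def on_ack(w):
    """Window update for one ACK."""
    return w + A

def on_loss(w):
    """Window update for a window with loss."""
    return w - B * w

def rtts_to_recover(w_before_loss):
    """Round trips needed to climb back to the pre-loss window."""
    w = on_loss(w_before_loss)
    rtts = 0
    while w < w_before_loss:
        # Roughly w ACKs arrive per RTT, each adding A: growth factor (1 + A).
        w += A * w
        rtts += 1
    return rtts

print(rtts_to_recover(100.0), rtts_to_recover(100000.0))
```

Both calls report the same number of round trips, in contrast to standard TCP, whose additive increase needs about W/2 round trips to recover.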

STCP dynamics

From 1st PFLDnet Workshop, Tom Kelly

Active Queue Management

Router Queue Management

  normally, packets dropped only when queue overflows
  "drop-tail" queueing

[Figure: two routers connected through the Internet; packets P1 to P6 queued at a FCFS scheduler]

The case against drop-tail queue management

  Large queues in routers are "a bad thing"
    Delay: end-to-end latency dominated by length of queues at switches in network
  Allowing queues to overflow is "a bad thing"
    Fairness: connections transmitting at high rates can starve connections transmitting at low rates
    Utilization: connections can synchronize their response to congestion

[Figure: packets P1 to P6 at a FCFS scheduler]

Idea: early random packet drop

  When queue length exceeds threshold, drop packets with queue-length-dependent probability
    probabilistic packet drop: flows see same loss rate
    problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

[Figure: packets P1 to P6 at a FCFS scheduler]

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop
    avoid overly penalizing short-term bursts
    react to longer-term trends
  Tie drop probability to weighted average queue length
    avoids over-reaction to mild overload conditions

[Figure: average queue length versus time, against the max queue length. Below min threshold: no drop. Between min and max threshold: probabilistic early drop. Above max threshold: forced drop.]

[Figure: drop probability versus weighted average queue length. Zero below min, rising to max_p at max, then 100%.]
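The two ingredients of RED described above, an exponentially weighted average of the queue length and a drop probability tied to that average, can be sketched as follows. The function names, the averaging weight 0.002 and the thresholds in the example are illustrative assumptions; real implementations include further refinements that are omitted here.

```python
def update_average(avg, queue_len, weight=0.002):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - weight) * avg + weight * queue_len

def drop_probability(avg, min_th, max_th, max_p):
    """Drop probability as a function of the average queue length."""
    if avg < min_th:
        return 0.0                                       # no drop
    if avg >= max_th:
        return 1.0                                       # forced drop
    return max_p * (avg - min_th) / (max_th - min_th)    # probabilistic early drop

# A short burst: 10 arrivals that each see an instantaneous queue of 50 packets.
avg = 0.0
for q in [50] * 10:
    avg = update_average(avg, q)

print(round(avg, 3), drop_probability(avg, min_th=5, max_th=15, max_p=0.1))
```

After the burst the average is still below 1 packet, so nothing is dropped even though the instantaneous queue is far above both thresholds. Only sustained congestion moves the average into the early-drop region.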

RED summary: why random drop?

  Provide gentle transition from no-drop to all-drop
    Provide "gentle" early warning
  Avoid synchronized loss bursts among sources
  Provide same loss rate to all sessions
    With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
  Gains over drop-tail FCFS not that significant
  Still not widely deployed ...

Why randomization important

  Synchronization of periodic routing updates
  Periodic losses observed in end-end Internet traffic

Source: Floyd, Jacobson 1994

Router update operation

[State diagram, with states and transitions as labeled on the slide:
  prepare own routing update (time TC)
  <ready>: send update (time Td to arrive at dest), start_timer (uniform Tp +- Tr)
  wait
  receive update from neighbor: process (time TC2)
  timeout or link fail: update]

  time spent in state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization

  Choose random timer component Tr large (e.g., several multiples of TC)
  Add enough randomization to avoid synchronization

[State diagram as in "Router update operation": prepare own routing update (time TC); <ready>: send update (time Td to arrive), start_timer (uniform Tp +- Tr); wait; receive update from neighbor: process (time TC2)]

Randomization

  Takeaway message: randomization makes a system simple and robust
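The timer in the router state diagram, start_timer(uniform Tp +- Tr), can be sketched in a few lines. The values of TC, TP and TR below are assumptions picked only to illustrate choosing Tr as several multiples of TC; the function name is hypothetical.

```python
import random

def next_update_delay(tp, tr, rng=random):
    """Delay until the next periodic update, uniform on [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)

TC = 0.1        # assumed time to prepare an update, in seconds
TP = 30.0       # assumed nominal update period, in seconds
TR = 10 * TC    # random component chosen as several multiples of TC

rng = random.Random(1)
delays = [next_update_delay(TP, TR, rng) for _ in range(1000)]
print(round(min(delays), 2), round(max(delays), 2))
```

Each router draws its own delay on every round, so routers that happen to fire together in one round are unlikely to stay aligned in the next.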

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties

  Utilization of spare network capacity
  No interference with regular transfers
    Self-interference: applications hurt their own performance
    Cross-interference: applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
  Applications say what they want
    OS manages resources and scheduling
  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?

  TCP does network resource management
    Need flow prioritization
  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy
  Question
    Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion
  Uses increasing RTT as congestion signal
    Congestion => increased queue lengths => increased RTT
  Aggressive responsiveness to congestion
  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice

  Basic algorithm
    1. Early detection => threshold queue length => increase in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)
  per-ack operation
    if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
  per-round operation
    if (numCong > f * W) W <- W / 2
    else ... AIMD congestion control
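The per-ack and per-round operations can be put together in a small sender-side sketch. The class name NiceState, the default values threshold = 0.2 and fraction = 0.5, and the plain +1 increase standing in for the usual AIMD branch are assumptions for illustration; loss handling and the pacing needed when cwnd drops below 1 are left out.

```python
class NiceState:
    """Minimal sketch of Nice's delay-based early congestion response."""

    def __init__(self, cwnd=2.0, threshold=0.2, fraction=0.5):
        self.cwnd = cwnd
        self.threshold = threshold      # position of the RTT limit between min and max RTT
        self.fraction = fraction        # f: share of a window that must see high RTT
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        # Track the RTT range and count ACKs whose RTT signals a growing queue.
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0            # multiplicative decrease; may fall below 1
        else:
            self.cwnd += 1.0            # stand-in for the normal AIMD increase
        self.num_cong = 0

s = NiceState(cwnd=8.0)
for rtt in [0.1, 0.2] + [0.19] * 6:     # most ACKs in this round see an inflated RTT
    s.on_ack(rtt)
s.on_round_end()
print(s.cwnd)
```

In the example seven of the eight ACKs exceed the RTT limit, which is more than half the window, so the window is halved at the end of the round even though no packet was lost.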

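Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

[Figure: queue length (pkts, buffer size B) versus time for Reno and Nice, marking additive (Add) and multiplicative (Mul) phases and the threshold t*B; minRTT = \tau, maxRTT = \tau + B/\mu]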

Network Conditions

[Figure: foreground document latency (sec, log scale) versus spare capacity, for Reno, Vegas, V0, Nice, and Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale) versus number of background flows, for Vegas, V0, Nice, Router Prio, and Reno]

  W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: background throughput (KB) versus number of background flows, for Router Prio, Vegas, V0, Reno, and Nice]

  Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
  How to model the interaction between TCP and the network?
    Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function
  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
  systematic approach towards protocol design


What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note all sources ldquoequalrdquo)   constraint bandwidth used less than capacity   centralized solution to optimization impractical

 must know all utility functions   impractical for large number of sources  can we view congestion control as distributed

asynchronous algorithms to solve this problem

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0 ldquosystemrdquo problem

The user view

  User can choose amount to pay per unit time ws

  Would like allocated bandwidth xs in proportion to ws

euro

max Usw s

ps

⎝ ⎜

⎠ ⎟ minus ws

subject to ws ge 0

  ps could be viewed as charge per unit flow for user s s

ss pwx =

userrsquos utility cost

user problem

The network view

  Suppose network knows vector ws chosen by users   Network wants to maximize logarithmic utility function

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

network problem

Solution existence

  There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that   Ws solves user

problem   Xs solves the

network problem   Xs is the unique

solution to the system problem

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

0 wsubject to

w Umax

s

ss

ge

minus⎟⎟⎠

⎞⎜⎜⎝

⎛s

s

wp

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0

Proportional Fairness

  Vector of rates xs proportionally fair if feasible and for any other feasible vector xs

0

leminus

sumisinSs s

ss

xxx

  Result if wr=1 then Xs solves the network problem IFF it is proportionally fair

  Similar result exists for the case that wr not equal 1

Max-min Fairness

Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

Minimum potential delay fairness

  Rates xr are minimum potential delay fair if Ur (xr) = -wrxr

Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays

Max-min Fairness

rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

What is corresponding utility function

α

α

α minus=

minus

infinrarr 1lim)(

1r

rrxxU

Solving the network problem   Results so far existence - solution exists

with given properties   How to compute solution

 Ideally distributed solution easily embodied in protocol

 Should reveal insight into existing protocol

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

congestion ldquosignalrdquo function of aggregate rate at link l fed back to s

change in bandwidth

allocation at s

linear increase

multiplicative decrease

⎟⎟⎠

⎞⎜⎜⎝

⎛= sum

isin

)()()(txgtp

sLlsllwhere

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

  Results   converges to solution of relaxation of network

problem  xs(t)Σpl(t) converges to ws

  Interpretation TCP-like algorithm to iteratively solves optimal rate allocation

Source Algorithm

  Source needs only its path price

  kr() nonnegative nondecreasing function   Above algorithm converges to unique

solution for any initial condition   qr interpreted as lossmarking probability euro

˙ x r = kr (xr )(Ur (xr ) minus qr)

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation

  Suppose network charges price qr ($bit) where qr=sum pl

  Userrsquos strategy spend wr ($sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

  then

  interpretation minimize (weighted) delay

pTpTp

x 2)1(2asymp

minus=

TxxU 1)( minus=

Is AIMD special

  Consider a window control as follows  cwnd += acwnd^n when no loss  cwnd -= bcwnd^m when loss  where nltm

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (nm)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

Congestion Control maximize user

utility

Traffic Engineering minimize network

congestion Given routing Rli how to adapt end rate xi

Given traffic xi how to perform routing Rli

Congestion Control Model

max sum i Ui(xi) st sumi Rlixi le cl var x

aggregate utility

Source rate xi

Utility Ui(xi)

capacity constraints

Users are indexed by i

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

min suml f(ul) st ul =sumi Rlixicl var R Link Utilization ul

Cost f(ul)

aggregate cost

Links are indexed by l

Traffic engineering avoids bottlenecks in the network

ul = 1

Model of Internet Reality

xi Rli

Congestion Control max sumi Ui(xi) st sumi Rlixi le cl

Traffic Engineering min suml f(ul)

st ul =sumi Rlixicl

System Properties

  Convergence   Does it achieve some objective   Benchmark

  Utility gap between the joint system and benchmark

max sumi Ui(xi) st Rx le c Var x R

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

TCPrsquos long recovery delay

  More than an hour to recover from a loss or timeout

~41000 packets

~60000 RTTs ~100 minutes

High-speed TCP

  Proposals  Scalable TCP HSTCP FAST CUBIC  General idea is to use superlinear window

increase  Particularly useful in high bandwidth-delay

product regimes

Alternate choices of response functions

Scalable TCP - S = 015p

Q Whatever happened to TCP-friendly

High speed TCP [Floyd]

  additive increase multiplicative decrease

  increments decrements depend on window size

Scalable TCP (STCP) [T Kelly]

  multiplicative increase multiplicative decrease

W larr W + a per ACK W larr W ndash b W per window with loss

STCP dynamics

From 1st PFLDnet Workshop Tom Kelly13

Active Queue Management

Router Queue Management

  normally packets dropped only when queue overflows   ldquodrop-tailrdquo queueing

router Internet

P113P213P313P413P513P613FCFS13

Scheduler13

router

The case against drop-tail queue management

  Large queues in routers are ldquoa bad thingrdquo  Delay end-to-end latency dominated by length

of queues at switches in network   Allowing queues to overflow is ldquoa bad thingrdquo

 Fairness connections transmitting at high rates can starve connections transmitting at low rates

 Utilization connections can synchronize their response to congestion

P113P213P313P413FCFS

Scheduler P513P613

Idea early random packet drop

When queue length exceeds threshold drop packets with queue length dependent probability  probabilistic packet drop flows see same loss

rate  problem bursty traffic (burst arrives when

queue is near threshold) can be over penalized

P113P213P313P413P513P613FCFS

Scheduler

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop
    avoid overly penalizing short-term bursts
    react to longer-term trends
  Tie drop prob. to weighted avg. queue length
    avoids over-reaction to mild overload conditions

[Figure: average queue length vs. time, with min and max thresholds separating three regions: no drop, probabilistic early drop, forced drop]

[Figure: drop probability (marks: maxp, 100%) vs. weighted average queue length (marks: min, max)]
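The drop rule sketched in the two RED figures can be written down compactly. The following is a simplified sketch, not the full RED algorithm: it keeps an exponentially weighted average of the queue length and maps it to a drop probability that is 0 below the min threshold, rises linearly to maxp at the max threshold, and is 1 above it. The class name, parameter names, and the default averaging weight are assumptions made for illustration.

```python
import random


class RedQueue:
    """Simplified RED drop decision (no per-packet count correction)."""

    def __init__(self, min_th, max_th, max_p, weight=0.002):
        self.min_th, self.max_th = min_th, max_th
        self.max_p, self.weight = max_p, weight
        self.avg = 0.0  # weighted average queue length

    def update_avg(self, qlen):
        # Exponential average: short bursts move the average only slightly.
        self.avg = (1 - self.weight) * self.avg + self.weight * qlen
        return self.avg

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0  # no drop
        if self.avg >= self.max_th:
            return 1.0  # forced drop
        # Probabilistic early drop: linear ramp from 0 up to max_p.
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def should_drop(self, qlen, rng=random):
        self.update_avg(qlen)
        return rng.random() < self.drop_probability()
```

A router would call `should_drop(current_queue_length)` on each arrival; the five parameters mentioned later on the "RED today" slide correspond roughly to the thresholds, maxp, the averaging weight, and the buffer size.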

RED summary: why random drop?

  Provide gentle transition from no-drop to all-drop
    Provide "gentle" early warning
  Avoid synchronized loss bursts among sources
  Provide same loss rate to all sessions
    With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters: nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed ...

Why is randomization important?

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source: Floyd, Jacobson 1994

Router update operation

[State diagram: router update operation]
  prepare own routing update (time TC)
  receive update from neighbor: process (time TC2)
  wait; receive update from neighbor: process
  <ready>: send update (time Td to arrive at dest), start_timer (uniform Tp +- Tr)
  timeout or link fail: update
  time spent in state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis: time until routing update sent, relative to start of round

  By t=100000, all router rounds are of length 120

  synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

  Choose random timer component Tr large (e.g., several multiples of TC)
  Add enough randomization to avoid synchronization

[State diagram, as in "Router update operation": prepare own routing update (time TC); receive update from neighbor, process (time TC2); wait; <ready> send update (time Td to arrive), start_timer (uniform Tp +- Tr)]
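The timer step in the diagram, start_timer (uniform Tp +- Tr), amounts to drawing each inter-update delay from a uniform interval around the nominal period. A minimal sketch, with a hypothetical function name and hypothetical parameter values:

```python
import random


def next_update_delay(tp, tr, rng=random):
    """Draw the next routing-update timer uniformly from [tp - tr, tp + tr]."""
    if not 0 <= tr <= tp:
        raise ValueError("need 0 <= tr <= tp")
    return rng.uniform(tp - tr, tp + tr)


# Hypothetical values: nominal period 120 s, random component 30 s.
rng = random.Random(0)
delays = [next_update_delay(120.0, 30.0, rng) for _ in range(5)]
print(delays)
```

The larger tr is relative to the processing time TC, the less likely routers are to stay locked to a common firing time after an update briefly aligns them.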

Randomization

  Takeaway message: randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers
    Self-interference: applications hurt their own performance
    Cross-interference: applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
    Applications say what they want
    OS manages resources and scheduling
  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?

  TCP does network resource management
    Need flow prioritization
  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy
  Question
    Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal
    Congestion → incr. queue lengths → incr. RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice

  Basic algorithm
    1. Early detection: threshold queue length → incr. in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)
  per-ack operation
    if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
  per-round operation
    if (numCong > f * W) W ← W/2
    else ... AIMD congestion control
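The per-ack and per-round operations above can be sketched as a small sender-side state machine. This is an illustrative sketch only: the class and method names are hypothetical, the default values threshold = 0.2 and f = 0.5 are assumptions, and loss handling and the details of the fallback AIMD step are reduced to a one-packet-per-round increase.

```python
class NiceSender:
    """Sketch of TCP Nice's RTT-based early congestion detection."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold    # position between minRTT and maxRTT
        self.fraction = fraction      # f: share of a window's ACKs that must look congested
        self.cwnd = cwnd              # W, allowed to fall below 1.0
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        # Track the RTT extremes seen so far, then apply the per-ack test.
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        # Per-round test: halve on early congestion, otherwise additive increase.
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2
        else:
            self.cwnd += 1  # stand-in for the regular AIMD increase
        self.num_cong = 0
        return self.cwnd
```

For example, a sender with cwnd = 4 that sees three of four ACKs with inflated RTTs halves its window at the end of the round, well before any packet is lost.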

Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

[Figure: queue occupancy (pkts) for Reno and Nice, alternating additive (Add) and multiplicative (Mul) phases; buffer size B, threshold t·B, service rate μ; minRTT = τ, maxRTT = τ + B/μ]

Network Conditions

[Figure: foreground document latency (sec, log scale 0.1 to 1e3) vs. spare capacity (1 to 100); curves: Reno, Vegas, V0, Nice, Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale 0.1 to 1e3) vs. number of BG flows (1 to 100); curves: Vegas, V0, Nice, Router Prio, Reno]

  W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: BG throughput (KB, 0 to 8e4) vs. number of BG flows (1 to 100); curves: Router Prio, Vegas, V0, Reno, Nice]

  Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
  How to model the interaction between TCP and the network?
    Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function
  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
    systematic approach towards protocol design

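Model

  Network: links l, each of capacity c_l
  Sources s: (L(s), U_s(x_s))
    L(s) - links used by source s
    U_s(x_s) - utility if source rate = x_s

[Figure: two links with capacities c1 and c2; source x1 crosses both links, x2 uses the first link, x3 uses the second]

$$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$$

[Figure: U_s(x_s) vs. x_s, example utility function for elastic application]

Q: What are possible allocations with, say, unit-capacity links?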

Optimization Problem

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
  can we view congestion control as distributed asynchronous algorithms to solve this problem?

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L \qquad \text{("system" problem)}$$

The user view

  User can choose amount to pay per unit time, w_s

  Would like allocated bandwidth x_s in proportion to w_s: x_s = w_s / p_s

  p_s could be viewed as charge per unit flow for user s

$$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \qquad \text{(user problem)}$$

  first term: user's utility; second term: cost

The network view

  Suppose network knows vector w_s, chosen by users
  Network wants to maximize logarithmic utility function

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \qquad \text{(network problem)}$$

Solution existence

  There exist prices p_s, source rates x_s, and amounts to pay per unit time w_s = p_s x_s such that
    w_s solves the user problem
    x_s solves the network problem
    x_s is the unique solution to the system problem

User problem:

$$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$$

Network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

System problem:

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$

Proportional Fairness

  Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s*},

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

  Result: if w_r = 1, then {x_s} solves the network problem iff it is proportionally fair

  Similar result exists for the case that w_r is not equal to 1

Max-min Fairness

Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
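A max-min fair allocation can be computed by progressive filling: raise all rates together, and freeze every source that crosses a link once that link saturates. The sketch below is one way to write it; the function name and the dictionary-based network description are assumptions, and every source is assumed to use at least one listed link.

```python
def max_min_fair(capacity, routes):
    """capacity: {link: c_l}; routes: {source: set of links}. Returns {source: rate}."""
    rate = {s: 0.0 for s in routes}
    residual = dict(capacity)
    active = set(routes)
    while active:
        # Number of still-growing sources on each link.
        users = {l: sum(1 for s in active if l in routes[s]) for l in residual}
        # Largest equal increment that no link's residual capacity forbids.
        step = min(residual[l] / n for l, n in users.items() if n > 0)
        for s in active:
            rate[s] += step
        for l, n in users.items():
            residual[l] -= step * n
        # Sources crossing a saturated link are frozen at their current rate.
        saturated = {l for l, n in users.items() if n > 0 and residual[l] <= 1e-12}
        active = {s for s in active if not (routes[s] & saturated)}
    return rate


# The three-source, two-link network from the "Model" slide, unit capacities.
routes = {"x1": {"l1", "l2"}, "x2": {"l1"}, "x3": {"l2"}}
print(max_min_fair({"l1": 1.0, "l2": 1.0}, routes))
```

On the unit-capacity example every source ends up with rate 1/2; if the second link is given more capacity, only x3 benefits, since x1 is still limited by the first link.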

Minimum potential delay fairness

  Rates {x_r} are minimum potential delay fair if U_r(x_r) = -w_r / x_r

  Interpretation: if w_r is file size, then w_r / x_r is transfer time; optimization problem is to minimize sum of transfer delays

Max-min Fairness

  Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p

  What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$

Solving the network problem

  Results so far: existence - solution exists with given properties
  How to compute solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k\left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  d/dt x_s(t): change in bandwidth allocation at s
  k w_s: linear increase
  k x_s(t) Σ p_l(t): multiplicative decrease
  p_l(t): congestion "signal", function of aggregate rate at link l, fed back to s

where

$$p_l(t) = g_l\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$

Solving the network problem

$$\frac{d}{dt} x_s(t) = k\left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  Results
    converges to solution of relaxation of network problem
    x_s(t) Σ p_l(t) converges to w_s

  Interpretation: TCP-like algorithm that iteratively solves optimal rate allocation
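The rate equation can be tried numerically on the three-source, two-link network from the "Model" slide. The sketch below uses a forward-Euler discretization and an assumed price function g_l(y) = (y / c_l)^beta, which is only one possible choice of congestion signal; the function name, the step gain k, and beta are all assumptions made for illustration.

```python
def kelly_primal(routes, capacity, w, k=0.01, beta=20, steps=20000):
    """Iterate x_s += k * (w_s - x_s * sum of link prices on s's path)."""
    x = {s: 0.1 for s in routes}
    for _ in range(steps):
        # Aggregate rate on each link, then the price it feeds back.
        load = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
        price = {l: (load[l] / capacity[l]) ** beta for l in capacity}
        for s in routes:
            q = sum(price[l] for l in routes[s])  # path price seen by source s
            x[s] = max(x[s] + k * (w[s] - x[s] * q), 1e-9)
    return x


routes = {"x1": {"l1", "l2"}, "x2": {"l1"}, "x3": {"l2"}}
rates = kelly_primal(routes, {"l1": 1.0, "l2": 1.0}, {s: 1.0 for s in routes})
print(rates)
```

With unit weights the iteration settles close to the proportionally fair allocation (1/3, 2/3, 2/3): the two-link source gets about half the rate of the one-link sources. Because the price function is a smooth penalty rather than a hard capacity constraint, the result is a solution of the relaxed problem and the link loads end up slightly off capacity.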

Source Algorithm

  Source needs only its path price:

$$\dot{x}_r = k_r(x_r)\left( U_r'(x_r) - q_r \right)$$

  k_r(·): nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  q_r interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by
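One pairing consistent with the source algorithm on the "Source Algorithm" slide, assuming a weighted logarithmic utility and the gain choice k_r(x_r) = κ x_r (both are assumptions here; κ is a hypothetical constant):

```latex
\begin{align*}
U_r(x_r) &= w_r \log x_r \quad\Rightarrow\quad U_r'(x_r) = \frac{w_r}{x_r} \\
\dot{x}_r &= \kappa\, x_r \left( \frac{w_r}{x_r} - q_r \right) = \kappa \left( w_r - x_r q_r \right)
\end{align*}
```

At equilibrium x_r q_r = w_r, which is the same fixed point as the rate equation under "Solving the network problem".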

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation?

  Suppose network charges price q_r ($/bit), where q_r = Σ p_l

  User's strategy: spend w_r ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$

  then

$$U(x) = -\frac{1}{Tx}$$

  interpretation: minimize (weighted) delay

Is AIMD special?

  Consider a window control as follows
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m

  Expected change in congestion window

  Expected change in rate per unit time
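The two expectations named above can be worked out under the usual fluid assumptions: each ACK signals a loss independently with probability p, and ACKs arrive at rate x = W/T for window W and round-trip time T. These assumptions and the symbols W, T, p are not spelled out on this slide, so the following is a sketch:

```latex
\begin{align*}
E[\Delta W \mid \text{one ACK}] &= (1-p)\, a W^{n} - p\, b W^{m} \\
\frac{dW}{dt} &\approx \frac{W}{T}\left[(1-p)\, a W^{n} - p\, b W^{m}\right],
\qquad \frac{dx}{dt} = \frac{1}{T}\,\frac{dW}{dt} \\
\text{equilibrium:}\quad W^{m-n} &= \frac{a\,(1-p)}{b\,p}
\end{align*}
```

With n = -1, m = 1, a = 1, b = 1/2 (AIMD) the equilibrium gives W = sqrt(2(1-p)/p), so x = W/T matches the simplified TCP-Reno rate on the earlier slide.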

MIMD (n,m)

  Consider the controller

  where

  Then at equilibrium

  where α = m - n. For stability

Motivation

  Congestion Control: maximize user utility
    Given routing R_li, how to adapt end rate x_i?
  Traffic Engineering: minimize network congestion
    Given traffic x_i, how to perform routing R_li?

Congestion Control Model

$$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l, \qquad \text{var. } x$$

  objective: aggregate utility; constraints: capacity constraints
  [Figure: utility U_i(x_i) vs. source rate x_i]
  Users are indexed by i
  Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

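$$\min \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l, \qquad \text{var. } R$$

  objective: aggregate cost
  [Figure: cost f(u_l) vs. link utilization u_l, with u_l = 1 marked]
  Links are indexed by l
  Traffic engineering avoids bottlenecks in the network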

Model of Internet Reality

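[Diagram: congestion control and traffic engineering coupled in a loop through x_i and R_li]

  Congestion Control: $\max \sum_i U_i(x_i)$, s.t. $\sum_i R_{li} x_i \le c_l$
  Traffic Engineering: $\min \sum_l f(u_l)$, s.t. $u_l = \sum_i R_{li} x_i / c_l$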

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark

$$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad Rx \le c, \qquad \text{var. } x, R$$

  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller


Alternate choices of response functions

Scalable TCP - S = 015p

Q Whatever happened to TCP-friendly

High speed TCP [Floyd]

  additive increase multiplicative decrease

  increments decrements depend on window size

Scalable TCP (STCP) [T Kelly]

  multiplicative increase multiplicative decrease

W larr W + a per ACK W larr W ndash b W per window with loss

STCP dynamics

From 1st PFLDnet Workshop Tom Kelly13

Active Queue Management

Router Queue Management

  normally packets dropped only when queue overflows   ldquodrop-tailrdquo queueing

router Internet

P113P213P313P413P513P613FCFS13

Scheduler13

router

The case against drop-tail queue management

  Large queues in routers are ldquoa bad thingrdquo  Delay end-to-end latency dominated by length

of queues at switches in network   Allowing queues to overflow is ldquoa bad thingrdquo

 Fairness connections transmitting at high rates can starve connections transmitting at low rates

 Utilization connections can synchronize their response to congestion

P113P213P313P413FCFS

Scheduler P513P613

Idea early random packet drop

When queue length exceeds threshold drop packets with queue length dependent probability  probabilistic packet drop flows see same loss

rate  problem bursty traffic (burst arrives when

queue is near threshold) can be over penalized

P113P213P313P413P513P613FCFS

Scheduler

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop  avoid overly penalizing short-term bursts   react to longer term trends

  Tie drop prob to weighted avg queue length  avoids over-reaction to mild overload conditions

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

Random early detection (RED) packet drop

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

10013

Drop probability

maxp13

Weighted AverageQueue Length

min13 max13

RED summary why random drop

  Provide gentle transition from no-drop to all-drop  Provide ldquogentlerdquo early warning  Avoid synchronized loss bursts among

sources   Provide same loss rate to all sessions

 With tail-drop low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed hellip

Why randomization important

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source Floyd Jacobson 1994

Router update operation

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive at dest)

start_timer (uniform Tp +- Tr)

timeout or link fail

update

time spent in state depends on msgs

received from others (weak coupling

between routers processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization   Choose random

timer component Tr large (eg several multiples of TC)

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough

randomization to avoid

synchronization

Randomization

  Takeaway message  randomization makes a system simple and

robust

Background transport TCP Nice

What are background transfers

  Data that humans are not waiting for   Non-deadline-critical   Unlimited demand

  Examples  Prefetched traffic on the Web  File system backup  Large-scale data distribution services  Background software updates  Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers  Self-interference

bull  applications hurt their own performance  Cross-interference

bull  applications hurt other applicationsrsquo performance

TCP Nice

  Goal abstraction of free infinite bandwidth   Applications say what they want

 OS manages resources and scheduling

  Self tuning transport layer  Reduces risk of interference with foreground

traffic  Significant utilization of spare capacity by

background traffic  Simplifies application design

Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal  Congestion incr queue lengths incr RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control  Receiver and network unchanged  TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

Nice the works

  Non-interference getting out of the way in time   Utilization maintaining a small queue

pkts

minRTT = τ13 maxRTT = τ+Βmicro13

B

tB Add Mul +

micro

Reno

Nice Add Add Add

Mul +

Mul +

Network Conditions

01

1

10

100

1e3

1 10 100 Fore

grou

nd D

ocum

ent L

aten

cy (s

ec)

Spare Capacity

Reno

Vegas

V0

Nice

Router Prio

  Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \quad \forall l \in L$$

"system" problem

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
  can we view congestion control as distributed asynchronous algorithms to solve this problem?

The user view

  User can choose amount to pay per unit time, $w_s$

  Would like allocated bandwidth $x_s$ in proportion to $w_s$:

$$x_s = \frac{w_s}{p_s}$$

  $p_s$ could be viewed as charge per unit flow for user $s$

$$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$$

  first term: user's utility; second term: cost

user problem

The network view

  Suppose network knows vector $w_s$ chosen by users
  Network wants to maximize logarithmic utility function

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

network problem

Solution existence

  There exist prices $p_s$, source rates $x_s$, and amount-to-pay-per-unit-time $w_s = p_s x_s$ such that
    $w_s$ solves the user problem
    $x_s$ solves the network problem
    $x_s$ is the unique solution to the system problem

Network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

User problem:

$$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$$

System problem:

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \quad \forall l \in L$$
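A short sketch of why the three problems line up, assuming each $U_s$ is differentiable and strictly concave and all optimal rates are positive. The multipliers $\mu_l$ are notation introduced here for the capacity constraints; they do not appear on the slide.

```latex
% System problem: stationarity of the Lagrangian with multipliers \mu_l \ge 0
U_s'(x_s) = \sum_{l \in L(s)} \mu_l

% User problem: set the derivative with respect to w_s to zero, with x_s = w_s / p_s
\frac{1}{p_s} U_s'\!\left(\frac{w_s}{p_s}\right) - 1 = 0
\quad \Longrightarrow \quad U_s'(x_s) = p_s

% Network problem: stationarity of \sum_s w_s \log x_s under the same constraints
\frac{w_s}{x_s} = \sum_{l \in L(s)} \mu_l

% Choosing p_s = \sum_{l \in L(s)} \mu_l makes all three conditions agree,
% and the last one then reads w_s = p_s x_s.
```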

Proportional Fairness

  Vector of rates $x_s$ is proportionally fair if it is feasible and, for any other feasible vector $x_s^*$,

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

  Result: if $w_r = 1$ then $x_s$ solves the network problem IFF it is proportionally fair

  Similar result exists for the case that $w_r$ is not equal to 1

Max-min Fairness

Rates $x_r$ are max-min fair if, for any other feasible rates $y_r$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$

Minimum potential delay fairness

  Rates $x_r$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates $x_r$ are max-min fair if, for any other feasible rates $y_r$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
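To see how this utility family moves between the fairness notions on the preceding slides, the sketch below evaluates it on an assumed example: one flow crossing two unit-capacity links, each shared with one single-link flow, with both links full. The closed form in `long_flow_rate` comes from setting the derivative of U(x1) + 2 U(1 - x1) to zero; function names are hypothetical.

```python
import math

def alpha_fair_utility(x, alpha):
    """U(x) = x^(1-alpha) / (1-alpha); the alpha = 1 case is taken as log x."""
    if alpha == 1:
        return math.log(x)
    return x ** (1 - alpha) / (1 - alpha)

def long_flow_rate(alpha):
    """Rate of the two-link flow in the assumed example.

    Maximizing U(x1) + 2*U(1 - x1) gives x1^(-alpha) = 2*(1 - x1)^(-alpha),
    so x1 = 1 / (1 + 2^(1/alpha)).
    """
    return 1.0 / (1.0 + 2.0 ** (1.0 / alpha))

for alpha in (1, 2, 10, 100):
    print(alpha, round(long_flow_rate(alpha), 4))
```

With alpha = 1 the long flow gets 1/3 (proportional fairness), alpha = 2 corresponds to the minimum potential delay utility, and as alpha grows the rate approaches the equal share 1/2 of max-min fairness.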

Solving the network problem

  Results so far: existence - solution exists with given properties
  How to compute solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  left-hand side: change in bandwidth allocation at s
  $k\,w_s$ term: linear increase
  $k\,x_s(t) \sum_{l \in L(s)} p_l(t)$ term: multiplicative decrease
  $p_l(t)$: congestion "signal", a function of the aggregate rate at link l, fed back to s

where

$$p_l(t) = g_l\!\left( \sum_{s \in S(l)} x_s(t) \right)$$

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  Results
    converges to solution of relaxation of network problem
    $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$

  Interpretation: TCP-like algorithm to iteratively solve the optimal rate allocation
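The convergence claim can be illustrated with a small Euler integration of the rate update. Everything concrete below is an assumption made for the sketch: the three-flow, two-link topology, unit weights and capacities, the gain and step size, and in particular the price function p_l = (load / capacity)^beta, which is just one increasing function of the aggregate link rate.

```python
ROUTES = [[0, 1], [0], [1]]      # L(s): links used by each source (assumed topology)
CAPACITY = [1.0, 1.0]            # c_l
WEIGHTS = [1.0, 1.0, 1.0]        # w_s

def path_prices(x, beta=8):
    """Sum of link prices along each path, with the assumed p_l = (load_l / c_l)**beta."""
    load = [sum(x[s] for s, links in enumerate(ROUTES) if l in links)
            for l in range(len(CAPACITY))]
    price = [(load[l] / CAPACITY[l]) ** beta for l in range(len(CAPACITY))]
    return [sum(price[l] for l in links) for links in ROUTES]

def simulate(steps=20000, dt=0.01, k=1.0):
    x = [0.1, 0.1, 0.1]
    for _ in range(steps):
        q = path_prices(x)
        # Euler step of dx_s/dt = k * (w_s - x_s * sum of path prices)
        x = [x_s + dt * k * (w_s - x_s * q_s)
             for x_s, w_s, q_s in zip(x, WEIGHTS, q)]
    return x

x = simulate()
print([round(v, 4) for v in x])
print([round(x_s * q_s, 6) for x_s, q_s in zip(x, path_prices(x))])  # approaches w_s
```

Because the price is a smooth penalty rather than a hard capacity constraint, the link loads settle slightly above 1 here, which is the sense in which the dynamics solve a relaxation of the network problem.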

Source Algorithm

  Source needs only its path price:

$$\dot{x}_r = k_r(x_r)\left(U_r'(x_r) - q_r\right)$$

  $k_r(\cdot)$: nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

$$U_r(x_r) = w_r \log x_r$$

then a controller that implements it is given by

$$\dot{x}_r = k_r(x_r)\left(\frac{w_r}{x_r} - q_r\right)$$

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation?

  Suppose network charges price q_r ($/bit), where q_r = Σ p_l

  User's strategy: spend w_r ($/sec) to maximize

$$U_r\!\left(\frac{w_r}{q_r}\right) - w_r$$

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$

  then

$$U(x) = -\frac{1}{Tx}$$

  interpretation: minimize (weighted) delay

Is AIMD special?

  Consider a window control as follows:
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m

  Expected change in congestion window

  Expected change in rate per unit time
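A minimal sketch of where such a controller settles, under assumptions that are not on the slide: the update is applied once per ACK, and each packet is lost independently with probability p. Setting the expected per-ACK change to zero then gives a closed form for the equilibrium window; `equilibrium_window` is a hypothetical helper name.

```python
import math

def equilibrium_window(a, b, n, m, p):
    """Window at which the expected per-ACK change is zero.

    Assumes independent loss with probability p per packet:
    (1 - p) * a * w**n - p * b * w**m = 0
    =>  w**(m - n) = a * (1 - p) / (b * p)
    """
    if not (0 < p < 1) or n >= m:
        raise ValueError("need 0 < p < 1 and n < m")
    return (a * (1 - p) / (b * p)) ** (1.0 / (m - n))

# AIMD applied per ACK: +1/w on success (n = -1), -w/2 on loss (m = 1)
w_aimd = equilibrium_window(a=1.0, b=0.5, n=-1, m=1, p=0.02)
print(round(w_aimd, 3))
```

For the AIMD parameters the result equals sqrt(2(1 - p)/p), which matches the window implied by the simplified TCP-Reno rate expression when the rate is taken as window divided by round-trip time.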

MIMD (n, m)

  Consider the controller

  where

  Then at equilibrium

  where α = m - n. For stability

Motivation

Congestion Control: maximize user utility
  Given routing $R_{li}$, how to adapt end rate $x_i$?

Traffic Engineering: minimize network congestion
  Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$

  objective: aggregate utility
  source rate: $x_i$
  utility: $U_i(x_i)$
  constraints: link capacity

Users are indexed by i

Congestion control provides fair rate allocation amongst users

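Traffic Engineering Model

$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l$$

  objective: aggregate cost
  link utilization: $u_l$
  cost: $f(u_l)$

Links are indexed by l

Traffic engineering avoids bottlenecks in the network

[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]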
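As a small illustration of the utilization formula, the sketch below computes u_l from a routing matrix and evaluates a cost. The routing matrix, rates, and capacities are made-up numbers, and the cost function u / (1 - u) is a hypothetical stand-in chosen only because it is convex and grows without bound as utilization approaches 1; the slides do not specify f.

```python
def link_utilization(routing, rates, capacity):
    """u_l = sum_i R_li * x_i / c_l for every link l."""
    return [
        sum(r_li * x_i for r_li, x_i in zip(row, rates)) / c_l
        for row, c_l in zip(routing, capacity)
    ]

def example_cost(u):
    # Hypothetical convex cost; not the f used in the lecture
    return u / (1.0 - u) if u < 1.0 else float("inf")

# routing[l][i] = fraction of user i's traffic carried on link l (illustrative values)
routing = [
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
]
rates = [0.2, 0.5, 0.3]
capacity = [1.0, 1.0]

u = link_utilization(routing, rates, capacity)
print(u, sum(example_cost(v) for v in u))
```

Traffic engineering would vary `routing` with `rates` held fixed, while congestion control varies `rates` with `routing` held fixed.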

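Model of Internet Reality

[Figure: congestion control and traffic engineering coupled in a loop through the rates $x_i$ and the routing $R_{li}$]

Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$

Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$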

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark:

$$\max_{x, R} \sum_i U_i(x_i) \quad \text{s.t.} \quad Rx \le c$$

  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

High speed TCP [Floyd]

  additive increase, multiplicative decrease
  increments, decrements depend on window size

Scalable TCP (STCP) [T. Kelly]

  multiplicative increase, multiplicative decrease

  W ← W + a   per ACK
  W ← W − b W   per window with loss
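The two STCP rules can be sketched as below. The default constants a = 0.01 and b = 0.125 are values commonly quoted for Scalable TCP and are treated here as assumptions, since the slide leaves a and b symbolic; the function names are hypothetical.

```python
def on_ack(w, a=0.01):
    """Per-ACK increase: W <- W + a."""
    return w + a

def on_loss(w, b=0.125):
    """Decrease once per window with loss: W <- W - b*W."""
    return w - b * w

def one_rtt_without_loss(w, a=0.01):
    # Roughly W ACKs arrive per round trip, so W grows by about a*W per RTT:
    # the per-RTT increase is multiplicative even though each step is additive.
    for _ in range(int(w)):
        w = on_ack(w, a)
    return w

w = 100.0
print(one_rtt_without_loss(w), on_loss(w))
```

This is why the slide classifies STCP as multiplicative increase: the growth per round trip scales with the window itself rather than being a fixed number of packets.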

STCP dynamics

From 1st PFLDnet Workshop, Tom Kelly

Active Queue Management

Router Queue Management

  normally, packets dropped only when queue overflows
    "drop-tail" queueing

[Figure: packets P1-P6 in a FCFS queue at a router's scheduler, on the path between routers and the Internet]

The case against drop-tail queue management

  Large queues in routers are "a bad thing"
    Delay: end-to-end latency dominated by length of queues at switches in network
  Allowing queues to overflow is "a bad thing"
    Fairness: connections transmitting at high rates can starve connections transmitting at low rates
    Utilization: connections can synchronize their response to congestion

[Figure: FCFS scheduler queue holding packets P1-P4, with packets P5 and P6 shown separately]

Idea: early random packet drop

When queue length exceeds threshold, drop packets with queue-length-dependent probability
  probabilistic packet drop: flows see same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

[Figure: FCFS scheduler queue with packets P1-P6]

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop
    avoid overly penalizing short-term bursts
    react to longer-term trends
  Tie drop prob. to weighted avg. queue length
    avoids over-reaction to mild overload conditions

[Figure: average queue length vs. time, with regions marked: no drop below the min threshold, probabilistic early drop between the min and max thresholds, forced drop above the max threshold up to the max queue length; drop probability shown alongside]

Random early detection (RED) packet drop

[Figure: the same queue-length regions (no drop, probabilistic early drop, forced drop), plus drop probability vs. weighted average queue length: zero up to min, rising to max_p at max, then 100%]
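The drop curve in the figure can be written as a short function. This is a simplified sketch: it covers only the averaging step and the piecewise-linear drop probability, and it leaves out refinements of the full RED algorithm such as spacing drops by the count of packets since the last drop. The averaging weight of 0.002 and the threshold values in the usage lines are illustrative assumptions.

```python
def update_average(avg, queue_len, weight=0.002):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - weight) * avg + weight * queue_len

def drop_probability(avg, min_th, max_th, max_p):
    """Piecewise-linear RED drop curve on the averaged queue length."""
    if avg < min_th:
        return 0.0                                      # no drop
    if avg >= max_th:
        return 1.0                                      # forced drop
    return max_p * (avg - min_th) / (max_th - min_th)   # probabilistic early drop

# A short burst barely moves the average, so it is not penalized.
avg = 0.0
for q in [100] * 5:
    avg = update_average(avg, q)
print(round(avg, 3), drop_probability(avg, min_th=10, max_th=30, max_p=0.1))
```

Using the averaged rather than the instantaneous queue length is what lets RED tolerate short bursts while still reacting to sustained overload.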

RED summary: why random drop?

  Provide gentle transition from no-drop to all-drop
    Provide "gentle" early warning
  Avoid synchronized loss bursts among sources
  Provide same loss rate to all sessions
    With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters: nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed ...

Why is randomization important?

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source: Floyd, Jacobson 1994

Router update operation

[State diagram]
  prepare own routing update (time TC)
    receive update from neighbor: process (time TC2)
    <ready>: send update (time Td to arrive at dest), start_timer (uniform: Tp +/- Tr)
  wait
    receive update from neighbor: process
    timeout, or link fail update
  time spent in state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis: time until routing update sent, relative to start of round

  By t = 100000, all router rounds are of length 120

  synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

  Choose random timer component, Tr, large (e.g., several multiples of TC)

[State diagram]
  prepare own routing update (time TC)
    receive update from neighbor: process (time TC2)
    <ready>: send update (time Td to arrive), start_timer (uniform: Tp +/- Tr)
  wait
    receive update from neighbor: process

Add enough randomization to avoid synchronization
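The randomized timer itself is a one-liner. The sketch below draws the next interval uniformly from Tp - Tr to Tp + Tr, as in the start_timer step; the numeric values and the function name `next_update_delay` are illustrative assumptions.

```python
import random

def next_update_delay(tp, tr, rng=random):
    """Draw the next timer interval uniformly from [tp - tr, tp + tr]."""
    if tr < 0 or tr > tp:
        raise ValueError("need 0 <= tr <= tp")
    return rng.uniform(tp - tr, tp + tr)

rng = random.Random(1)  # seeded only to make the example repeatable
delays = [next_update_delay(tp=120.0, tr=10.0, rng=rng) for _ in range(5)]
print(delays)
```

A larger `tr` spreads the routers' timers apart more strongly, which is the knob the slide suggests turning to break up synchronization.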

Randomization

  Takeaway message:
    randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand

  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers
    Self-interference
      applications hurt their own performance
    Cross-interference
      applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
  Applications say what they want
    OS manages resources and scheduling

  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?

  TCP does network resource management
    Need flow prioritization

  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy

  Question:
    Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal
    Congestion -> increased queue lengths -> increased RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice

  Basic algorithm
    1. Early detection: threshold queue length -> increase in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)

  per-ack operation:
    if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++;

  per-round operation:
    if (numCong > f * W) W = W / 2;
    else ... AIMD congestion control
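The per-ack and per-round steps can be sketched as a small class. This is not a full TCP Nice implementation: the default values for `threshold` and `fraction` (the slide's f) are illustrative assumptions, tracking minRTT and maxRTT as running extremes is an assumption, and the else branch uses a plain +1 per round as a stand-in for the usual AIMD logic.

```python
class NiceWindow:
    """Sketch of the per-ACK / per-round logic only."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=10.0):
        self.threshold = threshold   # share of (maxRTT - minRTT) that signals congestion
        self.fraction = fraction     # f: share of a round's ACKs that must see congestion
        self.cwnd = cwnd             # W, allowed to fall below 1
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        # Assumed bookkeeping: running minimum and maximum of observed RTTs
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0         # multiplicative decrease on early congestion
        else:
            self.cwnd += 1.0         # stand-in for the usual AIMD increase
        self.num_cong = 0

nice = NiceWindow()
for rtt in [0.100, 0.200] + [0.190] * 8:   # most ACKs in this round see a long RTT
    nice.on_ack(rtt)
nice.on_round_end()
print(nice.cwnd)
```

Because the window is a float that is halved repeatedly under sustained congestion, it can drop below one packet per round trip, which is the behavior item 3 of the basic algorithm allows.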

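Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

  minRTT = τ, maxRTT = τ + B/µ

[Figure: queue length (pkts) vs. time for Reno and Nice, with buffer size B, threshold t·B, and service rate µ marked; phases labeled Add (additive increase) and Mul (multiplicative decrease)]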

Network Conditions

[Figure: foreground document latency (sec, log scale) vs. spare capacity (log scale); curves for Reno, Vegas, V0, Nice, and Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity



What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note all sources ldquoequalrdquo)   constraint bandwidth used less than capacity   centralized solution to optimization impractical

 must know all utility functions   impractical for large number of sources  can we view congestion control as distributed

asynchronous algorithms to solve this problem

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0 ldquosystemrdquo problem

The user view

  User can choose amount to pay per unit time ws

  Would like allocated bandwidth xs in proportion to ws

euro

max Usw s

ps

⎝ ⎜

⎠ ⎟ minus ws

subject to ws ge 0

  ps could be viewed as charge per unit flow for user s s

ss pwx =

userrsquos utility cost

user problem

The network view

  Suppose network knows vector ws chosen by users   Network wants to maximize logarithmic utility function

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

network problem

Solution existence

  There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that   Ws solves user

problem   Xs solves the

network problem   Xs is the unique

solution to the system problem

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

0 wsubject to

w Umax

s

ss

ge

minus⎟⎟⎠

⎞⎜⎜⎝

⎛s

s

wp

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0

Proportional Fairness

  Vector of rates xs proportionally fair if feasible and for any other feasible vector xs

0

leminus

sumisinSs s

ss

xxx

  Result if wr=1 then Xs solves the network problem IFF it is proportionally fair

  Similar result exists for the case that wr not equal 1

Max-min Fairness

Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

Minimum potential delay fairness

  Rates xr are minimum potential delay fair if Ur (xr) = -wrxr

Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays

Max-min Fairness

rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

What is corresponding utility function

α

α

α minus=

minus

infinrarr 1lim)(

1r

rrxxU

Solving the network problem   Results so far existence - solution exists

with given properties   How to compute solution

 Ideally distributed solution easily embodied in protocol

 Should reveal insight into existing protocol

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

congestion ldquosignalrdquo function of aggregate rate at link l fed back to s

change in bandwidth

allocation at s

linear increase

multiplicative decrease

⎟⎟⎠

⎞⎜⎜⎝

⎛= sum

isin

)()()(txgtp

sLlsllwhere

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

  Results   converges to solution of relaxation of network

problem  xs(t)Σpl(t) converges to ws

  Interpretation TCP-like algorithm to iteratively solves optimal rate allocation

Source Algorithm

  Source needs only its path price

  kr() nonnegative nondecreasing function   Above algorithm converges to unique

solution for any initial condition   qr interpreted as lossmarking probability euro

˙ x r = kr (xr )(Ur (xr ) minus qr)

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation

  Suppose network charges price qr ($bit) where qr=sum pl

  Userrsquos strategy spend wr ($sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

  then

  interpretation minimize (weighted) delay

pTpTp

x 2)1(2asymp

minus=

TxxU 1)( minus=

Is AIMD special

  Consider a window control as follows  cwnd += acwnd^n when no loss  cwnd -= bcwnd^m when loss  where nltm

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (nm)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

Congestion Control maximize user

utility

Traffic Engineering minimize network

congestion Given routing Rli how to adapt end rate xi

Given traffic xi how to perform routing Rli

Congestion Control Model

max sum i Ui(xi) st sumi Rlixi le cl var x

aggregate utility

Source rate xi

Utility Ui(xi)

capacity constraints

Users are indexed by i

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

min suml f(ul) st ul =sumi Rlixicl var R Link Utilization ul

Cost f(ul)

aggregate cost

Links are indexed by l

Traffic engineering avoids bottlenecks in the network

ul = 1

Model of Internet Reality

xi Rli

Congestion Control max sumi Ui(xi) st sumi Rlixi le cl

Traffic Engineering min suml f(ul)

st ul =sumi Rlixicl

System Properties

  Convergence   Does it achieve some objective   Benchmark

  Utility gap between the joint system and benchmark

max sumi Ui(xi) st Rx le c Var x R

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

STCP dynamics

From 1st PFLDnet Workshop Tom Kelly13

Active Queue Management

Router Queue Management

  normally packets dropped only when queue overflows   ldquodrop-tailrdquo queueing

router Internet

P113P213P313P413P513P613FCFS13

Scheduler13

router

The case against drop-tail queue management

  Large queues in routers are ldquoa bad thingrdquo  Delay end-to-end latency dominated by length

of queues at switches in network   Allowing queues to overflow is ldquoa bad thingrdquo

 Fairness connections transmitting at high rates can starve connections transmitting at low rates

 Utilization connections can synchronize their response to congestion

P113P213P313P413FCFS

Scheduler P513P613

Idea early random packet drop

When queue length exceeds threshold drop packets with queue length dependent probability  probabilistic packet drop flows see same loss

rate  problem bursty traffic (burst arrives when

queue is near threshold) can be over penalized

P113P213P313P413P513P613FCFS

Scheduler

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop  avoid overly penalizing short-term bursts   react to longer term trends

  Tie drop prob to weighted avg queue length  avoids over-reaction to mild overload conditions

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

Random early detection (RED) packet drop

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

10013

Drop probability

maxp13

Weighted AverageQueue Length

min13 max13

RED summary why random drop

  Provide gentle transition from no-drop to all-drop  Provide ldquogentlerdquo early warning  Avoid synchronized loss bursts among

sources   Provide same loss rate to all sessions

 With tail-drop low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed hellip

Why is randomization important?

  Synchronization of periodic routing updates
  Periodic losses observed in end-end Internet traffic

source: Floyd, Jacobson 1994

Router update operation

[Figure: state diagram of a router's periodic update cycle. States and labels shown:]
  prepare own routing update (time TC)
  receive update from neighbor, process (time TC2)
  <ready>: send update (time Td to arrive at dest)
  start_timer (uniform Tp +- Tr)
  wait
  receive update from neighbor, process
  timeout, or link fail update
  time spent in state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other
  x-axis: time until routing update sent, relative to start of round
  By t = 100000, all router rounds are of length 120
  synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

  Choose random timer component Tr large (e.g. several multiples of TC)
  Add enough randomization to avoid synchronization

[Figure: the router update state diagram again, with labels: prepare own routing update (time TC); receive update from neighbor, process (time TC2); wait; receive update from neighbor, process; <ready>: send update (time Td to arrive); start_timer (uniform Tp +- Tr)]
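The `start_timer (uniform Tp +- Tr)` step can be illustrated with a short helper. This is a sketch under assumptions: the function name `next_update_delay` and the numeric values in the usage note are made up for illustration and are not taken from the slides.

```python
import random


def next_update_delay(tp, tr, rng=random):
    """Delay until the next routing update, drawn uniformly from [tp - tr, tp + tr].

    tp is the nominal period (Tp) and tr the random component (Tr).
    """
    if tr < 0 or tr > tp:
        raise ValueError("need 0 <= tr <= tp")
    return rng.uniform(tp - tr, tp + tr)
```

For example, `next_update_delay(30.0, 10.0)` returns a delay between 20 and 40 time units. The slide's advice amounts to choosing `tr` large relative to the processing time TC, so that routers whose timers happen to line up drift apart again.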

Randomization

  Takeaway message
    randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties

  Utilization of spare network capacity
  No interference with regular transfers
    Self-interference
      applications hurt their own performance
    Cross-interference
      applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
    Applications say what they want
    OS manages resources and scheduling
  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?

  TCP does network resource management
    Need flow prioritization
  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy
  Question
    Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion
  Uses increasing RTT as congestion signal
    Congestion => increased queue lengths => increased RTT
  Aggressive responsiveness to congestion
  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice

  Basic algorithm
    1. Early detection: threshold queue length, increase in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)

  per-ack operation
    if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++

  per-round operation
    if (numCong > f * W) W = W / 2
    else ... AIMD congestion control
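The per-ack and per-round operations above can be sketched in Python. This is an illustrative sketch only: the class name `NiceSender`, the default values for `threshold` and `fraction`, and the plain additive increase standing in for the elided AIMD branch are assumptions, and loss handling is left out.

```python
class NiceSender:
    """Sketch of TCP Nice's early-congestion logic (hypothetical names)."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold    # "threshold" in the per-ack test (assumed value)
        self.fraction = fraction      # "f" in the per-round test (assumed value)
        self.cwnd = cwnd              # "W"; may drop below 1.0
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        # Track the RTT extremes seen so far, then apply the per-ack test.
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0          # multiplicative decrease on early congestion
        else:
            self.cwnd += 1.0          # assumption: stands in for the elided AIMD branch
        self.num_cong = 0
```

A sender would call `on_ack` for every acknowledgement and `on_round_end` once per round trip. Since nothing clamps `cwnd` at 1.0, repeated early-congestion rounds keep halving it, which is what point 3 of the basic algorithm permits.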

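Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

  minRTT = τ, maxRTT = τ + B/μ

[Figure: queue length (pkts) over time for Reno and Nice, with buffer size B, threshold level t·B and rate μ marked, and the additive ("Add") and multiplicative ("Mul") phases of each protocol labeled]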
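The two RTT bounds on this slide connect the per-ack test to a queue-length threshold. The short derivation below is a sketch that assumes a queue of q packets adds q/μ of delay, and writes t for the threshold fraction.

```latex
\begin{align*}
\text{curRTT} &> \text{minRTT} + t\,(\text{maxRTT} - \text{minRTT}) \\
              &= \tau + t\left(\tau + \frac{B}{\mu} - \tau\right)
               = \tau + \frac{t\,B}{\mu}, \\
\text{and with } \text{curRTT} = \tau + \frac{q}{\mu}: \quad
q &> t\,B .
\end{align*}
```

Under that assumption, Nice signals early congestion once the bottleneck queue holds more than a fraction t of the buffer B.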

Network Conditions

[Figure: foreground document latency (sec, log scale from 0.1 to 1e3) versus spare capacity (1 to 100) for Reno, Vegas, V0, Nice and Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale from 0.1 to 1e3) versus number of background flows (1 to 100) for Vegas, V0, Nice, Router Prio and Reno]

  W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: background throughput (KB, 0 to 8e4) versus number of background flows (1 to 100) for Router Prio, Vegas, V0, Reno and Nice]

  Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
  How to model the interaction between TCP and the network?
    Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g. bandwidth) to optimize some objective function
  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
  systematic approach towards protocol design

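Model

  Network: links l, each of capacity $c_l$
  Sources s: $(L(s), U_s(x_s))$
    $L(s)$ - links used by source s
    $U_s(x_s)$ - utility if source rate = $x_s$

[Figure: two links with capacities $c_1$ and $c_2$ carrying three sources with rates $x_1$, $x_2$, $x_3$]

$$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$$

[Figure: $U_s(x_s)$ versus $x_s$, example utility function for elastic application]

Q: What are possible allocations with, say, unit-capacity links?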
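One way to explore the question is to test candidate allocations against the two capacity constraints. The sketch below assumes the topology implied by those constraints (source 1 uses both links, source 2 only link 1, source 3 only link 2); the link names `"l1"`, `"l2"` and the function names are hypothetical.

```python
def link_loads(routes, rates):
    """Total rate on each link; routes[s] is the set of links used by source s."""
    loads = {}
    for source, links in routes.items():
        for link in links:
            loads[link] = loads.get(link, 0.0) + rates[source]
    return loads


def is_feasible(routes, rates, capacity, tol=1e-9):
    """True if all rates are nonnegative and no link exceeds its capacity."""
    if any(rate < 0 for rate in rates.values()):
        return False
    loads = link_loads(routes, rates)
    return all(load <= capacity[link] + tol for link, load in loads.items())


# Topology read off the constraints x1 + x2 <= c1 and x1 + x3 <= c2.
ROUTES = {1: {"l1", "l2"}, 2: {"l1"}, 3: {"l2"}}
UNIT_CAPACITY = {"l1": 1.0, "l2": 1.0}

CANDIDATES = [
    {1: 0.5, 2: 0.5, 3: 0.5},      # equal rates
    {1: 1 / 3, 2: 2 / 3, 3: 2 / 3},  # less for the two-link source
    {1: 0.0, 2: 1.0, 3: 1.0},      # two-link source shut out entirely
]
```

All three candidates are feasible with unit capacities and fill both links, so feasibility alone does not single out an allocation; the choice among them depends on the objective.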

Optimization Problem

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
  can we view congestion control as distributed asynchronous algorithms to solve this problem?

"system" problem:

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \quad \forall l \in L$$

The user view

  User can choose amount to pay per unit time, $w_s$
  Would like allocated bandwidth $x_s$ in proportion to $w_s$:

$$x_s = \frac{w_s}{p_s}$$

  $p_s$ could be viewed as charge per unit flow for user s

user problem:

$$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$$

  first term: user's utility; second term: cost

The network view

  Suppose network knows vector $w_s$ chosen by users
  Network wants to maximize logarithmic utility function

network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
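The link between the network problem and the user's charge $p_s$ can be made explicit with a Lagrangian argument. This is a sketch of the standard derivation; the multipliers $p_l$ (one per link) are introduced here and read as link prices.

```latex
\begin{align*}
\mathcal{L}(x, p) &= \sum_{s} w_s \log x_s
   - \sum_{l} p_l \Big( \sum_{s \in S(l)} x_s - c_l \Big), \qquad p_l \ge 0, \\
\frac{\partial \mathcal{L}}{\partial x_s} &= \frac{w_s}{x_s} - \sum_{l \in L(s)} p_l = 0
   \quad\Longrightarrow\quad
   x_s = \frac{w_s}{\sum_{l \in L(s)} p_l}, \\
0 &= p_l \Big( \sum_{s \in S(l)} x_s - c_l \Big) \quad \text{for every link } l .
\end{align*}
```

Comparing with $x_s = w_s / p_s$ from the user view, the charge per unit flow seen by source s is the sum of the prices of the links on its path, and only fully used links carry a nonzero price.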

Solution existence

  There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
    $w_s$ solves the user problem
    $x_s$ solves the network problem
    $x_s$ is the unique solution to the system problem

network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

user problem:

$$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$$

system problem:

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \quad \forall l \in L$$

Proportional Fairness

  Vector of rates $x_s$ is proportionally fair if it is feasible and, for any other feasible vector $x_s^*$,

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

  Result: if $w_s = 1$, then $x_s$ solves the network problem if and only if it is proportionally fair
  Similar result exists for the case that $w_s \ne 1$
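The inequality can be checked numerically on the three-source, two-link example from the Model slide with unit capacities. The sketch below is illustrative: the function name is hypothetical, and the candidate (1/3, 2/3, 2/3) comes from maximizing log x1 + 2 log(1 - x1), which gives x1 = 1/3.

```python
def proportional_fairness_sum(x, other):
    """Sum over sources of (other_s - x_s) / x_s.

    x is proportionally fair only if this sum is <= 0 for every feasible `other`.
    """
    return sum((o - xi) / xi for xi, o in zip(x, other))


# Candidate allocation (x1, x2, x3) for constraints x1 + x2 <= 1 and x1 + x3 <= 1.
CANDIDATE = (1 / 3, 2 / 3, 2 / 3)

# A few other feasible allocations to test against.
ALTERNATIVES = [(0.5, 0.5, 0.5), (0.0, 1.0, 1.0), (0.2, 0.5, 0.5)]

# Small tolerance because two of the sums are exactly zero in exact arithmetic.
PASSES = all(proportional_fairness_sum(CANDIDATE, alt) <= 1e-12 for alt in ALTERNATIVES)
```

Swapping the roles shows why equal rates are not proportionally fair here: `proportional_fairness_sum((0.5, 0.5, 0.5), CANDIDATE)` is positive, so moving from equal rates to the candidate raises the sum of relative changes.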

Max-min Fairness

Rates $x_r$ are max-min fair if, for any other feasible rates $y_r$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
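A max-min fair allocation can be computed by progressive filling, a standard technique that is not described on the slides: raise all rates together and freeze the sources on any link that becomes full. The sketch below uses hypothetical names and assumes every source uses at least one link.

```python
def max_min_fair(routes, capacity, tol=1e-12):
    """Progressive filling. routes[s] is the set of links used by source s."""
    rates = {s: 0.0 for s in routes}
    remaining = dict(capacity)
    active = set(routes)  # sources whose rate can still grow
    while active:
        # Number of still-growing sources on each link.
        users = {l: sum(1 for s in active if l in routes[s]) for l in remaining}
        # Largest equal increment that no link with active sources would exceed.
        step = min(remaining[l] / n for l, n in users.items() if n > 0)
        for s in active:
            rates[s] += step
        for l, n in users.items():
            remaining[l] -= step * n
        # Freeze every source that crosses a link that is now full.
        saturated = {l for l, n in users.items() if n > 0 and remaining[l] <= tol}
        active = {s for s in active if not (routes[s] & saturated)}
    return rates
```

For the three-source example with unit capacities this gives every source rate 0.5. With capacities 1 and 2, the sources sharing the first link stop at 0.5 while the source that uses only the second link continues up to 1.5.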

Minimum potential delay fairness

  Rates $x_r$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates $x_r$ are max-min fair if, for any other feasible rates $y_r$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
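The effect of the parameter α can be seen on the three-source, two-link example with unit capacities. The sketch below is illustrative: the function names are hypothetical, α = 1 is treated as the logarithmic limiting case, and the closed form for the two-link source comes from setting the derivative of U(x1) + 2 U(1 - x1) to zero.

```python
import math


def alpha_fair_utility(x, alpha):
    """Utility x^(1 - alpha) / (1 - alpha), with log(x) as the alpha = 1 case."""
    if x <= 0:
        raise ValueError("rate must be positive")
    if alpha == 1:
        return math.log(x)
    return x ** (1.0 - alpha) / (1.0 - alpha)


def shared_source_rate(alpha):
    """Rate of the two-link source when both unit-capacity links are full.

    x1^(-alpha) = 2 * (1 - x1)^(-alpha) gives x1 = 1 / (1 + 2^(1 / alpha)).
    """
    return 1.0 / (1.0 + 2.0 ** (1.0 / alpha))
```

`shared_source_rate(1)` is 1/3 (proportional fairness), `shared_source_rate(2)` is about 0.414 (the -1/x utility of minimum potential delay fairness with unit weights), and the value approaches 0.5, the max-min fair rate, as α grows.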

Solving the network problem

  Results so far: existence - solution exists with given properties
  How to compute solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  left-hand side: change in bandwidth allocation at s
  $k\,w_s$: linear increase
  $-k\,x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease
  $p_l(t)$: congestion "signal", a function of the aggregate rate at link l, fed back to s

where

$$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  Results
    converges to solution of relaxation of network problem
    $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
  Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
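The convergence claim can be tried out with a simple Euler discretization of the rate equation. This is a sketch under stated assumptions: the price function `link_price` (a power of the link's load-to-capacity ratio), the step size, and all names are choices made for illustration; the slides only say that the price is some function of the aggregate rate.

```python
def link_price(load, capacity, beta=4.0):
    # Assumed price function g_l: rises steeply as load approaches capacity.
    return (load / capacity) ** beta


def path_prices(routes, rates, capacity):
    """Sum of link prices along each source's path."""
    loads = {l: 0.0 for l in capacity}
    for s, links in routes.items():
        for l in links:
            loads[l] += rates[s]
    prices = {l: link_price(loads[l], capacity[l]) for l in capacity}
    return {s: sum(prices[l] for l in links) for s, links in routes.items()}


def simulate_primal(routes, weights, capacity, k=0.1, dt=0.01, steps=20000):
    """Euler steps of dx_s/dt = k * (w_s - x_s * path price of s)."""
    rates = {s: 0.1 for s in routes}
    for _ in range(steps):
        q = path_prices(routes, rates, capacity)
        rates = {
            s: max(1e-9, rates[s] + dt * k * (weights[s] - rates[s] * q[s]))
            for s in routes
        }
    return rates
```

On the three-source example with unit weights and capacities, the rates settle where each source's rate times its path price equals its weight, and the single-link sources end up with twice the rate of the two-link source. With this price function the links run slightly above capacity, which reflects that the algorithm solves a relaxation rather than the hard-constrained problem.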

Source Algorithm

$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$

  Source needs only its path price
  $k_r(\cdot)$: nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by
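As an illustration, assume the weighted logarithmic utility of the network problem and substitute it into the source algorithm. The gain choice $k_r(x_r) = \kappa x_r$ in the last step is also an assumption, picked because it recovers the rate equation used when solving the network problem.

```latex
\begin{align*}
U_r(x_r) &= w_r \log x_r
  \quad\Longrightarrow\quad U_r'(x_r) = \frac{w_r}{x_r}, \\
\dot{x}_r &= k_r(x_r) \left( \frac{w_r}{x_r} - q_r \right), \\
\dot{x}_r &= \kappa \left( w_r - x_r q_r \right)
  \quad \text{when } k_r(x_r) = \kappa\, x_r .
\end{align*}
```

At equilibrium $x_r q_r = w_r$, so each source's rate is its weight divided by its path price.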

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation?
  Suppose network charges price $q_r$ ($/bit) where $q_r = \sum_l p_l$
  User's strategy: spend $w_r$ ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$

  then

$$U(x) = -\frac{1}{T x}$$

  interpretation: minimize (weighted) delay

Is AIMD special?

  Consider a window control as follows
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m
  Expected change in congestion window
  Expected change in rate per unit time
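The expected window change for this family of controllers can be sketched as follows. The sketch assumes one update per packet and that each packet is lost independently with probability p; both assumptions, and the function names, are added here for illustration.

```python
def expected_window_change(cwnd, p, a, b, n, m):
    """Expected per-packet change: +a*cwnd^n with prob. 1-p, -b*cwnd^m with prob. p."""
    return (1.0 - p) * a * cwnd ** n - p * b * cwnd ** m


def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero (needs n < m and 0 < p < 1)."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))
```

For AIMD, with n = -1, m = 1, a = 1 and b = 0.5, the equilibrium window is sqrt(2(1 - p) / p). Dividing by the round-trip time T gives the same rate expression as on the Simplified TCP-Reno slide.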

MIMD (n, m)

  Consider the controller
  where
  Then at equilibrium
  where α = m - n. For stability

Motivation

  Congestion Control: maximize user utility
    Given routing $R_{li}$, how to adapt end rate $x_i$?
  Traffic Engineering: minimize network congestion
    Given traffic $x_i$, how to perform routing $R_{li}$?

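Congestion Control Model

$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$

  objective: aggregate utility
  constraints: capacity constraints
  variable: source rate $x_i$
  Users are indexed by i

[Figure: utility $U_i(x_i)$ versus source rate $x_i$]

Congestion control provides fair rate allocation amongst users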

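Traffic Engineering Model

$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \frac{\sum_i R_{li} x_i}{c_l}$$

  objective: aggregate cost
  variable: routing R
  Links are indexed by l

[Figure: cost $f(u_l)$ versus link utilization $u_l$, with $u_l = 1$ marked]

Traffic engineering avoids bottlenecks in the network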
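The quantities in the traffic engineering model can be computed directly from a routing matrix. In the sketch below the function names and the particular cost function are assumptions: the slides only state that f is a cost of link utilization, and the u / (1 - u) form is chosen here as one convex cost that blows up as utilization approaches 1.

```python
def link_utilization(routing, rates, capacity):
    """u_l = sum_i R_li * x_i / c_l, with routing[l][i] = R_li."""
    return [
        sum(r_li * x_i for r_li, x_i in zip(row, rates)) / c_l
        for row, c_l in zip(routing, capacity)
    ]


def example_cost(u):
    # Hypothetical convex cost, unbounded as u approaches 1.
    if u >= 1.0:
        return float("inf")
    return u / (1.0 - u)


def aggregate_cost(utilization, f=example_cost):
    """Sum of f(u_l) over all links."""
    return sum(f(u) for u in utilization)
```

With `routing = [[1, 1, 0], [1, 0, 1]]`, `rates = [0.25, 0.5, 0.25]` and unit capacities, the utilizations are 0.75 and 0.5. Traffic engineering would adjust the entries of `routing` to lower the aggregate cost while congestion control adjusts `rates`.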

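Model of Internet Reality

[Figure: congestion control and traffic engineering coupled through rates $x_i$ and routing $R_{li}$]

  Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
  Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$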

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark:

$$\max_{x,\,R} \sum_i U_i(x_i) \quad \text{s.t.} \quad R x \le c$$

  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Active Queue Management

Router Queue Management

  normally, packets dropped only when queue overflows
    "drop-tail" queueing

[Figure: router on the path to the Internet, with packets P1 to P6 in a FCFS queue served by the scheduler]


Router Queue Management

  normally packets dropped only when queue overflows   ldquodrop-tailrdquo queueing

router Internet

P113P213P313P413P513P613FCFS13

Scheduler13

router

The case against drop-tail queue management

  Large queues in routers are ldquoa bad thingrdquo  Delay end-to-end latency dominated by length

of queues at switches in network   Allowing queues to overflow is ldquoa bad thingrdquo

 Fairness connections transmitting at high rates can starve connections transmitting at low rates

 Utilization connections can synchronize their response to congestion

P113P213P313P413FCFS

Scheduler P513P613

Idea early random packet drop

When queue length exceeds threshold drop packets with queue length dependent probability  probabilistic packet drop flows see same loss

rate  problem bursty traffic (burst arrives when

queue is near threshold) can be over penalized

P113P213P313P413P513P613FCFS

Scheduler

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop  avoid overly penalizing short-term bursts   react to longer term trends

  Tie drop prob to weighted avg queue length  avoids over-reaction to mild overload conditions

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

Random early detection (RED) packet drop

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

10013

Drop probability

maxp13

Weighted AverageQueue Length

min13 max13

RED summary why random drop

  Provide gentle transition from no-drop to all-drop  Provide ldquogentlerdquo early warning  Avoid synchronized loss bursts among

sources   Provide same loss rate to all sessions

 With tail-drop low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed hellip

Why randomization important

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source Floyd Jacobson 1994

Router update operation

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive at dest)

start_timer (uniform Tp +- Tr)

timeout or link fail

update

time spent in state depends on msgs

received from others (weak coupling

between routers processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization   Choose random

timer component Tr large (eg several multiples of TC)

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough

randomization to avoid

synchronization

Randomization

  Takeaway message  randomization makes a system simple and

robust

Background transport TCP Nice

What are background transfers

  Data that humans are not waiting for   Non-deadline-critical   Unlimited demand

  Examples  Prefetched traffic on the Web  File system backup  Large-scale data distribution services  Background software updates  Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers  Self-interference

bull  applications hurt their own performance  Cross-interference

bull  applications hurt other applicationsrsquo performance

TCP Nice

  Goal abstraction of free infinite bandwidth   Applications say what they want

 OS manages resources and scheduling

  Self tuning transport layer  Reduces risk of interference with foreground

traffic  Significant utilization of spare capacity by

background traffic  Simplifies application design

Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal  Congestion incr queue lengths incr RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control  Receiver and network unchanged  TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

Nice the works

  Non-interference getting out of the way in time   Utilization maintaining a small queue

pkts

minRTT = τ13 maxRTT = τ+Βmicro13

B

tB Add Mul +

micro

Reno

Nice Add Add Add

Mul +

Mul +

Network Conditions

01

1

10

100

1e3

1 10 100 Fore

grou

nd D

ocum

ent L

aten

cy (s

ec)

Spare Capacity

Reno

Vegas

V0

Nice

Router Prio

  Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note all sources ldquoequalrdquo)   constraint bandwidth used less than capacity   centralized solution to optimization impractical

 must know all utility functions   impractical for large number of sources  can we view congestion control as distributed

asynchronous algorithms to solve this problem

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0 ldquosystemrdquo problem

The user view

  User can choose amount to pay per unit time ws

  Would like allocated bandwidth xs in proportion to ws

euro

max Usw s

ps

⎝ ⎜

⎠ ⎟ minus ws

subject to ws ge 0

  ps could be viewed as charge per unit flow for user s s

ss pwx =

userrsquos utility cost

user problem

The network view

  Suppose network knows vector ws chosen by users   Network wants to maximize logarithmic utility function

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

network problem

Solution existence

  There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that   Ws solves user

problem   Xs solves the

network problem   Xs is the unique

solution to the system problem

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

0 wsubject to

w Umax

s

ss

ge

minus⎟⎟⎠

⎞⎜⎜⎝

⎛s

s

wp

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0

Proportional Fairness

  Vector of rates xs proportionally fair if feasible and for any other feasible vector xs

0

leminus

sumisinSs s

ss

xxx

  Result if wr=1 then Xs solves the network problem IFF it is proportionally fair

  Similar result exists for the case that wr not equal 1

Max-min Fairness

Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

Minimum potential delay fairness

  Rates xr are minimum potential delay fair if Ur (xr) = -wrxr

Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays

Max-min Fairness

rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

What is corresponding utility function

α

α

α minus=

minus

infinrarr 1lim)(

1r

rrxxU

Solving the network problem

  Results so far: existence - solution exists with given properties
  How to compute solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

\[ \frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right) \]

where

\[ p_l(t) = g_l\!\left( \sum_{s \in S(l)} x_s(t) \right) \]

  Left-hand side: change in bandwidth allocation at s
  k w_s term: linear increase
  k x_s(t) \sum_l p_l(t) term: multiplicative decrease
  p_l(t): congestion "signal", a function of the aggregate rate at link l, fed back to s

Solving the network problem

\[ \frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right) \]

  Results
    converges to solution of relaxation of network problem
    x_s(t) \sum_l p_l(t) converges to w_s

  Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
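The rate update above can be tried out with a simple Euler integration. In the sketch below, the price function g_l(y) = (y / c_l)^beta, the gain k, the step size dt, and the two-link, three-source topology with unit capacities are all assumptions chosen for illustration. The lecture does not specify g_l.

```python
def link_prices(x, routes, capacity, beta):
    """Assumed price function p_l = (aggregate rate on l / c_l) ** beta."""
    load = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
    return {l: (load[l] / capacity[l]) ** beta for l in capacity}


def simulate_primal(routes, capacity, w, k=0.1, dt=0.01, steps=20_000, beta=10):
    """Euler integration of dx_s/dt = k * (w_s - x_s * sum of p_l on s's path)."""
    x = {s: 0.1 for s in routes}
    for _ in range(steps):
        price = link_prices(x, routes, capacity, beta)
        for s in routes:
            path_price = sum(price[l] for l in routes[s])
            x[s] += dt * k * (w[s] - x[s] * path_price)
    return x, link_prices(x, routes, capacity, beta)


routes = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}}
capacity = {"a": 1.0, "b": 1.0}
w = {1: 1.0, 2: 1.0, 3: 1.0}
x, price = simulate_primal(routes, capacity, w)
for s in routes:
    # At convergence x_s times its path price should be close to w_s.
    print(s, round(x[s], 4), round(x[s] * sum(price[l] for l in routes[s]), 6))
```

Because these prices are penalty functions rather than hard constraints, the link loads may settle slightly above capacity, which is in line with the algorithm solving a relaxation of the network problem.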

Source Algorithm

\[ \dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right) \]

  Source needs only its path price
  k_r(·): nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  q_r interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation?

  Suppose network charges price q_r ($/bit), where q_r = \sum_l p_l

  User's strategy: spend w_r ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

\[ x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}} \]

  then

\[ U(x) = -\frac{1}{T x} \]

  interpretation: minimize (weighted) delay

Is AIMD special?

  Consider a window control as follows
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m

  Expected change in congestion window

  Expected change in rate per unit time
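The window rule above can be written down directly. In the sketch below, the defaults a = 1, n = -1, b = 1/2, m = 1 are chosen to mimic a per-ACK AIMD rule (add 1/cwnd per ACK, halve on loss). That mapping, and the assumption that each update independently sees a loss with probability p, belong to this sketch and are not statements from the slides.

```python
def update_cwnd(cwnd, loss, a=1.0, b=0.5, n=-1.0, m=1.0):
    """One step of the window rule: cwnd += a * cwnd**n when there is no
    loss, cwnd -= b * cwnd**m when there is a loss."""
    if loss:
        return cwnd - b * cwnd ** m
    return cwnd + a * cwnd ** n


def expected_change(cwnd, p, a=1.0, b=0.5, n=-1.0, m=1.0):
    """Expected change per update if each update sees a loss with
    probability p (an assumption of this sketch)."""
    return (1 - p) * a * cwnd ** n - p * b * cwnd ** m


def equilibrium_cwnd(p, a=1.0, b=0.5, n=-1.0, m=1.0):
    """Window at which expected_change is zero; requires n < m."""
    return (a * (1 - p) / (b * p)) ** (1 / (m - n))


w_eq = equilibrium_cwnd(p=0.02)
print(w_eq, expected_change(w_eq, p=0.02))
```

With the AIMD defaults the equilibrium window is sqrt(2(1 - p) / p), the same square-root dependence on p as the throughput expression on the Simplified TCP-Reno slide.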

MIMD (n, m)

  Consider the controller

  where

  Then at equilibrium

  where α = m - n. For stability

Motivation

  Congestion Control: maximize user utility
    Given routing R_li, how to adapt end rate x_i?

  Traffic Engineering: minimize network congestion
    Given traffic x_i, how to perform routing R_li?

Congestion Control Model

\[ \max_{x} \; \sum_i U_i(x_i) \quad \text{s.t. } \sum_i R_{li} x_i \le c_l \]

  Objective: aggregate utility
  Constraints: capacity constraints
  x_i: source rate
  U_i(x_i): utility
  Users are indexed by i

Congestion control provides fair rate allocation amongst users

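Traffic Engineering Model

\[ \min_{R} \; \sum_l f(u_l) \quad \text{s.t. } u_l = \sum_i R_{li} x_i / c_l \]

  Objective: aggregate cost
  u_l: link utilization
  f(u_l): cost
  Links are indexed by l

Traffic engineering avoids bottlenecks in the network

  (Figure: cost f(u_l) against link utilization u_l, with u_l = 1 marked)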
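The quantities in the traffic engineering model are easy to compute for a given routing matrix. The sketch below assumes a 2-link, 3-user routing matrix, example rates, and a quadratic cost f(u) = u^2. All of these are hypothetical values picked for illustration, since the slides do not give a concrete f.

```python
def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [sum(r_li * x_i for r_li, x_i in zip(row, x)) / c_l
            for row, c_l in zip(R, c)]


def aggregate_cost(u, f):
    """Sum of the per-link cost f(u_l)."""
    return sum(f(u_l) for u_l in u)


R = [[1, 1, 0],      # link 0 carries users 0 and 1
     [1, 0, 1]]      # link 1 carries users 0 and 2
x = [0.2, 0.5, 0.3]  # source rates x_i
c = [1.0, 1.0]       # link capacities c_l

u = link_utilization(R, x, c)
cost = aggregate_cost(u, lambda u_l: u_l ** 2)
feasible = all(u_l <= 1 for u_l in u)   # same as the capacity constraint Rx <= c
print(u, cost, feasible)
```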

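Model of Internet Reality

  Congestion Control: max \sum_i U_i(x_i) s.t. \sum_i R_{li} x_i <= c_l
    takes routing R_li as given, produces rates x_i

  Traffic Engineering: min \sum_l f(u_l) s.t. u_l = \sum_i R_{li} x_i / c_l
    takes traffic x_i as given, produces routing R_li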

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark:

\[ \max_{x, R} \; \sum_i U_i(x_i) \quad \text{s.t. } Rx \le c \]

  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

The case against drop-tail queue management

  Large queues in routers are "a bad thing"
    Delay: end-to-end latency dominated by length of queues at switches in network
  Allowing queues to overflow is "a bad thing"
    Fairness: connections transmitting at high rates can starve connections transmitting at low rates
    Utilization: connections can synchronize their response to congestion

  (Figure: packets P1 to P6 at an FCFS scheduler)

Idea: early random packet drop

When queue length exceeds threshold, drop packets with queue-length-dependent probability
    probabilistic packet drop: flows see same loss rate
    problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized

  (Figure: packets P1 to P6 at an FCFS scheduler)

Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop
    avoid overly penalizing short-term bursts
    react to longer-term trends

  Tie drop prob. to weighted avg. queue length
    avoids over-reaction to mild overload conditions

  (Figure: average queue length over time, with min and max thresholds. Below the min threshold: no drop. Between the thresholds: probabilistic early drop. Above the max threshold, up to the max queue length: forced drop.)

  (Figure: drop probability against weighted average queue length, with min and max marked on the queue-length axis and max_p and 100% on the probability axis.)
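The drop rule described by the two figures can be sketched as follows. The parameter names and the default averaging weight are assumptions for this example. The sketch also leaves out refinements found in the published RED algorithm, such as spacing drops by the count of packets since the last drop.

```python
import random


def update_average(avg, queue_len, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg + weight * queue_len


def drop_probability(avg, min_th, max_th, max_p):
    """No drop below min_th, forced drop at or above max_th, and a
    linear ramp from 0 up to max_p in between."""
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)


def should_drop(avg, min_th, max_th, max_p, rng=random):
    """Randomized drop decision for one arriving packet."""
    return rng.random() < drop_probability(avg, min_th, max_th, max_p)


# A short burst barely moves the average, so it is not penalized.
avg_after_burst = update_average(0.0, queue_len=100)
print(avg_after_burst, drop_probability(10, min_th=5, max_th=15, max_p=0.1))
```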

RED summary: why random drop?

  Provide gentle transition from no-drop to all-drop
    Provide "gentle" early warning
  Avoid synchronized loss bursts among sources
  Provide same loss rate to all sessions
    With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters: nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed ...

Why is randomization important?

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source: Floyd, Jacobson 1994

Router update operation

  (State diagram)
    on timeout or link-fail update: prepare own routing update (time T_C)
    while preparing: receive update from neighbor, process (time T_C2)
    <ready>: send update (time T_d to arrive at dest), then start_timer (uniform T_p +/- T_r)
    wait: receive update from neighbor, process

  Time spent in state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis: time until routing update sent, relative to start of round

  By t = 100000 all router rounds are of length 120

  synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

  Choose random timer component T_r large (e.g. several multiples of T_C)

  Add enough randomization to avoid synchronization

  (State diagram repeated from "Router update operation": prepare own routing update (time T_C); receive update from neighbor, process (time T_C2); <ready> send update (time T_d to arrive); start_timer (uniform T_p +/- T_r); wait)
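The randomized timer is simple to express in code. The sketch below draws the next update delay uniformly from [T_p - T_r, T_p + T_r]. The numeric values of t_c, t_p, and the factor of 5 are hypothetical and only illustrate "several multiples of T_C".

```python
import random


def next_timer(t_p, t_r, rng=random):
    """Delay until the next periodic update, uniform on [t_p - t_r, t_p + t_r]."""
    return rng.uniform(t_p - t_r, t_p + t_r)


t_c = 0.1            # hypothetical time to prepare an update
t_p = 30.0           # hypothetical mean period between updates
t_r = 5 * t_c        # random component: several multiples of t_c

rng = random.Random(1)
delays = [next_timer(t_p, t_r, rng) for _ in range(1000)]
print(min(delays), max(delays))
```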

Randomization

  Takeaway message:
    randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand

  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers
    Self-interference: applications hurt their own performance
    Cross-interference: applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
  Applications say what they want
    OS manages resources and scheduling

  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?

  TCP does network resource management
    Need flow prioritization

  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy

  Question:
    Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal
    Congestion -> incr. queue lengths -> incr. RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice

  Basic algorithm
    1. Early detection: threshold queue length, incr. in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)

  per-ack operation:
    if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++

  per-round operation:
    if (numCong > f * W) W = W / 2
    else ... AIMD congestion control
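The per-ack and per-round operations above can be sketched as a small class. The class name, the default values of threshold and f, and the plain "+1 per round" stand-in for the AIMD branch are assumptions of this sketch. Loss handling and the pacing needed when the window falls below one packet are left out.

```python
import math


class NiceSender:
    """Sketch of the early-congestion logic: count ACKs whose RTT exceeds
    minRTT + threshold * (maxRTT - minRTT), and halve the window when
    more than a fraction f of the round's packets saw such delays."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=8.0):
        self.threshold = threshold
        self.fraction = fraction      # f in the per-round test
        self.cwnd = cwnd              # W; allowed to fall below 1.0
        self.min_rtt = math.inf
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack operation."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round operation."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2            # multiplicative decrease on early congestion
        else:
            self.cwnd += 1            # stand-in for the normal AIMD increase
        self.num_cong = 0


sender = NiceSender()
for rtt in [0.1] * 7 + [0.3]:         # mostly uncongested round
    sender.on_ack(rtt)
sender.on_round_end()
after_quiet = sender.cwnd

for rtt in [0.25] * 9:                # every ACK shows queueing delay
    sender.on_ack(rtt)
sender.on_round_end()
after_congested = sender.cwnd
print(after_quiet, after_congested)
```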

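Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

  minRTT = τ, maxRTT = τ + B/μ

  (Figure: queue length in pkts over time for a buffer of size B served at rate μ, with levels B and tB marked; Reno and Nice traces are annotated with their additive-increase (Add) and multiplicative-decrease (Mul) phases)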

Network Conditions

  (Figure: foreground document latency (sec) against spare capacity, on log scales, for Reno, Vegas, V0, Nice, and Router Prio)

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

  (Figure: document latency (sec) against number of BG flows, on log scales, for Vegas, V0, Nice, Router Prio, and Reno)

  W < 1 allows Nice to scale to any number of background flows

Utilization

  (Figure: BG throughput (KB) against number of BG flows for Router Prio, Vegas, V0, Reno, and Nice)

  Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?

  How to model the interaction between TCP and the network?
    Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g. bandwidth) to optimize some objective function
  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
    systematic approach towards protocol design

Model

  Network: links l, each of capacity c_l
  Sources s: (L(s), U_s(x_s))
    L(s) - links used by source s
    U_s(x_s) - utility if source rate = x_s

  (Figure: two links with capacities c_1 and c_2 carrying rates x_1, x_2, and x_3)

\[ x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2 \]

  (Figure: U_s(x_s) against x_s, an example utility function for an elastic application)

  Q: What are possible allocations with, say, unit-capacity links?
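One way to explore the question is to enumerate allocations that use both links fully. The sketch below assumes unit capacities by default and parameterizes those allocations by the rate x_1 of the source that appears in both constraints. The function names are illustrative and not part of the lecture.

```python
def efficient_allocation(x1, c1=1.0, c2=1.0):
    """Give the two-link source x1 and let the one-link sources fill
    whatever is left on their own link."""
    if not 0.0 <= x1 <= min(c1, c2):
        raise ValueError("x1 must lie between 0 and the smaller capacity")
    return (x1, c1 - x1, c2 - x1)


def is_feasible(x, c1=1.0, c2=1.0, eps=1e-12):
    """Check x1 + x2 <= c1, x1 + x3 <= c2, and nonnegativity."""
    x1, x2, x3 = x
    return min(x) >= 0 and x1 + x2 <= c1 + eps and x1 + x3 <= c2 + eps


for x1 in (0.0, 1 / 3, 0.5, 1.0):
    x = efficient_allocation(x1)
    print(x, "total rate:", sum(x))
```

The total rate c_1 + c_2 - x_1 is largest when the two-link source gets nothing, so maximizing total throughput and treating sources evenly pull in different directions here.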

Optimization Problem

\[ \max_{x_s \ge 0} \; \sum_s U_s(x_s) \quad \text{subject to } \sum_{s \in S(l)} x_s \le c_l, \; \forall l \in L \]

  ("system" problem)

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
  can we view congestion control as distributed asynchronous algorithms to solve this problem?




Random early detection (RED) packet drop

  Use exponential average of queue length to determine when to drop  avoid overly penalizing short-term bursts   react to longer term trends

  Tie drop prob to weighted avg queue length  avoids over-reaction to mild overload conditions

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

Random early detection (RED) packet drop

Max threshold

Min threshold

Average queue length

Forced drop

Probabilistic early drop

No drop

Time

Drop probability Max

queue length

10013

Drop probability

maxp13

Weighted AverageQueue Length

min13 max13

RED summary why random drop

  Provide gentle transition from no-drop to all-drop  Provide ldquogentlerdquo early warning  Avoid synchronized loss bursts among

sources   Provide same loss rate to all sessions

 With tail-drop low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed hellip

Why randomization important

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source Floyd Jacobson 1994

Router update operation

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive at dest)

start_timer (uniform Tp +- Tr)

timeout or link fail

update

time spent in state depends on msgs

received from others (weak coupling

between routers processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization   Choose random

timer component Tr large (eg several multiples of TC)

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough

randomization to avoid

synchronization

Randomization

  Takeaway message  randomization makes a system simple and

robust

Background transport TCP Nice

What are background transfers

  Data that humans are not waiting for   Non-deadline-critical   Unlimited demand

  Examples  Prefetched traffic on the Web  File system backup  Large-scale data distribution services  Background software updates  Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers  Self-interference

bull  applications hurt their own performance  Cross-interference

bull  applications hurt other applicationsrsquo performance

TCP Nice

  Goal abstraction of free infinite bandwidth   Applications say what they want

 OS manages resources and scheduling

  Self tuning transport layer  Reduces risk of interference with foreground

traffic  Significant utilization of spare capacity by

background traffic  Simplifies application design

Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal  Congestion incr queue lengths incr RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control  Receiver and network unchanged  TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

Nice the works

  Non-interference getting out of the way in time   Utilization maintaining a small queue

pkts

minRTT = τ13 maxRTT = τ+Βmicro13

B

tB Add Mul +

micro

Reno

Nice Add Add Add

Mul +

Mul +

Network Conditions

01

1

10

100

1e3

1 10 100 Fore

grou

nd D

ocum

ent L

aten

cy (s

ec)

Spare Capacity

Reno

Vegas

V0

Nice

Router Prio

  Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function
  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
  Systematic approach towards protocol design

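Model

  Network: links l, each of capacity c_l
  Sources s: (L(s), U_s(x_s))
    L(s): links used by source s
    U_s(x_s): utility if source rate = x_s

[Figure: two links with capacities c_1 and c_2; source 1 (rate x_1) uses both links, source 2 (rate x_2) uses the first link, and source 3 (rate x_3) uses the second]

$$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$$

[Figure: U_s(x_s) vs. x_s, an example utility function for an elastic application]

  Q: What are possible allocations with, say, unit-capacity links?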
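To make the question concrete, here is a small sketch for the three-source, two-link example with unit capacities. It relies on an assumption that holds for any increasing utility: both links are full at the optimum, so x_2 = x_3 = 1 - x_1 and only x_1 needs to be searched. The function name `best_x1` and the ternary search are illustrative choices. For comparison, maximum total throughput gives (0, 1, 1) and max-min fairness gives (1/2, 1/2, 1/2).

```python
import math


def best_x1(utility, lo=1e-9, hi=1.0 - 1e-9, iters=200):
    """Ternary search for the x1 maximizing U(x1) + 2*U(1 - x1) (a concave objective)."""
    def total(x1):
        return utility(x1) + 2.0 * utility(1.0 - x1)

    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if total(a) < total(b):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2.0


x1_log = best_x1(math.log)                 # log utility (proportional fairness)
x1_delay = best_x1(lambda x: -1.0 / x)     # -1/x utility (minimum potential delay)
print(round(x1_log, 4), round(x1_delay, 4))
```

With log utility the long flow gets 1/3 and each short flow 2/3; with U(x) = -1/x the long flow gets 1/(1 + sqrt(2)), about 0.414.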

Optimization Problem

$$\max_{x_s \ge 0} \; \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \quad \forall\, l \in L$$

"system" problem

  Maximize system utility (note: all sources "equal")
  Constraint: bandwidth used less than capacity
  Centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
  Can we view congestion control as distributed asynchronous algorithms to solve this problem?

The user view

  User can choose amount to pay per unit time, w_s
  Would like allocated bandwidth x_s in proportion to w_s:

$$x_s = \frac{w_s}{p_s}$$

  p_s could be viewed as the charge per unit flow for user s

$$\max_{w_s \ge 0} \; U_s\left(\frac{w_s}{p_s}\right) - w_s$$

  (first term: user's utility; second term: cost)

user problem

The network view

  Suppose network knows vector {w_s}, chosen by users
  Network wants to maximize logarithmic utility function

$$\max_{x_s \ge 0} \; \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

network problem

Solution existence

  There exist prices p_s, source rates x_s, and amounts to pay per unit time w_s = p_s x_s such that
    {w_s} solves the user problem
    {x_s} solves the network problem
    {x_s} is the unique solution to the system problem

User problem:

$$\max_{w_s \ge 0} \; U_s\left(\frac{w_s}{p_s}\right) - w_s$$

Network problem:

$$\max_{x_s \ge 0} \; \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

System problem:

$$\max_{x_s \ge 0} \; \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \quad \forall\, l \in L$$

Proportional Fairness

  Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s^*},

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

  Result: if w_s = 1, then {x_s} solves the network problem if and only if it is proportionally fair

  A similar result exists for the case that w_s is not equal to 1
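As a sanity check of the definition, the sketch below samples feasible rate vectors for the three-source, two-link example with unit capacities (x_1 + x_2 <= 1, x_1 + x_3 <= 1) and confirms that none of them improves on the candidate (1/3, 2/3, 2/3) in aggregate proportional terms. The sampling scheme and function names are illustrative assumptions.

```python
import random

x = (1 / 3, 2 / 3, 2 / 3)   # candidate proportionally fair rates


def feasible_sample(rng):
    """Draw rates satisfying y1 + y2 <= 1 and y1 + y3 <= 1."""
    y1 = rng.random()
    y2 = rng.uniform(0.0, 1.0 - y1)
    y3 = rng.uniform(0.0, 1.0 - y1)
    return (y1, y2, y3)


def proportional_change(y, x):
    """Sum over sources of (y_s - x_s) / x_s."""
    return sum((ys - xs) / xs for ys, xs in zip(y, x))


rng = random.Random(0)
worst = max(proportional_change(feasible_sample(rng), x) for _ in range(10000))
```

Here the sum equals 3*y1 + 1.5*(y2 + y3) - 3, which the two capacity constraints bound above by 0, so `worst` never exceeds zero apart from floating-point rounding.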

Max-min Fairness

Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
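One common way to compute a max-min fair allocation is progressive filling: raise all rates together and freeze a source as soon as one of its links fills. The sketch below is a minimal version under the assumption that every source uses at least one link; the function and variable names are hypothetical.

```python
def max_min_fair(routes, capacity):
    """Progressive filling: raise all unfrozen rates equally until a link saturates."""
    rates = {s: 0.0 for s in routes}
    active = set(routes)
    residual = dict(capacity)
    while active:
        # Largest equal increment that every link carrying active sources can absorb.
        step = min(
            residual[l] / sum(1 for s in active if l in routes[s])
            for l in residual
            if any(l in routes[s] for s in active)
        )
        for s in active:
            rates[s] += step
        for l in residual:
            residual[l] -= step * sum(1 for s in active if l in routes[s])
        # Freeze every source that crosses a saturated link.
        saturated = {l for l in residual if residual[l] <= 1e-12}
        active = {s for s in active if not (routes[s] & saturated)}
    return rates


routes = {"s1": {"l1", "l2"}, "s2": {"l1"}, "s3": {"l2"}}
print(max_min_fair(routes, {"l1": 1.0, "l2": 1.0}))
```

With unit capacities all three sources end at rate 1/2. If the second link had capacity 2, the source using only that link would continue to 1.5 after the other two freeze at 1/2.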

Minimum potential delay fairness

  Rates {x_r} are minimum potential delay fair if U_r(x_r) = -w_r / x_r

Interpretation: if w_r is the file size, then w_r / x_r is the transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$

Solving the network problem

  Results so far: existence (a solution exists with the given properties)
  How to compute the solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  Left-hand side: change in bandwidth allocation at s
  k w_s: linear increase
  k x_s(t) Σ_{l∈L(s)} p_l(t): multiplicative decrease
  p_l(t): congestion "signal", a function of the aggregate rate at link l, fed back to s

where

$$p_l(t) = g_l \left( \sum_{s : l \in L(s)} x_s(t) \right)$$

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  Results
    converges to the solution of a relaxation of the network problem
    x_s(t) Σ_{l∈L(s)} p_l(t) converges to w_s

  Interpretation: a TCP-like algorithm that iteratively solves for the optimal rate allocation
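The convergence claim can be explored numerically with a forward-Euler simulation of the rate equation on the three-source, two-link example. Everything below is a sketch under stated assumptions: the price function g_l(y) = (y / c_l)^beta, the gain k, the step size dt, and the initial rates are illustrative choices, not values from the lecture.

```python
routes = {0: (0, 1), 1: (0,), 2: (1,)}   # source -> links used, L(s)
w = [1.0, 1.0, 1.0]                      # amounts to pay per unit time, w_s
c = [1.0, 1.0]                           # link capacities c_l
k, dt, beta = 0.1, 0.01, 10              # gain, Euler step, price steepness (all assumed)


def link_prices(x):
    """p_l = g_l(aggregate rate on link l), with the assumed g_l(y) = (y / c_l) ** beta."""
    load = [sum(x[s] for s in routes if l in routes[s]) for l in range(len(c))]
    return [(load[l] / c[l]) ** beta for l in range(len(c))]


def path_prices(x):
    """Sum of link prices along each source's path."""
    p = link_prices(x)
    return [sum(p[l] for l in routes[s]) for s in routes]


def simulate(steps=20000):
    x = [0.1, 0.1, 0.1]
    for _ in range(steps):
        q = path_prices(x)
        # dx_s/dt = k * (w_s - x_s * path price): linear increase, multiplicative decrease
        x = [x[s] + dt * k * (w[s] - x[s] * q[s]) for s in routes]
    return x, path_prices(x)


x, q = simulate()
```

At the fixed point each product `x[s] * q[s]` matches `w[s]`. Because the assumed price function is a soft penalty rather than a hard capacity constraint, the rates settle near, but not exactly at, the proportionally fair point (1/3, 2/3, 2/3), which is the "relaxation" in the result above.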

Source Algorithm

$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$

  Source needs only its path price q_r
  k_r(·): nonnegative, nondecreasing function
  Above algorithm converges to a unique solution for any initial condition
  q_r interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by
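The two equations on this slide did not survive extraction. One reading that is consistent with the rest of the lecture, offered as an assumption rather than as the slide's actual content, takes the weighted logarithmic utility of the network problem and a source algorithm of the form dx_r/dt = k_r(x_r)(U_r'(x_r) - q_r), with the gain chosen as k_r(x_r) = k x_r:

```latex
U_r(x_r) = w_r \log x_r
\quad\Longrightarrow\quad
U_r'(x_r) = \frac{w_r}{x_r}

\dot{x}_r
  = k_r(x_r)\left(\frac{w_r}{x_r} - q_r\right)
  = k\left(w_r - x_r q_r\right)
  \qquad \text{with } k_r(x_r) = k\,x_r
```

Under these assumptions the controller has the same form as the rate equation used for solving the network problem, with q_r the sum of link prices along the path of source r.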

Pricing interpretation

  Can the network choose a pricing scheme to achieve fair resource allocation?

  Suppose the network charges price q_r ($/bit), where q_r = Σ_{l∈L(r)} p_l

  User's strategy: spend w_r ($/sec) to maximize

Optimal User Strategy

  equivalently

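Simplified TCP-Reno

  suppose

$$x = \frac{1}{T} \sqrt{\frac{2(1-p)}{p}} \approx \frac{1}{T} \sqrt{\frac{2}{p}}$$

  then, since U'(x) = p at equilibrium and p ≈ 2 / (T² x²),

$$U(x) = -\frac{2}{T^2 x}$$

  interpretation: minimize (weighted) delay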

Is AIMD special?

  Consider a window control as follows
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m

  Expected change in congestion window

  Expected change in rate per unit time
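The expressions for the expected change are missing from the transcript. The sketch below works out one plausible version, assuming the update is applied once per ACK and each packet is lost independently with probability p; the function names and the sample numbers are illustrative.

```python
import math


def expected_window_change(cwnd, p, a, b, n, m):
    """Expected per-ACK change in cwnd under independent loss probability p (assumed model)."""
    return (1.0 - p) * a * cwnd ** n - p * b * cwnd ** m


def equilibrium_window(p, a, b, n, m):
    """Window where the expected change is zero: cwnd^(m-n) = a(1-p) / (b p)."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))


# AIMD in per-ACK form: +1/cwnd per ACK (n = -1), -cwnd/2 per loss (m = 1).
w_star = equilibrium_window(p=0.01, a=1.0, b=0.5, n=-1, m=1)
```

For the AIMD parameters the equilibrium window is sqrt(2(1-p)/p), which is the familiar inverse-square-root dependence on the loss rate; the condition n < m keeps the exponent 1/(m - n) positive, so the window shrinks as loss grows.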

MIMD (n, m)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

  Congestion Control: maximize user utility
    Given routing R_li, how to adapt end rate x_i?

  Traffic Engineering: minimize network congestion
    Given traffic x_i, how to perform routing R_li?

Congestion Control Model

$$\max_{x} \; \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$

  Objective: aggregate utility
  Source rate: x_i
  Utility: U_i(x_i)
  Constraints: link capacities
  Users are indexed by i

  Congestion control provides fair rate allocation amongst users

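Traffic Engineering Model

$$\min_{R} \; \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \frac{\sum_i R_{li} x_i}{c_l}$$

  Objective: aggregate cost
  Link utilization: u_l
  Cost: f(u_l)
  Links are indexed by l

  Traffic engineering avoids bottlenecks in the network

[Figure: cost f(u_l) vs. link utilization u_l, with u_l = 1 marked]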
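A small sketch of the traffic engineering objective: given a routing matrix R, rates x, and capacities c, compute each link's utilization and the aggregate cost. The lecture does not specify f, so the exponential penalty used here is an assumed convex, increasing choice, and the two-link example is hypothetical.

```python
import math


def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [sum(R_l[i] * x[i] for i in range(len(x))) / c_l for R_l, c_l in zip(R, c)]


def total_cost(R, x, c, f=math.exp):
    """Aggregate cost sum_l f(u_l); f = exp is an assumed convex penalty."""
    return sum(f(u) for u in link_utilization(R, x, c))


x = [1.0]                      # one user sending 1 unit of traffic
c = [2.0, 2.0]                 # two parallel links
single_path = [[1.0], [0.0]]   # all traffic on link 0
split_path = [[0.5], [0.5]]    # traffic split evenly across both links
print(total_cost(single_path, x, c), total_cost(split_path, x, c))
```

With a convex f, splitting the traffic lowers the aggregate cost relative to loading one link, which is the sense in which this objective steers routing away from bottlenecks.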

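Model of Internet Reality

  Congestion Control (takes routing R_li, outputs rates x_i):

$$\max_{x} \; \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$

  Traffic Engineering (takes rates x_i, outputs routing R_li):

$$\min_{R} \; \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \frac{\sum_i R_{li} x_i}{c_l}$$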

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark:

$$\max_{x,\, R} \; \sum_i U_i(x_i) \quad \text{s.t.} \quad R x \le c$$

  Utility gap between the joint system and the benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

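Random early detection (RED) packet drop

[Figure: average queue length over time, with three regions: no drop below the min threshold, probabilistic early drop between the min and max thresholds, and forced drop between the max threshold and the max queue length]

[Figure: drop probability vs. weighted average queue length: zero below min, rising to maxp at max, then 100%]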
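The drop-probability curve described above can be sketched as a piecewise function of the weighted average queue length. This is a simplified illustration: it assumes a linear ramp between the thresholds, omits the adjustment real RED makes based on the number of packets since the last drop, and uses hypothetical parameter names and an assumed averaging weight.

```python
import random


def red_drop_probability(avg_q, min_th, max_th, max_p):
    """Drop probability as a function of the weighted average queue length."""
    if avg_q < min_th:
        return 0.0                                        # no drop
    if avg_q >= max_th:
        return 1.0                                        # forced drop
    return max_p * (avg_q - min_th) / (max_th - min_th)   # probabilistic early drop


def update_average(avg_q, cur_q, weight=0.002):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - weight) * avg_q + weight * cur_q


def should_drop(avg_q, min_th, max_th, max_p, rng=random):
    """Randomly decide whether to drop an arriving packet."""
    return rng.random() < red_drop_probability(avg_q, min_th, max_th, max_p)
```

Averaging the queue length lets short bursts pass without drops, while the random decision spreads early drops across flows instead of hitting them all at once.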

RED summary: why random drop?

  Provide gentle transition from no-drop to all-drop
    Provide "gentle" early warning
  Avoid synchronized loss bursts among sources
  Provide same loss rate to all sessions
    With tail-drop, low-sending-rate sessions can be completely starved

Random early detection (RED) today

  Many (5) parameters: nontrivial to tune (at least for HTTP traffic)

  Gains over drop-tail FCFS not that significant

  Still not widely deployed …

Why is randomization important?

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

Source: Floyd and Jacobson, 1994

Router update operation

[Figure: state diagram of a router's periodic update cycle]
  prepare own routing update (time TC)
  receive update from neighbor: process (time TC2)
  wait; on receiving an update from a neighbor, process it
  <ready>: send update (time Td to arrive at dest), then start_timer (uniform Tp +- Tr)
  timeout or link fail: update

  Time spent in a state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis: time until routing update sent, relative to start of round

  By t = 100000, all router rounds are of length 120

  Synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

  Choose random timer component Tr large (e.g., several multiples of TC)
  Add enough randomization to avoid synchronization

[Figure: the router update state diagram again: prepare own routing update (time TC); receive update from neighbor, process (time TC2); wait; <ready> send update (time Td to arrive), start_timer (uniform Tp +- Tr)]
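The timer step start_timer (uniform Tp +- Tr) can be sketched as a single draw from a uniform distribution. The function name and the sample values of Tp and Tr below are hypothetical.

```python
import random


def next_update_delay(t_p, t_r, rng=random):
    """Draw the next routing-update timer uniformly from [t_p - t_r, t_p + t_r]."""
    return rng.uniform(t_p - t_r, t_p + t_r)


rng = random.Random(1)
delays = [next_update_delay(30.0, 5.0, rng) for _ in range(1000)]
```

A larger `t_r` spreads the routers' timers further apart each round, which counteracts the weak coupling that otherwise pulls their update times together.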

Randomization

  Takeaway message
    randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand

  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers
    Self-interference: applications hurt their own performance
    Cross-interference: applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
    Applications say what they want
    OS manages resources and scheduling

  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?

  TCP does network resource management
    Need flow prioritization

  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy

  Question
    Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal  Congestion incr queue lengths incr RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control  Receiver and network unchanged  TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

Nice the works

  Non-interference getting out of the way in time   Utilization maintaining a small queue

pkts

minRTT = τ13 maxRTT = τ+Βmicro13

B

tB Add Mul +

micro

Reno

Nice Add Add Add

Mul +

Mul +

Network Conditions

01

1

10

100

1e3

1 10 100 Fore

grou

nd D

ocum

ent L

aten

cy (s

ec)

Spare Capacity

Reno

Vegas

V0

Nice

Router Prio

  Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function?
  Maybe not possible to obtain exact optimality, but
     optimization framework as means to explicitly steer network towards desirable operating point
     practical congestion control as distributed asynchronous implementations of optimization algorithm
  systematic approach towards protocol design

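Model
  Network: Links l, each of capacity c_l
  Sources s: (L(s), U_s(x_s))
     L(s) - links used by source s
     U_s(x_s) - utility if source rate = x_s

[Figure: two links in series with capacities c1 and c2; source 1 (rate x1) uses both links, source 2 (rate x2) uses the first, source 3 (rate x3) uses the second]

\[ x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2 \]

[Figure: U_s(x_s) versus x_s, an example utility function for an elastic application]

Q: What are possible allocations with, say, unit capacity links?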
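One way to explore the question is to compare a few feasible allocations on the two-link model (x1 + x2 <= c1, x1 + x3 <= c2) with c1 = c2 = 1. The sketch below is illustrative only: the function names, the three candidate allocations, and the use of a sum-of-logs objective searched on a grid are assumptions chosen for the example.

```python
import math


def feasible(x1, x2, x3, c1=1.0, c2=1.0, eps=1e-12):
    """Capacity constraints of the two-link model, with a small float tolerance."""
    return min(x1, x2, x3) >= 0 and x1 + x2 <= c1 + eps and x1 + x3 <= c2 + eps


def best_log_utility_split(steps=1000):
    """Grid-search x1; x2 and x3 each take what their unit-capacity link has left."""
    best_x1, best_value = None, -math.inf
    for i in range(1, steps):
        x1 = i / steps
        value = math.log(x1) + 2 * math.log(1 - x1)   # log x1 + log x2 + log x3
        if value > best_value:
            best_x1, best_value = x1, value
    return best_x1, 1 - best_x1, 1 - best_x1


allocations = {
    "max total throughput": (0.0, 1.0, 1.0),
    "equal rates": (0.5, 0.5, 0.5),
    "max sum of logs (grid search)": best_log_utility_split(),
}
for name, x in allocations.items():
    print(f"{name}: x = {x}, feasible = {feasible(*x)}, total = {sum(x):.3f}")
```

Setting the derivative of log x1 + 2 log(1 - x1) to zero gives x1 = 1/3, so the grid search should land next to (1/3, 2/3, 2/3). The three candidates trade total throughput against the rate given to the two-link source.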

Optimization Problem

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
     must know all utility functions
     impractical for large number of sources
  can we view congestion control as distributed asynchronous algorithms to solve this problem?

\[ \max_{x_s \ge 0} \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L \]

"system" problem

The user view

  User can choose amount to pay per unit time, w_s
  Would like allocated bandwidth x_s in proportion to w_s:

\[ x_s = \frac{w_s}{p_s} \]

  p_s could be viewed as charge per unit flow for user s

\[ \max_{w_s \ge 0} \; U_s\!\left( \frac{w_s}{p_s} \right) - w_s \]

  first term: user's utility; second term: cost

user problem

The network view

  Suppose network knows vector {w_s}, chosen by users
  Network wants to maximize logarithmic utility function

\[ \max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \]

network problem

Solution existence

  There exist prices p_s, source rates x_s, and amounts to pay per unit time w_s = p_s x_s such that
     {w_s} solves the user problem
\[ \max_{w_s \ge 0} \; U_s\!\left( \frac{w_s}{p_s} \right) - w_s \]
     {x_s} solves the network problem
\[ \max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \]
     {x_s} is the unique solution to the system problem
\[ \max_{x_s \ge 0} \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L \]

Proportional Fairness

  Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s^*},

\[ \sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0 \]

  Result: if w_s = 1, then {x_s} solves the network problem IFF it is proportionally fair

  Similar result exists for the case that w_s is not equal to 1
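The proportional-fairness inequality can be checked numerically on the lecture's two-link model (x1 + x2 <= c1, x1 + x3 <= c2), assuming unit capacities. The sketch below is illustrative: the function names and the grid of competing feasible vectors are assumptions, and a grid check is evidence rather than proof.

```python
from itertools import product


def proportional_gap(x, y):
    """Sum over sources of (y_s - x_s) / x_s; must be <= 0 for every feasible y."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))


def is_proportionally_fair(x, steps=20, tol=1e-9):
    """Test x against a grid of feasible vectors y on the unit-capacity two-link model."""
    for i, j, k in product(range(steps + 1), repeat=3):
        if i + j <= steps and i + k <= steps:   # y1 + y2 <= 1 and y1 + y3 <= 1
            y = (i / steps, j / steps, k / steps)
            if proportional_gap(x, y) > tol:
                return False
    return True


print(is_proportionally_fair((1 / 3, 2 / 3, 2 / 3)))   # maximizer of the sum of logs
print(is_proportionally_fair((0.5, 0.5, 0.5)))         # equal rates
```

For x = (1/3, 2/3, 2/3) the gap equals 1.5 (y1 + y2) + 1.5 (y1 + y3) - 3, which cannot exceed 0 under the capacity constraints. For equal rates, y = (0, 1, 1) gives a gap of +1, so that allocation is not proportionally fair.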

Max-min Fairness

  Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p <= x_s and y_p < x_p

Minimum potential delay fairness

  Rates {x_r} are minimum potential delay fair if U_r(x_r) = -w_r / x_r

  Interpretation: if w_r is file size, then w_r / x_r is transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

  Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p <= x_s and y_p < x_p

  What is the corresponding utility function?

\[ U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha} \]
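A max-min fair allocation can also be computed directly by progressive filling, a standard technique that is not taken from the slides. The sketch below is a minimal version under stated assumptions: the function name and the dictionary-based inputs are hypothetical, and every source is assumed to use at least one link.

```python
def max_min_rates(routes, capacity):
    """Progressive filling.

    routes:   {source: set of links it uses}
    capacity: {link: capacity c_l}
    Returns   {source: max-min fair rate}.
    """
    rates = {}
    remaining = dict(capacity)       # capacity not yet claimed by frozen sources
    active = set(routes)             # sources whose rate is still undecided
    while active:
        # Equal share each link could offer its still-active sources.
        share = {}
        for link, cap in remaining.items():
            users = [s for s in active if link in routes[s]]
            if users:
                share[link] = cap / len(users)
        bottleneck = min(share, key=share.get)
        level = share[bottleneck]
        # Freeze every active source crossing the bottleneck at that share.
        for s in [s for s in active if bottleneck in routes[s]]:
            rates[s] = level
            active.remove(s)
            for link in routes[s]:
                remaining[link] -= level
    return rates


routes = {"s1": {"a", "b"}, "s2": {"a"}, "s3": {"b"}}
print(max_min_rates(routes, {"a": 1.0, "b": 1.0}))
print(max_min_rates(routes, {"a": 1.0, "b": 2.0}))
```

With unit capacities every source ends at 0.5. Raising the second link to capacity 2 leaves s1 and s2 at 0.5 (link a is still their bottleneck) and lets s3 take the remaining 1.5.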

Solving the network problem
  Results so far: existence - solution exists with given properties
  How to compute solution?
     Ideally: distributed solution easily embodied in protocol
     Should reveal insight into existing protocol

Solving the network problem

\[ \frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right) \]

  left-hand side: change in bandwidth allocation at s
  w_s term: linear increase
  x_s(t) \sum_l p_l(t) term: multiplicative decrease
  p_l(t): congestion "signal", a function of the aggregate rate at link l, fed back to s

where

\[ p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right) \]

Solving the network problem

\[ \frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right) \]

  Results
     converges to solution of relaxation of network problem
     x_s(t) \sum_l p_l(t) converges to w_s

  Interpretation: TCP-like algorithm that iteratively solves the optimal rate allocation
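The dynamics d/dt x_s = k (w_s - x_s * sum of p_l) can be simulated with a simple Euler loop to watch x_s times its path price approach w_s. The sketch below is illustrative only: the lecture leaves the price function g_l abstract, so g_l(y) = (y / c_l)^beta is an assumption, as are the function names, the step size, and the starting rates.

```python
def path_prices(x, routes, capacity, beta=4):
    """Path price of each source: sum of link prices p_l = g_l(load on l)."""
    load = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
    # Assumed price function g_l(y) = (y / c_l) ** beta (not from the lecture).
    price = {l: (load[l] / capacity[l]) ** beta for l in capacity}
    return {s: sum(price[l] for l in routes[s]) for s in routes}


def simulate_primal(routes, capacity, w, k=1.0, dt=0.01, steps=5000, beta=4):
    """Euler-integrate dx_s/dt = k * (w_s - x_s * path price of s)."""
    x = {s: 0.1 for s in routes}   # arbitrary positive starting rates
    for _ in range(steps):
        q = path_prices(x, routes, capacity, beta)
        x = {s: x[s] + dt * k * (w[s] - x[s] * q[s]) for s in routes}
    return x


routes = {"s1": {"a", "b"}, "s2": {"a"}, "s3": {"b"}}
capacity = {"a": 1.0, "b": 1.0}
w = {"s1": 1.0, "s2": 1.0, "s3": 1.0}
x = simulate_primal(routes, capacity, w)
q = path_prices(x, routes, capacity)
for s in sorted(routes):
    print(s, round(x[s], 4), round(x[s] * q[s], 4))   # second number should approach w_s
```

At the fixed point x_s * q_s = w_s for every source. In this example the two-link source sees twice the path price of the single-link sources and so settles at half their rate, the same 1:2 ratio as the proportionally fair split, although the soft price function lets the total load sit slightly away from capacity.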

Source Algorithm

  Source needs only its path price:

\[ \dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right) \]

  k_r() is a nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  q_r interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by
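The two equations on this slide did not survive extraction. One reading that is consistent with the rest of the lecture, offered as an assumption rather than a reconstruction of the slide, takes the weighted logarithmic utility of the network problem and substitutes it into the source algorithm \dot{x}_r = k_r(x_r)(U_r'(x_r) - q_r). The gain choice k_r(x_r) = \kappa_r x_r is also an assumption.

```latex
U_r(x_r) = w_r \log x_r
\quad\Longrightarrow\quad
U_r'(x_r) = \frac{w_r}{x_r}

\dot{x}_r = k_r(x_r)\left( \frac{w_r}{x_r} - q_r \right)
\;\overset{k_r(x_r) = \kappa_r x_r}{=}\;
\kappa_r \left( w_r - x_r q_r \right)
```

With q_r equal to the sum of link prices on the path, the last form matches the earlier dynamics d/dt x_s = k (w_s - x_s \sum_l p_l), whose equilibrium satisfies x_r q_r = w_r.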

Pricing interpretation

  Can the network choose a pricing scheme to achieve fair resource allocation?

  Suppose network charges price q_r ($/bit), where q_r = \sum_l p_l

  User's strategy: spend w_r ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

\[ x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}} \]

  then

\[ U(x) = -\frac{1}{T x} \]

  interpretation: minimize (weighted) delay

Is AIMD special?

  Consider a window control as follows:
     cwnd += a*cwnd^n when no loss
     cwnd -= b*cwnd^m when loss
     where n < m

  Expected change in congestion window

  Expected change in rate per unit time
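For the generalized window rule, the expected per-update change and the window at which it vanishes can be written down directly. The sketch below is illustrative: it assumes each update independently sees a loss with probability p, and the function names are hypothetical.

```python
import math


def expected_window_drift(w, p, a, b, n, m):
    """Expected change in cwnd per update: +a*w^n w.p. (1-p), -b*w^m w.p. p."""
    return (1 - p) * a * w ** n - p * b * w ** m


def equilibrium_window(p, a, b, n, m):
    """Window at which the expected drift is zero (requires n < m and 0 < p < 1)."""
    return (a * (1 - p) / (b * p)) ** (1.0 / (m - n))


# Reno-like AIMD per ack: cwnd += 1/cwnd (a=1, n=-1), cwnd -= cwnd/2 (b=0.5, m=1).
p = 0.01
w_eq = equilibrium_window(p, a=1.0, b=0.5, n=-1, m=1)
print(w_eq, math.sqrt(2 * (1 - p) / p))
print(expected_window_drift(w_eq, p, a=1.0, b=0.5, n=-1, m=1))
```

With these AIMD parameters the equilibrium window is sqrt(2(1-p)/p). Dividing by the round-trip time T gives the rate used on the simplified TCP-Reno slide.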

MIMD (n,m)

  Consider the controller

  where

  Then at equilibrium

  where α = m - n. For stability

Motivation

  Congestion Control: maximize user utility
     Given routing R_li, how to adapt end rate x_i?

  Traffic Engineering: minimize network congestion
     Given traffic x_i, how to perform routing R_li?

Congestion Control Model

\[ \max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l \]

  objective: aggregate utility
  constraints: capacity constraints
  Users are indexed by i

[Figure: utility U_i(x_i) versus source rate x_i]

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

\[ \min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l \]

  objective: aggregate cost
  Links are indexed by l

[Figure: cost f(u_l) versus link utilization u_l, with u_l = 1 marked]

Traffic engineering avoids bottlenecks in the network
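The traffic engineering objective can be evaluated for a given routing matrix and rate vector with a few lines of Python. This is a sketch under assumptions: the lecture does not specify f, so the convex cost u / (1 - u), which grows without bound as u approaches 1, is a stand-in, and the function names are hypothetical.

```python
def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [sum(r_li * x_i for r_li, x_i in zip(row, x)) / c_l
            for row, c_l in zip(R, c)]


def example_cost(u):
    # Assumed cost function, not from the lecture; defined for 0 <= u < 1.
    return u / (1.0 - u)


def aggregate_cost(R, x, c, f=example_cost):
    """Traffic engineering objective: sum over links of f(u_l)."""
    return sum(f(u) for u in link_utilization(R, x, c))


# Two links, three users: user 0 crosses both links, users 1 and 2 one link each.
R = [[1, 1, 0],
     [1, 0, 1]]
x = [0.2, 0.5, 0.3]
c = [1.0, 1.0]
print(link_utilization(R, x, c))
print(aggregate_cost(R, x, c))
```

Here R is held fixed and only evaluated. In the model above R is the decision variable, so a traffic engineering step would search over routings to lower this cost for the current rates.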

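Model of Internet Reality

[Figure: the two problems coupled through the rates x_i and the routing R_li]

  Congestion Control: \max_{x} \sum_i U_i(x_i) \ \text{s.t.} \ \sum_i R_{li} x_i \le c_l

  Traffic Engineering: \min_{R} \sum_l f(u_l) \ \text{s.t.} \ u_l = \sum_i R_{li} x_i / c_l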

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark:

\[ \max_{x, R} \sum_i U_i(x_i) \quad \text{s.t.} \quad R x \le c \]

  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Why is randomization important?

  Synchronization of periodic routing updates

  Periodic losses observed in end-end Internet traffic

source: Floyd, Jacobson 1994

Router update operation

[State diagram]
  prepare own routing update (time TC)
  receive update from neighbor, process (time TC2)
  wait
  receive update from neighbor, process
  <ready> send update (time Td to arrive at dest)
  start_timer (uniform: Tp +/- Tr)
  timeout or link fail: update

  time spent in state depends on msgs received from others (weak coupling between routers' processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other
  x-axis: time until routing update sent, relative to start of round
  By t=100000 all router rounds are of length 120
  synchronization, or lack thereof, depends on system parameters

Avoiding synchronization

  Choose random timer component Tr large (e.g., several multiples of TC)
  Add enough randomization to avoid synchronization

[State diagram]
  prepare own routing update (time TC)
  receive update from neighbor, process (time TC2)
  wait
  receive update from neighbor, process
  <ready> send update (time Td to arrive)
  start_timer (uniform: Tp +/- Tr)
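The randomized timer in the diagram, start_timer (uniform Tp +/- Tr), amounts to drawing each update interval uniformly from [Tp - Tr, Tp + Tr]. A minimal sketch follows; the function name and the numeric values are assumptions chosen for illustration.

```python
import random


def next_update_delay(tp, tr, rng=random):
    """Draw the next routing-update interval uniformly from [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)


rng = random.Random(1)   # seeded generator so the illustration is repeatable
delays = [next_update_delay(120.0, 10.0, rng) for _ in range(5)]
print([round(d, 2) for d in delays])
```

With tr = 0 every router waits exactly tp, which is the setting in which the weak coupling between routers can pull their timers into lockstep. A larger tr spreads the expiry times apart.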

Randomization

  Takeaway message
     randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
  Examples
     Prefetched traffic on the Web
     File system backup
     Large-scale data distribution services
     Background software updates
     Media file sharing

Desired Properties

  Utilization of spare network capacity
  No interference with regular transfers
     Self-interference: applications hurt their own performance
     Cross-interference: applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
     Applications say what they want
     OS manages resources and scheduling
  Self-tuning transport layer
     Reduces risk of interference with foreground traffic
     Significant utilization of spare capacity by background traffic
     Simplifies application design

Why change TCP?

  TCP does network resource management
     Need flow prioritization
  Alternative: router prioritization
     + More responsive, simple one-bit priority
     - Hard to deploy
  Question:
     Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion
  Uses increasing RTT as congestion signal
     Congestion -> increased queue lengths -> increased RTT
  Aggressive responsiveness to congestion
  Only modifies sender-side congestion control
     Receiver and network unchanged
     TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

Nice the works

  Non-interference getting out of the way in time   Utilization maintaining a small queue

pkts

minRTT = τ13 maxRTT = τ+Βmicro13

B

tB Add Mul +

micro

Reno

Nice Add Add Add

Mul +

Mul +

Network Conditions

01

1

10

100

1e3

1 10 100 Fore

grou

nd D

ocum

ent L

aten

cy (s

ec)

Spare Capacity

Reno

Vegas

V0

Nice

Router Prio

  Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note all sources ldquoequalrdquo)   constraint bandwidth used less than capacity   centralized solution to optimization impractical

 must know all utility functions   impractical for large number of sources  can we view congestion control as distributed

asynchronous algorithms to solve this problem

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0 ldquosystemrdquo problem

The user view

  User can choose amount to pay per unit time ws

  Would like allocated bandwidth xs in proportion to ws

euro

max Usw s

ps

⎝ ⎜

⎠ ⎟ minus ws

subject to ws ge 0

  ps could be viewed as charge per unit flow for user s s

ss pwx =

userrsquos utility cost

user problem

The network view

  Suppose network knows vector ws chosen by users   Network wants to maximize logarithmic utility function

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

network problem

Solution existence

  There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that   Ws solves user

problem   Xs solves the

network problem   Xs is the unique

solution to the system problem

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

0 wsubject to

w Umax

s

ss

ge

minus⎟⎟⎠

⎞⎜⎜⎝

⎛s

s

wp

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0

Proportional Fairness

  Vector of rates xs proportionally fair if feasible and for any other feasible vector xs

0

leminus

sumisinSs s

ss

xxx

  Result if wr=1 then Xs solves the network problem IFF it is proportionally fair

  Similar result exists for the case that wr not equal 1

Max-min Fairness

Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

Minimum potential delay fairness

  Rates xr are minimum potential delay fair if Ur (xr) = -wrxr

Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays

Max-min Fairness

rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

What is corresponding utility function

α

α

α minus=

minus

infinrarr 1lim)(

1r

rrxxU

Solving the network problem   Results so far existence - solution exists

with given properties   How to compute solution

 Ideally distributed solution easily embodied in protocol

 Should reveal insight into existing protocol

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

congestion ldquosignalrdquo function of aggregate rate at link l fed back to s

change in bandwidth

allocation at s

linear increase

multiplicative decrease

⎟⎟⎠

⎞⎜⎜⎝

⎛= sum

isin

)()()(txgtp

sLlsllwhere

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

  Results   converges to solution of relaxation of network

problem  xs(t)Σpl(t) converges to ws

  Interpretation TCP-like algorithm to iteratively solves optimal rate allocation

Source Algorithm

  Source needs only its path price

  kr() nonnegative nondecreasing function   Above algorithm converges to unique

solution for any initial condition   qr interpreted as lossmarking probability euro

˙ x r = kr (xr )(Ur (xr ) minus qr)

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation

  Suppose network charges price qr ($bit) where qr=sum pl

  Userrsquos strategy spend wr ($sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

  then

  interpretation minimize (weighted) delay

pTpTp

x 2)1(2asymp

minus=

TxxU 1)( minus=

Is AIMD special

  Consider a window control as follows  cwnd += acwnd^n when no loss  cwnd -= bcwnd^m when loss  where nltm

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (nm)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

Congestion Control maximize user

utility

Traffic Engineering minimize network

congestion Given routing Rli how to adapt end rate xi

Given traffic xi how to perform routing Rli

Congestion Control Model

max sum i Ui(xi) st sumi Rlixi le cl var x

aggregate utility

Source rate xi

Utility Ui(xi)

capacity constraints

Users are indexed by i

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

min suml f(ul) st ul =sumi Rlixicl var R Link Utilization ul

Cost f(ul)

aggregate cost

Links are indexed by l

Traffic engineering avoids bottlenecks in the network

ul = 1

Model of Internet Reality

xi Rli

Congestion Control max sumi Ui(xi) st sumi Rlixi le cl

Traffic Engineering min suml f(ul)

st ul =sumi Rlixicl

System Properties

  Convergence   Does it achieve some objective   Benchmark

  Utility gap between the joint system and benchmark

max sumi Ui(xi) st Rx le c Var x R

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Router update operation

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive at dest)

start_timer (uniform Tp +- Tr)

timeout or link fail

update

time spent in state depends on msgs

received from others (weak coupling

between routers processing)

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization   Choose random

timer component Tr large (eg several multiples of TC)

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough

randomization to avoid

synchronization

Randomization

  Takeaway message  randomization makes a system simple and

robust

Background transport TCP Nice

What are background transfers

  Data that humans are not waiting for   Non-deadline-critical   Unlimited demand

  Examples  Prefetched traffic on the Web  File system backup  Large-scale data distribution services  Background software updates  Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers  Self-interference

bull  applications hurt their own performance  Cross-interference

bull  applications hurt other applicationsrsquo performance

TCP Nice

  Goal abstraction of free infinite bandwidth   Applications say what they want

 OS manages resources and scheduling

  Self tuning transport layer  Reduces risk of interference with foreground

traffic  Significant utilization of spare capacity by

background traffic  Simplifies application design

Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal  Congestion incr queue lengths incr RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control  Receiver and network unchanged  TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

Nice the works

  Non-interference getting out of the way in time   Utilization maintaining a small queue

pkts

minRTT = τ13 maxRTT = τ+Βmicro13

B

tB Add Mul +

micro

Reno

Nice Add Add Add

Mul +

Mul +

Network Conditions

01

1

10

100

1e3

1 10 100 Fore

grou

nd D

ocum

ent L

aten

cy (s

ec)

Spare Capacity

Reno

Vegas

V0

Nice

Router Prio

  Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note all sources ldquoequalrdquo)   constraint bandwidth used less than capacity   centralized solution to optimization impractical

 must know all utility functions   impractical for large number of sources  can we view congestion control as distributed

asynchronous algorithms to solve this problem

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0 ldquosystemrdquo problem

The user view

  User can choose amount to pay per unit time ws

  Would like allocated bandwidth xs in proportion to ws

euro

max Usw s

ps

⎝ ⎜

⎠ ⎟ minus ws

subject to ws ge 0

  ps could be viewed as charge per unit flow for user s s

ss pwx =

userrsquos utility cost

user problem

The network view

  Suppose network knows vector ws chosen by users   Network wants to maximize logarithmic utility function

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

network problem

Solution existence

  There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that   Ws solves user

problem   Xs solves the

network problem   Xs is the unique

solution to the system problem

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

0 wsubject to

w Umax

s

ss

ge

minus⎟⎟⎠

⎞⎜⎜⎝

⎛s

s

wp

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0

Proportional Fairness

  Vector of rates xs proportionally fair if feasible and for any other feasible vector xs

0

leminus

sumisinSs s

ss

xxx

  Result if wr=1 then Xs solves the network problem IFF it is proportionally fair

  Similar result exists for the case that wr not equal 1

Max-min Fairness

Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

Minimum potential delay fairness

  Rates xr are minimum potential delay fair if Ur (xr) = -wrxr

Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays

Max-min Fairness

rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

What is corresponding utility function

α

α

α minus=

minus

infinrarr 1lim)(

1r

rrxxU

Solving the network problem   Results so far existence - solution exists

with given properties   How to compute solution

 Ideally distributed solution easily embodied in protocol

 Should reveal insight into existing protocol

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

congestion ldquosignalrdquo function of aggregate rate at link l fed back to s

change in bandwidth

allocation at s

linear increase

multiplicative decrease

⎟⎟⎠

⎞⎜⎜⎝

⎛= sum

isin

)()()(txgtp

sLlsllwhere

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

  Results   converges to solution of relaxation of network

problem  xs(t)Σpl(t) converges to ws

  Interpretation TCP-like algorithm to iteratively solves optimal rate allocation

Source Algorithm

  Source needs only its path price

  kr() nonnegative nondecreasing function   Above algorithm converges to unique

solution for any initial condition   qr interpreted as lossmarking probability euro

˙ x r = kr (xr )(Ur (xr ) minus qr)

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation

  Suppose network charges price qr ($bit) where qr=sum pl

  Userrsquos strategy spend wr ($sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

  then

  interpretation minimize (weighted) delay

pTpTp

x 2)1(2asymp

minus=

TxxU 1)( minus=

Is AIMD special

  Consider a window control as follows  cwnd += acwnd^n when no loss  cwnd -= bcwnd^m when loss  where nltm

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (nm)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

Congestion Control maximize user

utility

Traffic Engineering minimize network

congestion Given routing Rli how to adapt end rate xi

Given traffic xi how to perform routing Rli

Congestion Control Model

max sum i Ui(xi) st sumi Rlixi le cl var x

aggregate utility

Source rate xi

Utility Ui(xi)

capacity constraints

Users are indexed by i

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

min suml f(ul) st ul =sumi Rlixicl var R Link Utilization ul

Cost f(ul)

aggregate cost

Links are indexed by l

Traffic engineering avoids bottlenecks in the network

ul = 1

Model of Internet Reality

xi Rli

Congestion Control max sumi Ui(xi) st sumi Rlixi le cl

Traffic Engineering min suml f(ul)

st ul =sumi Rlixicl

System Properties

  Convergence   Does it achieve some objective   Benchmark

  Utility gap between the joint system and benchmark

max sumi Ui(xi) st Rx le c Var x R

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Router synchronization

  20 (simulated) routers broadcasting updates to each other

  x-axis time until routing update sent relative to start of round

  By t=100000 all router rounds are of length 120

  synchronization or lack thereof depends on system parameters

Avoiding synchronization   Choose random

timer component Tr large (eg several multiples of TC)

prepare own routing

update (time TC)

receive update from neighbor process (time TC2)

wait

receive update from neighbor process

ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough

randomization to avoid

synchronization

Randomization

  Takeaway message  randomization makes a system simple and

robust

Background transport TCP Nice

What are background transfers

  Data that humans are not waiting for   Non-deadline-critical   Unlimited demand

  Examples  Prefetched traffic on the Web  File system backup  Large-scale data distribution services  Background software updates  Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers  Self-interference

bull  applications hurt their own performance  Cross-interference

bull  applications hurt other applicationsrsquo performance

TCP Nice

  Goal abstraction of free infinite bandwidth   Applications say what they want

 OS manages resources and scheduling

  Self tuning transport layer  Reduces risk of interference with foreground

traffic  Significant utilization of spare capacity by

background traffic  Simplifies application design

Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal  Congestion incr queue lengths incr RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control  Receiver and network unchanged  TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

Nice the works

  Non-interference getting out of the way in time   Utilization maintaining a small queue

pkts

minRTT = τ13 maxRTT = τ+Βmicro13

B

tB Add Mul +

micro

Reno

Nice Add Add Add

Mul +

Mul +

Network Conditions

01

1

10

100

1e3

1 10 100 Fore

grou

nd D

ocum

ent L

aten

cy (s

ec)

Spare Capacity

Reno

Vegas

V0

Nice

Router Prio

  Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note all sources ldquoequalrdquo)   constraint bandwidth used less than capacity   centralized solution to optimization impractical

 must know all utility functions   impractical for large number of sources  can we view congestion control as distributed

asynchronous algorithms to solve this problem

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0 ldquosystemrdquo problem

The user view

  User can choose amount to pay per unit time ws

  Would like allocated bandwidth xs in proportion to ws

euro

max Usw s

ps

⎝ ⎜

⎠ ⎟ minus ws

subject to ws ge 0

  ps could be viewed as charge per unit flow for user s s

ss pwx =

userrsquos utility cost

user problem

The network view

  Suppose network knows vector ws chosen by users   Network wants to maximize logarithmic utility function

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

network problem

Solution existence

  There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that   Ws solves user

problem   Xs solves the

network problem   Xs is the unique

solution to the system problem

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

0 wsubject to

w Umax

s

ss

ge

minus⎟⎟⎠

⎞⎜⎜⎝

⎛s

s

wp

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0

Proportional Fairness

  Vector of rates xs proportionally fair if feasible and for any other feasible vector xs

0

leminus

sumisinSs s

ss

xxx

  Result if wr=1 then Xs solves the network problem IFF it is proportionally fair

  Similar result exists for the case that wr not equal 1

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$
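One standard way to compute a max-min fair allocation is progressive filling: raise all rates together, and freeze a source as soon as one of its links fills up. This technique is not taken from the slides. The sketch below is a minimal version that assumes every source uses at least one link; the function name and the dictionary-based encoding are illustrative.

```python
def max_min_fair(routes, capacity):
    """Progressive filling.

    routes:   dict source -> set of links it uses (each source needs >= 1 link)
    capacity: dict link -> capacity
    Returns dict source -> max-min fair rate.
    """
    rate = {s: 0.0 for s in routes}
    active = set(routes)          # sources whose rate can still grow
    residual = dict(capacity)     # spare capacity per link
    while active:
        # Number of active sources crossing each link.
        load = {l: sum(1 for s in active if l in routes[s]) for l in residual}
        # Largest equal increment before some link saturates.
        step = min(residual[l] / n for l, n in load.items() if n > 0)
        for s in active:
            rate[s] += step
        for l, n in load.items():
            residual[l] -= step * n
        # Freeze every source that crosses a saturated link.
        saturated = {l for l, n in load.items() if n > 0 and residual[l] <= 1e-12}
        active = {s for s in active if not (routes[s] & saturated)}
    return rate


routes = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}}
print(max_min_fair(routes, {"a": 1.0, "b": 1.0}))
print(max_min_fair(routes, {"a": 1.0, "b": 2.0}))
```

With unit capacities every source ends at the same rate. When link b has twice the capacity, sources 1 and 2 stay limited by link a and source 3 picks up what is left on link b.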

Minimum potential delay fairness

  Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$

What is the corresponding utility function?

  $U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
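To see how the parameter alpha moves the allocation toward max-min fairness, the sketch below uses the two-link, three-source network with unit capacities and unit weights. It assumes both links are full at the optimum, so sources 2 and 3 each get 1 - x1. The first-order condition x1^(-alpha) = 2 (1 - x1)^(-alpha) then gives x1 = 1 / (1 + 2^(1/alpha)). The function names are illustrative, and alpha = 1 is treated as log x by the usual convention.

```python
import math


def alpha_fair_utility(x, alpha):
    """U(x) = x^(1 - alpha) / (1 - alpha); alpha = 1 is taken as log x."""
    if alpha == 1:
        return math.log(x)
    return x ** (1 - alpha) / (1 - alpha)


def total_utility(x1, alpha):
    """Sum of utilities when sources 2 and 3 each take the leftover 1 - x1."""
    return alpha_fair_utility(x1, alpha) + 2 * alpha_fair_utility(1 - x1, alpha)


def long_flow_rate(alpha):
    """Rate of the two-link flow that maximizes total alpha-fair utility."""
    return 1.0 / (1.0 + 2.0 ** (1.0 / alpha))


for alpha in (1, 2, 10, 100):
    x1 = long_flow_rate(alpha)
    print(f"alpha={alpha:3d}  x1={x1:.4f}  x2=x3={1 - x1:.4f}")
```

alpha = 1 gives the proportionally fair split (1/3, 2/3, 2/3), alpha = 2 corresponds to U(x) = -1/x (minimum potential delay with unit weights), and as alpha grows x1 approaches the max-min share of 1/2.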

Solving the network problem

  Results so far: existence - solution exists with given properties
  How to compute solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

  $\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$

  where $p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$

  left-hand side: change in bandwidth allocation at $s$
  $k\, w_s$ term: linear increase
  $k\, x_s(t) \sum_{l \in L(s)} p_l(t)$ term: multiplicative decrease
  $p_l(t)$: congestion "signal", a function of aggregate rate at link $l$, fed back to $s$
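The rate update above can be simulated directly. The sketch below applies forward-Euler steps to dx_s/dt = k (w_s - x_s * sum of p_l over L(s)) on the two-link, three-source network with unit capacities and w_s = 1. The price function g_l(y) = (y / c_l)^B, the step size, and the starting rates are assumptions chosen for illustration; the slides do not specify g_l.

```python
def simulate_primal(routes, capacity, w, k=1.0, dt=1e-3, steps=20_000, B=20):
    """Euler integration of dx_s/dt = k * (w_s - x_s * sum_{l in L(s)} p_l)."""
    x = {s: 0.1 for s in routes}   # arbitrary positive starting rates
    for _ in range(steps):
        # Aggregate rate on each link.
        y = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
        # Assumed price g_l(y) = (y / c_l)^B: near zero below capacity, steep above.
        p = {l: (y[l] / capacity[l]) ** B for l in capacity}
        for s in routes:
            q = sum(p[l] for l in routes[s])   # path price fed back to source s
            x[s] = max(x[s] + dt * k * (w[s] - x[s] * q), 1e-9)
    return x


routes = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}}
capacity = {"a": 1.0, "b": 1.0}
w = {1: 1.0, 2: 1.0, 3: 1.0}

x = simulate_primal(routes, capacity, w)
print({s: round(r, 3) for s, r in x.items()})
```

While the links are lightly loaded the prices are close to zero and each rate grows roughly linearly at k * w_s. Once a link nears capacity, the decrease term scales with x_s. With this penalty-style price the rates settle near, not exactly at, the proportionally fair point (1/3, 2/3, 2/3), since the relaxation lets the links run slightly over capacity.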

Solving the network problem

  $\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$

  Results
    converges to solution of "relaxation" of network problem
    $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
  Interpretation: TCP-like algorithm that iteratively solves for the optimal rate allocation

Source Algorithm

  $\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$

  Source needs only its path price
  $k_r(\cdot)$: nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is $U_r(x_r) = w_r \log x_r$, then a controller that implements it is given by

  $\dot{x}_r = k_r(x_r) \left( \frac{w_r}{x_r} - q_r \right)$

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation?
  Suppose network charges price $q_r$ (dollars/bit), where $q_r = \sum_{l \in L(r)} p_l$
  User's strategy: spend $w_r$ (dollars/sec) to maximize $U_r(w_r / q_r) - w_r$

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose $x = \frac{\sqrt{2(1-p)}}{T \sqrt{p}} \approx \frac{\sqrt{2}}{T \sqrt{p}}$
  then $U(x) = -\frac{1}{T x}$
  interpretation: minimize (weighted) delay
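The rate expression above is consistent with a simple balance argument for AIMD, sketched here under assumptions that are not on the slide. Assume the window w grows by a * w^n for each acknowledged packet and shrinks by b * w^m for each lost packet, packets are lost independently with probability p, and the rate is x = w / T. Setting the expected increase (1 - p) * a * w^n equal to the expected decrease p * b * w^m gives w^(m - n) = a (1 - p) / (b p). The function name and its parameters are illustrative.

```python
import math


def equilibrium_window(a, b, n, m, p):
    """Window at which expected increase (1-p)*a*w^n balances expected decrease p*b*w^m."""
    if not n < m:
        raise ValueError("need n < m for a finite balance point")
    return (a * (1 - p) / (b * p)) ** (1.0 / (m - n))


p = 0.01    # assumed loss probability
T = 0.1     # assumed round-trip time in seconds

# Reno-style AIMD: +1/w per ACK (a=1, n=-1), halve on loss (b=1/2, m=1).
w_aimd = equilibrium_window(a=1.0, b=0.5, n=-1, m=1, p=p)
print(f"AIMD window at p={p}: {w_aimd:.2f} packets, rate {w_aimd / T:.1f} packets/sec")
```

With a = 1, b = 1/2, n = -1, m = 1 the balance point is w = sqrt(2 (1 - p) / p), so x = w / T matches the expression on the slide.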

Is AIMD special?

  Consider a window control as follows:
    cwnd += a·cwnd^n when no loss
    cwnd -= b·cwnd^m when loss
    where n < m
  Expected change in congestion window
  Expected change in rate per unit time

MIMD (n, m)

  Consider the controller
  where
  Then at equilibrium
  where α = m - n. For stability

Motivation

  Congestion Control: maximize user utility
    Given routing $R_{li}$, how to adapt end rate $x_i$?
  Traffic Engineering: minimize network congestion
    Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

  $\max_x \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$  (variable: $x$)

  objective: aggregate utility
  constraints: capacity constraints
  Users are indexed by $i$

[Figure: utility $U_i(x_i)$ versus source rate $x_i$]

  Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

  $\min_R \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$  (variable: $R$)

  objective: aggregate cost
  Links are indexed by $l$

[Figure: cost $f(u_l)$ versus link utilization $u_l$, with $u_l = 1$ marked]

  Traffic engineering avoids bottlenecks in the network
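A minimal sketch of the traffic engineering objective, assuming R[l][i] is the fraction of user i's traffic routed over link l and using a cost f(u) = u / (1 - u) that grows without bound as u approaches 1. The slides do not specify f, so this cost, the two-link topology, and the rates are all illustrative assumptions.

```python
def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [sum(r_li * x_i for r_li, x_i in zip(row, x)) / c_l
            for row, c_l in zip(R, c)]


def total_cost(u, f):
    """Aggregate cost sum_l f(u_l)."""
    return sum(f(u_l) for u_l in u)


def f(u):
    """Assumed convex link cost; only meaningful for 0 <= u < 1."""
    return u / (1.0 - u)


x = [0.3, 0.4]            # source rates of two users
c = [1.0, 1.0]            # capacities of two parallel links

R_single = [[1.0, 1.0],   # link 1 carries all traffic of both users
            [0.0, 0.0]]   # link 2 idle
R_split = [[0.5, 1.0],    # half of user 1's traffic moved to link 2
           [0.5, 0.0]]

for name, R in (("single path", R_single), ("split", R_split)):
    u = link_utilization(R, x, c)
    print(f"{name:12s} u={[round(v, 2) for v in u]} cost={total_cost(u, f):.3f}")
```

Moving part of user 1's traffic off the busy link lowers the aggregate cost, which is the sense in which traffic engineering avoids bottlenecks for fixed rates x.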

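Model of Internet Reality

  Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
  Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$

[Figure: congestion control and traffic engineering exchanging the rates $x_i$ and the routing $R_{li}$]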

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark: $\max \sum_i U_i(x_i)$ s.t. $Rx \le c$  (variables: $x$, $R$)
    Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Avoiding synchronization

  Choose random timer component $T_r$ large (e.g. several multiples of $T_C$)

[Figure: state diagram of a routing process]
  prepare own routing update (time $T_C$)
  receive update from neighbor, process (time $T_{C2}$)
  wait; receive update from neighbor, process
  <ready>: send update (time $T_d$ to arrive); start_timer (uniform $T_p \pm T_r$)

  Add enough randomization to avoid synchronization
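The timer rule can be sketched in a few lines. The example below draws each update delay uniformly from [T_p - T_r, T_p + T_r], with T_r set to several multiples of T_C. The numeric values and the function name are placeholders for illustration, not values from the slides.

```python
import random


def next_update_delay(t_p, t_r, rng=random):
    """Delay until the next routing update: uniform on [t_p - t_r, t_p + t_r]."""
    if not 0 <= t_r <= t_p:
        raise ValueError("need 0 <= t_r <= t_p so the delay stays non-negative")
    return rng.uniform(t_p - t_r, t_p + t_r)


T_C = 0.1        # assumed time to prepare or process one update (seconds)
T_P = 30.0       # assumed nominal update period (seconds)
T_R = 5 * T_C    # random component: several multiples of T_C

rng = random.Random(1)
delays = [next_update_delay(T_P, T_R, rng) for _ in range(5)]
print([round(d, 3) for d in delays])
```

Because every router draws its own jitter independently, routers whose timers happen to line up drift apart again instead of staying locked together.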

Randomization

  Takeaway message: randomization makes a system simple and robust

Background transport: TCP Nice

What are background transfers?

  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand

  Examples
    Prefetched traffic on the Web
    File system backup
    Large-scale data distribution services
    Background software updates
    Media file sharing

Desired Properties

  Utilization of spare network capacity
  No interference with regular transfers
    Self-interference: applications hurt their own performance
    Cross-interference: applications hurt other applications' performance

TCP Nice

  Goal: abstraction of free infinite bandwidth
    Applications say what they want
    OS manages resources and scheduling
  Self-tuning transport layer
    Reduces risk of interference with foreground traffic
    Significant utilization of spare capacity by background traffic
    Simplifies application design

Why change TCP?

  TCP does network resource management
    Need flow prioritization
  Alternative: router prioritization
    + More responsive, simple one-bit priority
    - Hard to deploy
  Question: Can end-to-end congestion control achieve non-interference and utilization?

TCP Nice

  Proactively detects congestion
  Uses increasing RTT as congestion signal
    Congestion => increased queue lengths => increased RTT
  Aggressive responsiveness to congestion
  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice

  Basic algorithm
    1. Early detection: threshold queue length => increase in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)

  per-ack operation:
    if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++

  per-round operation:
    if (numCong > f * W) W = W / 2
    else ... AIMD congestion control
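The per-ack and per-round operations above can be sketched as a small Python class. The default values for threshold and f, the way minRTT and maxRTT are tracked, and the one-packet additive increase standing in for the AIMD branch are all assumptions made for illustration. Loss handling and sending behaviour for windows below one packet are omitted.

```python
class NiceWindow:
    """Sender-side window logic following the slide; parameter values are assumed."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=10.0):
        self.threshold = threshold   # how far into [minRTT, maxRTT] counts as congestion
        self.fraction = fraction     # f: share of a round's ACKs that must look congested
        self.cwnd = cwnd             # W, in packets; allowed to fall below 1.0
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack operation: track RTT extremes and count early-congestion signals."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round operation: halve on early congestion, else additive increase."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0
        else:
            self.cwnd += 1.0         # stand-in for the AIMD branch
        self.num_cong = 0


nice = NiceWindow()
for rtt in [0.1] + [0.3] * 9:        # one uncongested sample, then a queue builds up
    nice.on_ack(rtt)
nice.on_round_end()
print("window after congested round:", nice.cwnd)
```

Because the window is halved whenever more than a fraction f of a round's ACKs show inflated RTT, a background flow backs off before the queue overflows and causes loss for foreground traffic.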

Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

  minRTT = $\tau$, maxRTT = $\tau + B/\mu$

[Figure: queue length (pkts) over time for Reno and Nice, with labels B, t·B, and μ, and additive (Add) and multiplicative (Mul) phases marked]

Network Conditions

[Figure: foreground document latency (sec) versus spare capacity, logarithmic axes, for Reno, Vegas, V0, Nice, and Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec) versus number of background (BG) flows, logarithmic axes, for Vegas, V0, Nice, Router Prio, and Reno]

  W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: background (BG) throughput (KB) versus number of BG flows, for Router Prio, Vegas, V0, Reno, and Nice]

  Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note all sources ldquoequalrdquo)   constraint bandwidth used less than capacity   centralized solution to optimization impractical

 must know all utility functions   impractical for large number of sources  can we view congestion control as distributed

asynchronous algorithms to solve this problem

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0 ldquosystemrdquo problem

The user view

  User can choose amount to pay per unit time ws

  Would like allocated bandwidth xs in proportion to ws

euro

max Usw s

ps

⎝ ⎜

⎠ ⎟ minus ws

subject to ws ge 0

  ps could be viewed as charge per unit flow for user s s

ss pwx =

userrsquos utility cost

user problem

The network view

  Suppose network knows vector ws chosen by users   Network wants to maximize logarithmic utility function

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

network problem

Solution existence

  There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that   Ws solves user

problem   Xs solves the

network problem   Xs is the unique

solution to the system problem

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

0 wsubject to

w Umax

s

ss

ge

minus⎟⎟⎠

⎞⎜⎜⎝

⎛s

s

wp

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0

Proportional Fairness

  Vector of rates xs proportionally fair if feasible and for any other feasible vector xs

0

leminus

sumisinSs s

ss

xxx

  Result if wr=1 then Xs solves the network problem IFF it is proportionally fair

  Similar result exists for the case that wr not equal 1

Max-min Fairness

Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

Minimum potential delay fairness

  Rates xr are minimum potential delay fair if Ur (xr) = -wrxr

Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays

Max-min Fairness

rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

What is corresponding utility function

α

α

α minus=

minus

infinrarr 1lim)(

1r

rrxxU

Solving the network problem   Results so far existence - solution exists

with given properties   How to compute solution

 Ideally distributed solution easily embodied in protocol

 Should reveal insight into existing protocol

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

congestion ldquosignalrdquo function of aggregate rate at link l fed back to s

change in bandwidth

allocation at s

linear increase

multiplicative decrease

⎟⎟⎠

⎞⎜⎜⎝

⎛= sum

isin

)()()(txgtp

sLlsllwhere

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

  Results   converges to solution of relaxation of network

problem  xs(t)Σpl(t) converges to ws

  Interpretation TCP-like algorithm to iteratively solves optimal rate allocation

Source Algorithm

  Source needs only its path price

  kr() nonnegative nondecreasing function   Above algorithm converges to unique

solution for any initial condition   qr interpreted as lossmarking probability euro

˙ x r = kr (xr )(Ur (xr ) minus qr)

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation

  Suppose network charges price qr ($bit) where qr=sum pl

  Userrsquos strategy spend wr ($sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

  then

  interpretation minimize (weighted) delay

pTpTp

x 2)1(2asymp

minus=

TxxU 1)( minus=

Is AIMD special

  Consider a window control as follows  cwnd += acwnd^n when no loss  cwnd -= bcwnd^m when loss  where nltm

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (nm)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

Congestion Control maximize user

utility

Traffic Engineering minimize network

congestion Given routing Rli how to adapt end rate xi

Given traffic xi how to perform routing Rli

Congestion Control Model

max sum i Ui(xi) st sumi Rlixi le cl var x

aggregate utility

Source rate xi

Utility Ui(xi)

capacity constraints

Users are indexed by i

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

min suml f(ul) st ul =sumi Rlixicl var R Link Utilization ul

Cost f(ul)

aggregate cost

Links are indexed by l

Traffic engineering avoids bottlenecks in the network

ul = 1

Model of Internet Reality

xi Rli

Congestion Control max sumi Ui(xi) st sumi Rlixi le cl

Traffic Engineering min suml f(ul)

st ul =sumi Rlixicl

System Properties

  Convergence   Does it achieve some objective   Benchmark

  Utility gap between the joint system and benchmark

max sumi Ui(xi) st Rx le c Var x R

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

What are background transfers

  Data that humans are not waiting for   Non-deadline-critical   Unlimited demand

  Examples  Prefetched traffic on the Web  File system backup  Large-scale data distribution services  Background software updates  Media file sharing

Desired Properties

  Utilization of spare network capacity

  No interference with regular transfers  Self-interference

bull  applications hurt their own performance  Cross-interference

bull  applications hurt other applicationsrsquo performance

TCP Nice

  Goal abstraction of free infinite bandwidth   Applications say what they want

 OS manages resources and scheduling

  Self tuning transport layer  Reduces risk of interference with foreground

traffic  Significant utilization of spare capacity by

background traffic  Simplifies application design

Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal
    Congestion => increased queue lengths => increased RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice

  Basic algorithm
    1. Early detection: threshold queue length, increase in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)

  per-ack operation
    if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++

  per-round operation
    if (numCong > f * W) W <- W / 2
    else ... AIMD congestion control
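The per-ack and per-round operations above can be sketched in Python as follows. This is only an illustrative sketch, not the actual TCP Nice implementation: the class and function names, the default `threshold` and `fraction` values, and the plain "+1 per round" stand-in for the regular AIMD increase are all assumptions, and loss handling is omitted.

```python
from dataclasses import dataclass


@dataclass
class NiceState:
    """Sender-side state for the early-congestion check (illustrative names)."""
    cwnd: float = 2.0              # congestion window W, in packets
    min_rtt: float = float("inf")  # smallest RTT sample seen so far
    max_rtt: float = 0.0           # largest RTT sample seen so far
    num_cong: int = 0              # ACKs in this round that signalled early congestion


def on_ack(state: NiceState, cur_rtt: float, threshold: float = 0.2) -> None:
    """Per-ack operation: count RTT samples above the delay threshold."""
    state.min_rtt = min(state.min_rtt, cur_rtt)
    state.max_rtt = max(state.max_rtt, cur_rtt)
    limit = state.min_rtt + threshold * (state.max_rtt - state.min_rtt)
    if cur_rtt > limit:
        state.num_cong += 1


def on_round_end(state: NiceState, fraction: float = 0.5) -> None:
    """Per-round operation: halve W on early congestion, else increase."""
    if state.num_cong > fraction * state.cwnd:
        state.cwnd = state.cwnd / 2.0  # W may drop below 1.0 by design
    else:
        state.cwnd += 1.0              # assumed stand-in for the usual AIMD increase
    state.num_cong = 0                 # start counting afresh for the next round


# One congested round followed by one quiet round.
state = NiceState(cwnd=1.0)
on_ack(state, 0.1)   # establishes minRTT
on_ack(state, 0.3)   # well above minRTT: counted as early congestion
on_round_end(state)
cwnd_after_congestion = state.cwnd
on_ack(state, 0.1)   # back at minRTT: not counted
on_round_end(state)
cwnd_after_quiet = state.cwnd
```

Allowing the window to fall below one packet is what lets many background flows share a small amount of spare capacity; a real sender would then have to pace packets over several RTTs, which this sketch does not model.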

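Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

[Figure: queue length (pkts) vs. time t for Reno and Nice at a bottleneck with buffer size B and service rate $\mu$, with additive (Add) and multiplicative (Mul) phases marked]

minRTT = $\tau$, maxRTT = $\tau + B/\mu$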

Network Conditions

[Figure: foreground document latency (sec) vs. spare capacity for Reno, Vegas, V0, Nice, and Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec) vs. number of background (BG) flows for Vegas, V0, Nice, Router Prio, and Reno]

  W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: background (BG) throughput (KB) vs. number of BG flows for Router Prio, Vegas, V0, Reno, and Nice]

  Nice utilizes 50-80% of spare capacity without stealing any bandwidth from foreground (FG) traffic

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?

  How to model the interaction between TCP and the network?
    Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function
  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
  systematic approach towards protocol design

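Model
  Network: links $l$, each of capacity $c_l$
  Sources $s$: $(L(s), U_s(x_s))$
    $L(s)$ - links used by source $s$
    $U_s(x_s)$ - utility if source rate = $x_s$

[Figure: two links with capacities $c_1$ and $c_2$ carrying source rates $x_1$, $x_2$, $x_3$]

$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$

[Figure: example utility function for elastic application, $U_s(x_s)$ vs. $x_s$]

Q: What are possible allocations with, say, unit-capacity links?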

Optimization Problem

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
  can we view congestion control as distributed asynchronous algorithms to solve this problem?

"system" problem:

$\max_{x_s \ge 0} \sum_s U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \forall l \in L$

The user view

  User can choose amount to pay per unit time, $w_s$

  Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$

  $p_s$ could be viewed as charge per unit flow for user $s$

user problem:

$\max_{w_s \ge 0} U_s(w_s / p_s) - w_s$

where $U_s(w_s / p_s)$ is the user's utility and $w_s$ is the cost

The network view

  Suppose network knows vector $\{w_s\}$ chosen by users
  Network wants to maximize logarithmic utility function

network problem:

$\max_{x_s \ge 0} \sum_s w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$

Solution existence

  There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
    $w_s$ solves the user problem: $\max_{w_s \ge 0} U_s(w_s / p_s) - w_s$
    $x_s$ solves the network problem: $\max_{x_s \ge 0} \sum_s w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
    $x_s$ is the unique solution to the system problem: $\max_{x_s \ge 0} \sum_s U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \forall l \in L$

Proportional Fairness

  Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,

  $\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$

  Result: if $w_s = 1$, then $\{x_s\}$ solves the network problem if and only if it is proportionally fair

  A similar result exists for the case $w_s \ne 1$
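As a rough numerical illustration of the definition, the sketch below checks the proportional-fairness inequality on the two-link example with unit capacities ($x_1 + x_2 \le 1$, $x_1 + x_3 \le 1$). The candidate allocation (1/3, 2/3, 2/3) comes from maximizing $\log x_1 + 2 \log(1 - x_1)$ by hand; the function names and the random sampling of alternative allocations are assumptions made for illustration.

```python
import math
import random

# Assumed example: source 1 crosses both unit-capacity links,
# source 2 uses only link 0, source 3 uses only link 1.
ROUTES = [(0, 1), (0,), (1,)]
CAPACITY = [1.0, 1.0]


def is_feasible(x, tol=1e-12):
    """True if all rates are non-negative and no link exceeds its capacity."""
    load = [0.0] * len(CAPACITY)
    for rate, links in zip(x, ROUTES):
        if rate < 0:
            return False
        for l in links:
            load[l] += rate
    return all(load[l] <= CAPACITY[l] + tol for l in range(len(CAPACITY)))


def proportional_change(x, y):
    """Sum over sources of (y_s - x_s) / x_s, the quantity in the definition."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))


def log_utility(x):
    """Objective of the network problem with all weights equal to 1."""
    return sum(math.log(v) for v in x)


x_pf = [1 / 3, 2 / 3, 2 / 3]  # candidate proportionally fair allocation

rng = random.Random(0)
worst = -math.inf
for _ in range(10_000):
    y1 = rng.uniform(0.0, 1.0)
    y = [y1, rng.uniform(0.0, 1.0 - y1), rng.uniform(0.0, 1.0 - y1)]
    assert is_feasible(y)
    # Track the largest aggregate proportional change over sampled alternatives.
    worst = max(worst, proportional_change(x_pf, y))
```

If the candidate is proportionally fair, `worst` never rises above zero (up to rounding), and its log-utility exceeds that of the equal split (1/2, 1/2, 1/2).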

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$

Minimum potential delay fairness

  Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is the file size, then $w_r / x_r$ is the transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$

What is the corresponding utility function?

$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
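The fairness notions above can be compared numerically through the family $x^{1-\alpha}/(1-\alpha)$, with $\log x$ taken at $\alpha = 1$. The sketch below is a hedged illustration on the two-link unit-capacity example, where the single-link flows each get $1 - x_1$; the function names and the ternary search bounds are assumptions, and a finite $\alpha = 20$ stands in for the limit.

```python
import math


def alpha_fair_utility(x: float, alpha: float) -> float:
    """U(x) = log x for alpha = 1, else x**(1 - alpha) / (1 - alpha)."""
    if alpha == 1.0:
        return math.log(x)
    return x ** (1.0 - alpha) / (1.0 - alpha)


def long_flow_rate(alpha: float, lo: float = 0.01, hi: float = 0.99) -> float:
    """Rate x1 of the two-link flow that maximizes U(x1) + 2 * U(1 - x1).

    Ternary search is valid because the objective is concave in x1.
    """
    def total(x1: float) -> float:
        return alpha_fair_utility(x1, alpha) + 2.0 * alpha_fair_utility(1.0 - x1, alpha)

    for _ in range(200):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if total(m1) < total(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


x1_proportional = long_flow_rate(1.0)    # proportional fairness
x1_min_delay = long_flow_rate(2.0)       # minimum potential delay fairness
x1_near_max_min = long_flow_rate(20.0)   # large alpha, approaching max-min
```

Setting the derivative to zero gives $x_1 = 1/(1 + 2^{1/\alpha})$ for this example, so the long flow's share rises from 1/3 toward the max-min share of 1/2 as $\alpha$ grows.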

Solving the network problem

  Results so far: existence - solution exists with given properties
  How to compute solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

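$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$

  $\frac{d}{dt} x_s(t)$: change in bandwidth allocation at $s$
  $k w_s$: linear increase
  $k x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease
  $p_l(t)$: congestion "signal", a function of the aggregate rate at link $l$, fed back to $s$

where $p_l(t) = g_l \left( \sum_{s : l \in L(s)} x_s(t) \right)$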

Solving the network problem

$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$

  Results
    converges to solution of relaxation of network problem
    $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$

  Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
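The rate-update equation can be explored with a small forward-Euler simulation. This is a sketch under stated assumptions rather than the lecture's own code: the price function $g_l(y) = (y / c_l)^4$, the gain `k`, the step `dt`, the starting rates, and the two-link topology are all chosen here for illustration.

```python
# Assumed topology: two unit-capacity links; source 0 crosses both,
# source 1 uses only link 0, source 2 uses only link 1.
ROUTES = [(0, 1), (0,), (1,)]
CAPACITY = [1.0, 1.0]
WEIGHTS = [1.0, 1.0, 1.0]  # w_s, the amount each source pays per unit time


def link_price(load: float, capacity: float, steepness: float = 4.0) -> float:
    """Assumed price function g_l: rises steeply as load nears capacity."""
    return (load / capacity) ** steepness


def path_prices(x):
    """Return, for every source s, the sum of p_l over the links in L(s)."""
    load = [0.0] * len(CAPACITY)
    for rate, links in zip(x, ROUTES):
        for l in links:
            load[l] += rate
    p = [link_price(load[l], CAPACITY[l]) for l in range(len(CAPACITY))]
    return [sum(p[l] for l in links) for links in ROUTES]


def simulate(steps: int = 20_000, k: float = 0.1, dt: float = 0.1):
    """Forward-Euler integration of dx_s/dt = k * (w_s - x_s * sum_l p_l)."""
    x = [0.1, 0.1, 0.1]
    for _ in range(steps):
        q = path_prices(x)
        x = [xs + dt * k * (w - xs * qs) for xs, w, qs in zip(x, WEIGHTS, q)]
    return x


rates = simulate()
prices = path_prices(rates)
```

At the fixed point each source satisfies $x_s \sum_l p_l = w_s$, so the two-link flow ends up with half the rate of each one-link flow. Because the capacity constraint is only enforced softly through the price function, the link load can settle slightly above capacity, which is the sense in which this solves a relaxation of the network problem.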

Source Algorithm

  Source needs only its path price:

  $\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$

  $k_r(\cdot)$: nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation?

  Suppose network charges price $q_r$ ($/bit) where $q_r = \sum_l p_l$

  User's strategy: spend $w_r$ ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

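  suppose

    $x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$

  then, since the approximation gives $p \approx 2/(T^2 x^2)$ and the source algorithm settles where $U'(x) = p$,

    $U(x) = -\frac{2}{T^2 x}$

  interpretation: minimize (weighted) delay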
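A quick numerical check of this reasoning, as a sketch: if the small-$p$ approximation $x \approx \sqrt{2}/(T\sqrt{p})$ is inverted to $p \approx 2/(T^2 x^2)$ and the equilibrium condition $U'(x) = p$ is assumed, then integrating gives a utility of the form $-2/(T^2 x)$. The function names and the sample values of $p$ and $T$ below are illustrative assumptions.

```python
import math


def reno_rate(p: float, rtt: float) -> float:
    """Steady-state rate x = sqrt(2 * (1 - p)) / (T * sqrt(p))."""
    return math.sqrt(2.0 * (1.0 - p)) / (rtt * math.sqrt(p))


def reno_rate_approx(p: float, rtt: float) -> float:
    """Small-p approximation x ~ sqrt(2) / (T * sqrt(p))."""
    return math.sqrt(2.0) / (rtt * math.sqrt(p))


def implied_utility(x: float, rtt: float) -> float:
    """U(x) = -2 / (T^2 * x), from integrating U'(x) = 2 / (T^2 * x^2)."""
    return -2.0 / (rtt ** 2 * x)


def marginal_utility(x: float, rtt: float, h: float = 1e-4) -> float:
    """Central-difference estimate of U'(x)."""
    return (implied_utility(x + h, rtt) - implied_utility(x - h, rtt)) / (2.0 * h)


p, rtt = 0.01, 0.1                       # assumed loss probability and RTT (seconds)
x_exact = reno_rate(p, rtt)
x_approx = reno_rate_approx(p, rtt)
recovered_p = marginal_utility(x_approx, rtt)  # should reproduce p
```

The recovered marginal utility matches the loss probability, and the exact and approximate rates differ by well under one percent at this loss rate.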

Is AIMD special?

  Consider a window control as follows
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m

  Expected change in congestion window

  Expected change in rate per unit time
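One way to experiment with this family of controllers is to look at where the expected window change is zero. The sketch below assumes the update is applied once per ACK and that each packet is lost independently with probability `p`; neither assumption is stated on the slide, and the function names are made up for illustration.

```python
def equilibrium_window(a: float, b: float, n: float, m: float, p: float) -> float:
    """Window W at which (1 - p) * a * W**n equals p * b * W**m."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))


def expected_drift(w: float, a: float, b: float, n: float, m: float, p: float) -> float:
    """Expected per-ACK change in cwnd under independent loss probability p."""
    return (1.0 - p) * a * w ** n - p * b * w ** m


def iterate_mean_window(a, b, n, m, p, w0=1.0, steps=5000):
    """Follow the mean dynamics W <- W + E[change], starting from w0."""
    w = w0
    for _ in range(steps):
        w += expected_drift(w, a, b, n, m, p)
    return w


# Reno-like per-ACK AIMD: increase by 1/cwnd per ACK, halve on loss.
aimd = dict(a=1.0, b=0.5, n=-1.0, m=1.0, p=0.01)
w_star = equilibrium_window(**aimd)
w_iterated = iterate_mean_window(**aimd)
```

With `n = -1`, `m = 1`, `a = 1`, and `b = 1/2`, the balance point satisfies $W^2 = 2(1-p)/p$, which is the same square-root dependence on $p$ as the simplified TCP-Reno rate when $x = W/T$.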

MIMD (n, m)

  Consider the controller

  where

  Then at equilibrium

  where α = m - n

  For stability

Motivation

  Congestion Control: maximize user utility
    Given routing $R_{li}$, how to adapt end rate $x_i$?

  Traffic Engineering: minimize network congestion
    Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

$\max_x \sum_i U_i(x_i)$ (aggregate utility) subject to $\sum_i R_{li} x_i \le c_l$ (capacity constraints), variable $x$

  Source rate: $x_i$
  Utility: $U_i(x_i)$
  Users are indexed by $i$

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

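$\min_R \sum_l f(u_l)$ (aggregate cost) subject to $u_l = \sum_i R_{li} x_i / c_l$, variable $R$

  Link utilization: $u_l$
  Cost: $f(u_l)$
  Links are indexed by $l$

[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]

Traffic engineering avoids bottlenecks in the network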
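To make the traffic engineering objective concrete, the sketch below evaluates link utilizations and an aggregate cost for two candidate routings. The cost $f(u) = u/(1-u)$ is an assumption chosen only because it is convex and grows without bound as $u$ approaches 1; the slides do not specify $f$. Treating $R_{li}$ as the fraction of source $i$'s traffic carried on link $l$ is likewise an assumption.

```python
def link_utilization(routing, rates, capacity):
    """u_l = sum_i R_li * x_i / c_l for every link l (one routing row per link)."""
    return [
        sum(r_li * x_i for r_li, x_i in zip(row, rates)) / c_l
        for row, c_l in zip(routing, capacity)
    ]


def congestion_cost(u: float) -> float:
    """Assumed convex cost that blows up as utilization approaches 1."""
    if u >= 1.0:
        return float("inf")
    return u / (1.0 - u)


def total_cost(routing, rates, capacity):
    """Aggregate cost: sum over links of f(u_l)."""
    return sum(congestion_cost(u) for u in link_utilization(routing, rates, capacity))


# One source sending at rate 1.0 over two parallel links of capacity 2.0 each.
rates = [1.0]
capacity = [2.0, 2.0]
single_path = [[1.0], [0.0]]   # all traffic on link 0
even_split = [[0.5], [0.5]]    # traffic split evenly across both links

cost_single = total_cost(single_path, rates, capacity)
cost_split = total_cost(even_split, rates, capacity)
```

Under this assumed cost, spreading the traffic lowers the aggregate cost, which is the sense in which traffic engineering steers load away from bottleneck links.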

Model of Internet Reality

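[Figure: feedback loop in which congestion control sets $x_i$ and traffic engineering sets $R_{li}$]

Congestion Control: $\max_x \sum_i U_i(x_i)$ subject to $\sum_i R_{li} x_i \le c_l$

Traffic Engineering: $\min_R \sum_l f(u_l)$ subject to $u_l = \sum_i R_{li} x_i / c_l$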

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark
    Utility gap between the joint system and benchmark
    $\max_{x, R} \sum_i U_i(x_i)$ subject to $Rx \le c$

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller



Why change TCP

  TCP does network resource management  Need flow prioritization

  Alternative router prioritization + More responsive simple one bit priority   Hard to deploy

  Question  Can end-to-end congestion control achieve non-

interference and utilization

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal  Congestion incr queue lengths incr RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control  Receiver and network unchanged  TCP friendly

TCP Nice

  Basic algorithm   1 Early Detection thresh queue length incr in RTT   2 Multiplicative decrease on early congestion   3 Allow cwnd lt 10 (despite no loss)

  per-ack operation   if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++

  per-round operation   if(numCong gt fW) W W2 else hellip AIMD congestion control

Nice the works

  Non-interference getting out of the way in time   Utilization maintaining a small queue

pkts

minRTT = τ13 maxRTT = τ+Βmicro13

B

tB Add Mul +

micro

Reno

Nice Add Add Add

Mul +

Mul +

Network Conditions

01

1

10

100

1e3

1 10 100 Fore

grou

nd D

ocum

ent L

aten

cy (s

ec)

Spare Capacity

Reno

Vegas

V0

Nice

Router Prio

  Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note all sources ldquoequalrdquo)   constraint bandwidth used less than capacity   centralized solution to optimization impractical

 must know all utility functions   impractical for large number of sources  can we view congestion control as distributed

asynchronous algorithms to solve this problem

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0 ldquosystemrdquo problem

The user view

  User can choose amount to pay per unit time ws

  Would like allocated bandwidth xs in proportion to ws

euro

max Usw s

ps

⎝ ⎜

⎠ ⎟ minus ws

subject to ws ge 0

  ps could be viewed as charge per unit flow for user s s

ss pwx =

userrsquos utility cost

user problem

The network view

  Suppose network knows vector ws chosen by users   Network wants to maximize logarithmic utility function

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

network problem

Solution existence

  There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that   Ws solves user

problem   Xs solves the

network problem   Xs is the unique

solution to the system problem

sum

sum

isin

ge

leS(l)s

ls

sss

x

cx

x w s

subject to

log max0

0 wsubject to

w Umax

s

ss

ge

minus⎟⎟⎠

⎞⎜⎜⎝

⎛s

s

wp

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0

Proportional Fairness

  Vector of rates xs proportionally fair if feasible and for any other feasible vector xs

0

leminus

sumisinSs s

ss

xxx

  Result if wr=1 then Xs solves the network problem IFF it is proportionally fair

  Similar result exists for the case that wr not equal 1

Max-min Fairness

Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

Minimum potential delay fairness

  Rates xr are minimum potential delay fair if Ur (xr) = -wrxr

Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays

Max-min Fairness

rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp

What is corresponding utility function

α

α

α minus=

minus

infinrarr 1lim)(

1r

rrxxU

Solving the network problem   Results so far existence - solution exists

with given properties   How to compute solution

 Ideally distributed solution easily embodied in protocol

 Should reveal insight into existing protocol

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

congestion ldquosignalrdquo function of aggregate rate at link l fed back to s

change in bandwidth

allocation at s

linear increase

multiplicative decrease

⎟⎟⎠

⎞⎜⎜⎝

⎛= sum

isin

)()()(txgtp

sLlsllwhere

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

  Results   converges to solution of relaxation of network

problem  xs(t)Σpl(t) converges to ws

  Interpretation TCP-like algorithm to iteratively solves optimal rate allocation

Source Algorithm

  Source needs only its path price

  kr() nonnegative nondecreasing function   Above algorithm converges to unique

solution for any initial condition   qr interpreted as lossmarking probability euro

˙ x r = kr (xr )(Ur (xr ) minus qr)

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation

  Suppose network charges price qr ($bit) where qr=sum pl

  Userrsquos strategy spend wr ($sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

  then

  interpretation minimize (weighted) delay

pTpTp

x 2)1(2asymp

minus=

TxxU 1)( minus=

Is AIMD special

  Consider a window control as follows  cwnd += acwnd^n when no loss  cwnd -= bcwnd^m when loss  where nltm

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (nm)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

Congestion Control maximize user

utility

Traffic Engineering minimize network

congestion Given routing Rli how to adapt end rate xi

Given traffic xi how to perform routing Rli

Congestion Control Model

$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$

  Source rate: x_i
  Utility: U_i(x_i)
  Objective: aggregate utility
  Constraints: capacity constraints

Users are indexed by i

Congestion control provides fair rate allocation amongst users

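Traffic Engineering Model

$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \frac{\sum_i R_{li} x_i}{c_l}$$

  Link utilization: u_l
  Cost: f(u_l)
  Objective: aggregate cost

Links are indexed by l

Traffic engineering avoids bottlenecks in the network

[Figure: cost f(u_l) versus link utilization u_l, with u_l = 1 marked]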
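As an illustration of the quantities in this model, the sketch below computes the link utilizations u_l = Σ_i R_li x_i / c_l for a fixed routing matrix and sums a cost over links. The slides do not give f; the convex cost f(u) = u / (1 - u) used here is purely a hypothetical stand-in, as are the function names and the numbers.

```python
def link_utilization(R, x, c):
    """u_l = (sum_i R[l][i] * x[i]) / c[l] for every link l."""
    return [sum(R[l][i] * x[i] for i in range(len(x))) / c[l]
            for l in range(len(c))]


def aggregate_cost(u, f):
    """Sum of per-link costs f(u_l)."""
    return sum(f(ul) for ul in u)


def example_cost(u):
    # Hypothetical convex cost that grows without bound as u approaches 1.
    return u / (1.0 - u)


# Two links, three users: user 0 crosses both links, users 1 and 2 one each.
R = [[1, 1, 0],
     [1, 0, 1]]
u = link_utilization(R, x=[0.2, 0.3, 0.6], c=[1.0, 1.0])
cost = aggregate_cost(u, example_cost)
```

Traffic engineering would search over R for the routing that minimizes this cost; the sketch only evaluates the objective for one given R.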

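Model of Internet Reality

Congestion Control:

$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$

Traffic Engineering:

$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \frac{\sum_i R_{li} x_i}{c_l}$$

[Diagram: congestion control and traffic engineering exchange x_i and R_li]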

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark:

$$\max_{x, R} \sum_i U_i(x_i) \quad \text{s.t.} \quad Rx \le c$$

  Utility gap between the joint system and benchmark

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

TCP Nice

  Proactively detects congestion

  Uses increasing RTT as congestion signal
    Congestion ⇒ increased queue lengths ⇒ increased RTT

  Aggressive responsiveness to congestion

  Only modifies sender-side congestion control
    Receiver and network unchanged
    TCP friendly

TCP Nice

  Basic algorithm
    1. Early detection: threshold queue length ⇒ increase in RTT
    2. Multiplicative decrease on early congestion
    3. Allow cwnd < 1.0 (despite no loss)

  per-ack operation:
    if (curRTT > minRTT + threshold*(maxRTT - minRTT)) numCong++

  per-round operation:
    if (numCong > f*W) W ← W/2
    else … AIMD congestion control
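The per-ack and per-round operations can be sketched as a small state machine. This is an illustrative sketch only: the class name, the running min/max RTT tracking, the default values threshold = 0.2 and f = 0.5, and the +1 additive increase standing in for the elided "AIMD congestion control" branch are all assumptions, not details taken from the slide.

```python
class NiceSender:
    """Sketch of Nice's early-congestion detection and window update."""

    def __init__(self, cwnd=10.0, threshold=0.2, fraction=0.5):
        self.cwnd = cwnd            # window W, allowed to fall below 1.0
        self.threshold = threshold  # assumed value for "threshold"
        self.fraction = fraction    # assumed value for "f"
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        # Assumption: minRTT and maxRTT are running extremes of observed RTTs.
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0        # multiplicative decrease on early congestion
        else:
            self.cwnd += 1.0        # stand-in for the ordinary AIMD branch
        self.num_cong = 0


sender = NiceSender(cwnd=8.0)
sender.on_ack(0.1)                  # establishes minRTT
for _ in range(6):
    sender.on_ack(0.5)              # delayed ACKs count as early congestion
sender.on_round_end()
```

Because the window is a float with no lower bound of one packet, repeated halving can take it below 1.0, which is the third point of the basic algorithm.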

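Nice: the works

  Non-interference: getting out of the way in time
  Utilization: maintaining a small queue

[Figure: queue length (pkts) versus time t at a bottleneck with buffer size B and service rate µ, where minRTT = τ and maxRTT = τ + B/µ; the Reno and Nice traces are annotated with additive (Add) and multiplicative (Mul) phases]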

Network Conditions

[Figure: foreground document latency (sec, log scale from 0.1 to 1e3) versus spare capacity (1 to 100) for Reno, Vegas, V0, Nice, and Router Prio]

  Nice causes low interference to foreground Web traffic even when there isn't much spare capacity

Scalability

[Figure: document latency (sec, log scale from 0.1 to 1e3) versus number of background flows (1 to 100) for Vegas, V0, Nice, Router Prio, and Reno]

  W < 1 allows Nice to scale to any number of background flows

Utilization

[Figure: background throughput (KB, 0 to 8e4) versus number of background flows (1 to 100) for Router Prio, Vegas, V0, Reno, and Nice]

  Nice utilizes 50-80% of spare capacity without stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing?

How does TCP allocate network resources?

  Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?

  How to model the interaction between TCP and the network?
    Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function?
  Maybe not possible to obtain exact optimality, but:
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
    systematic approach towards protocol design

Model
  Network: links l, each of capacity c_l
  Sources s: (L(s), U_s(x_s))
    L(s): links used by source s
    U_s(x_s): utility if source rate = x_s

[Figure: three sources with rates x1, x2, x3 sharing two links of capacity c1 and c2]

$$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$$

[Figure: example utility function for elastic application, U_s(x_s) versus x_s]

Q: What are possible allocations with, say, unit capacity links?
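One way to explore the question is to enumerate allocations that saturate both unit-capacity links, so that x2 = x3 = 1 - x1, and compare a few candidates. The sketch below assumes that parametrization and uses a simple grid search for the allocation maximizing the sum of log rates; the function names and the grid resolution are illustrative choices.

```python
import math


def saturating(x1):
    """Allocation in which sources 2 and 3 take whatever source 1 leaves."""
    return (x1, 1.0 - x1, 1.0 - x1)


def log_utility(x):
    return sum(math.log(v) for v in x)


# Grid search over x1 in (0, 1) for the largest sum of log rates.
grid = [i / 1000 for i in range(1, 1000)]
best_x1 = max(grid, key=lambda x1: log_utility(saturating(x1)))

prop_fair = saturating(best_x1)   # close to (1/3, 2/3, 2/3)
equal_share = saturating(0.5)     # every source gets 1/2
max_throughput = saturating(0.0)  # source 1 starved, total rate 2
```

The three candidates trade total throughput (2 - x1) against the rate given to the source that crosses both links.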

Optimization Problem

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
  can we view congestion control as distributed asynchronous algorithms to solve this problem?

"system" problem:

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$

The user view

  User can choose amount to pay per unit time, w_s

  Would like allocated bandwidth x_s in proportion to w_s:

$$x_s = \frac{w_s}{p_s}$$

  p_s could be viewed as charge per unit flow for user s

user problem:

$$\max \ U_s\!\Big(\frac{w_s}{p_s}\Big) - w_s \quad \text{subject to} \quad w_s \ge 0$$

  U_s(w_s/p_s): user's utility
  w_s: cost

The network view

  Suppose network knows vector {w_s}, chosen by users
  Network wants to maximize logarithmic utility function

network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

Solution existence

  There exist prices p_s, source rates x_s, and amounts to pay per unit time w_s = p_s x_s such that
    W_s solves the user problem
    X_s solves the network problem
    X_s is the unique solution to the system problem

network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

user problem:

$$\max \ U_s\!\Big(\frac{w_s}{p_s}\Big) - w_s \quad \text{subject to} \quad w_s \ge 0$$

system problem:

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$

Proportional Fairness

  Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s*},

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

  Result: if w_r = 1, then X_s solves the network problem IFF it is proportionally fair

  Similar result exists for the case that w_r is not equal to 1
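The proportional fairness condition can be spot-checked numerically. The sketch below assumes the three-source, two-link example with unit capacities (x1 + x2 ≤ 1, x1 + x3 ≤ 1) and the candidate allocation (1/3, 2/3, 2/3); it samples random feasible vectors and evaluates the sum of proportional changes. The sampling scheme, the seed, and the function names are illustrative assumptions.

```python
import random


def proportional_change(x, y):
    """Sum over sources of (y_s - x_s) / x_s, the quantity in the definition."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))


def random_feasible(rng):
    """Random rates satisfying y1 + y2 <= 1 and y1 + y3 <= 1."""
    y1 = rng.uniform(0.0, 1.0)
    return (y1, rng.uniform(0.0, 1.0 - y1), rng.uniform(0.0, 1.0 - y1))


rng = random.Random(0)
x_pf = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)
worst = max(proportional_change(x_pf, random_feasible(rng))
            for _ in range(10000))

# The equal-share allocation fails the test: moving to x_pf raises the sum.
x_equal = (0.5, 0.5, 0.5)
gain_from_equal = proportional_change(x_equal, x_pf)
```

For this topology the sum evaluated at x_pf equals 1.5(y1 + y2) + 1.5(y1 + y3) - 3, which cannot exceed zero for any feasible y, so the sampled maximum stays non-positive.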

Max-min Fairness

Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
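A max-min fair allocation can be computed with the standard progressive-filling (water-filling) procedure, which is not described on the slide and is shown here only as an illustration: raise all unfrozen rates equally until some link saturates, freeze the sources crossing it, and repeat. The sketch assumes every source uses at least one link; the function name and tolerance are hypothetical.

```python
def max_min_fair(routes, capacity, eps=1e-12):
    """Progressive filling; routes[s] lists the links used by source s."""
    n, n_links = len(routes), len(capacity)
    x = [0.0] * n
    frozen = [False] * n
    residual = list(capacity)
    while not all(frozen):
        # Number of still-growing sources on each link.
        active = [sum(1 for s in range(n) if not frozen[s] and l in routes[s])
                  for l in range(n_links)]
        # Largest equal increment before some link fills up.
        inc = min(residual[l] / active[l]
                  for l in range(n_links) if active[l] > 0)
        for s in range(n):
            if not frozen[s]:
                x[s] += inc
        for l in range(n_links):
            residual[l] -= inc * active[l]
        # Freeze every source that crosses a saturated link.
        for s in range(n):
            if not frozen[s] and any(residual[l] <= eps for l in routes[s]):
                frozen[s] = True
    return x


routes = [[0, 1], [0], [1]]
unit = max_min_fair(routes, [1.0, 1.0])
uneven = max_min_fair(routes, [1.0, 2.0])
```

With unit capacities all three sources end up with the same rate; with a larger second link, the source that uses only that link absorbs the leftover capacity.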

Minimum potential delay fairness

  Rates {x_r} are minimum potential delay fair if U_r(x_r) = -w_r/x_r

Interpretation: if w_r is the file size, then w_r/x_r is the transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$

Solving the network problem
  Results so far: existence - solution exists with given properties
  How to compute solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

congestion ldquosignalrdquo function of aggregate rate at link l fed back to s

change in bandwidth

allocation at s

linear increase

multiplicative decrease

⎟⎟⎠

⎞⎜⎜⎝

⎛= sum

isin

)()()(txgtp

sLlsllwhere

Solving the network problem

⎟⎟⎠

⎞⎜⎜⎝

⎛minus= sum

isin

)()()()(tptxwktx

dtd

sLllsss

  Results   converges to solution of relaxation of network

problem  xs(t)Σpl(t) converges to ws

  Interpretation TCP-like algorithm to iteratively solves optimal rate allocation

Source Algorithm

  Source needs only its path price

  kr() nonnegative nondecreasing function   Above algorithm converges to unique

solution for any initial condition   qr interpreted as lossmarking probability euro

˙ x r = kr (xr )(Ur (xr ) minus qr)

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation

  Suppose network charges price qr ($bit) where qr=sum pl

  Userrsquos strategy spend wr ($sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

  suppose

  then

  interpretation minimize (weighted) delay

pTpTp

x 2)1(2asymp

minus=

TxxU 1)( minus=

Is AIMD special

  Consider a window control as follows  cwnd += acwnd^n when no loss  cwnd -= bcwnd^m when loss  where nltm

  Expected change in congestion window

  Expected change in rate per unit time

MIMD (nm)

  Consider the controller

  where

  Then at equilibrium

  Where α = m-n For stability

Motivation

Congestion Control maximize user

utility

Traffic Engineering minimize network

congestion Given routing Rli how to adapt end rate xi

Given traffic xi how to perform routing Rli

Congestion Control Model

max sum i Ui(xi) st sumi Rlixi le cl var x

aggregate utility

Source rate xi

Utility Ui(xi)

capacity constraints

Users are indexed by i

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

min suml f(ul) st ul =sumi Rlixicl var R Link Utilization ul

Cost f(ul)

aggregate cost

Links are indexed by l

Traffic engineering avoids bottlenecks in the network

ul = 1

Model of Internet Reality

xi Rli

Congestion Control max sumi Ui(xi) st sumi Rlixi le cl

Traffic Engineering min suml f(ul)

st ul =sumi Rlixicl

System Properties

  Convergence   Does it achieve some objective   Benchmark

  Utility gap between the joint system and benchmark

max sumi Ui(xi) st Rx le c Var x R

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Scalability

01

1

10

100

1e3

1 10 100

Doc

umen

t Lat

ency

(sec

)

Num BG flows

Vegas

V0

Nice

Router Prio

Reno

  W lt 1 allows Nice to scale to any number of background flows

Utilization

0

2e4

4e4

6e4

8e4

1 10 100

BG

Thr

ough

put (

KB

)

Num BG flows

Router Prio

Vegas

V0

Reno

Nice

  Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG

Wide-area network experiments

What is TCP optimizing

How does TCP allocate network resources

  Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation

  How to model the interaction between TCP and the network  Recall PFTK like models assumed network

conditions are not affected by (a single) TCP flow

Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note: all sources "equal")
  constraint: bandwidth used less than capacity
  centralized solution to optimization impractical
    must know all utility functions
    impractical for large number of sources
  can we view congestion control as distributed asynchronous algorithms to solve this problem?

"system" problem:

$\max_{x_s \ge 0} \sum_s U_s(x_s)$  subject to  $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$

The user view

  User can choose amount to pay per unit time, $w_s$

  Would like allocated bandwidth $x_s$ in proportion to $w_s$:  $x_s = w_s / p_s$

  $p_s$ could be viewed as charge per unit flow for user $s$

user problem:

$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$  subject to  $w_s \ge 0$

(first term: user's utility; second term: cost)
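A minimal sketch of the user problem, assuming a concave utility so that a ternary search finds the best payment. The helper `best_payment`, the search bound `w_max`, and the sample utility $U(x) = 2\sqrt{x}$ are all hypothetical choices for illustration; the lecture does not fix a utility function here.

```python
import math

def best_payment(utility, price, w_max=100.0, iters=200):
    """Ternary search for w >= 0 maximizing utility(w / price) - w (concave case)."""
    def net(w):
        return utility(w / price) - w
    lo, hi = 0.0, w_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if net(m1) < net(m2):
            lo = m1
        else:
            hi = m2
    w = (lo + hi) / 2
    return w, w / price   # payment per unit time and resulting rate x = w / p

# Hypothetical utility U(x) = 2*sqrt(x); setting the derivative to zero gives w = 1/price.
w, x = best_payment(lambda x: 2 * math.sqrt(x), price=4.0)
print(w, x)
```

For this sample utility, raising the price lowers both the chosen payment and the resulting rate, which is the feedback the pricing view relies on.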

The network view

  Suppose network knows vector $\{w_s\}$ chosen by users
  Network wants to maximize logarithmic utility function

network problem:

$\max_{x_s \ge 0} \sum_s w_s \log x_s$  subject to  $\sum_{s \in S(l)} x_s \le c_l$

Solution existence

  There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
    $\{w_s\}$ solves the user problem
    $\{x_s\}$ solves the network problem
    $\{x_s\}$ is the unique solution to the system problem

user problem:  $\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$  subject to  $w_s \ge 0$

network problem:  $\max_{x_s \ge 0} \sum_s w_s \log x_s$  subject to  $\sum_{s \in S(l)} x_s \le c_l$

system problem:  $\max_{x_s \ge 0} \sum_s U_s(x_s)$  subject to  $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$

Proportional Fairness

  Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,

$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$

  Result: if $w_s = 1$, then $\{x_s\}$ solves the network problem iff it is proportionally fair

  Similar result exists for the case that $w_s \ne 1$
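The definition can be checked numerically on a small example. The sketch below assumes the two-link, three-source network with unit capacities (constraints $x_1 + x_2 \le 1$ and $x_1 + x_3 \le 1$) and tests the candidate rates (1/3, 2/3, 2/3) against randomly drawn feasible vectors. The helper names and the random sampling scheme are assumptions for illustration only.

```python
import random

def aggregate_proportional_change(x, y):
    """Sum over sources of (y_s - x_s) / x_s, the quantity in the definition."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))

def random_feasible(rng):
    """Random rates with x1 + x2 <= 1 and x1 + x3 <= 1 (unit-capacity links)."""
    y1 = rng.random()
    return (y1, rng.uniform(0.0, 1.0 - y1), rng.uniform(0.0, 1.0 - y1))

rng = random.Random(0)
x_pf = (1 / 3, 2 / 3, 2 / 3)   # candidate proportionally fair rates
x_mm = (0.5, 0.5, 0.5)         # max-min fair rates, for contrast

worst = max(aggregate_proportional_change(x_pf, random_feasible(rng)) for _ in range(10000))
print("largest change relative to x_pf:", worst)
print("moving from x_mm to x_pf:", aggregate_proportional_change(x_mm, x_pf))
```

No sampled vector yields a positive aggregate change relative to (1/3, 2/3, 2/3), while moving from the max-min rates to it gives a positive change, so the max-min rates are not proportionally fair in this example.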

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$ for some $s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$

Minimum potential delay fairness

  Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; optimization problem is to minimize sum of transfer delays

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$ for some $s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$

What is corresponding utility function?

$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
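To see how the limit behaves, the sketch below maximizes $\sum_r x_r^{1-\alpha}/(1-\alpha)$ on the two-link, three-source example for several values of $\alpha$. It assumes unit capacities and $x_2 = x_3 = 1 - x_1$, and solves the first-order condition by bisection. The function name and the bracketing interval are assumptions for illustration.

```python
def alpha_fair_x1(alpha, lo=1e-3, hi=1.0 - 1e-3, iters=100):
    """Rate of the two-link source when sum of x^(1-alpha)/(1-alpha) is maximized.

    Unit capacities and x2 = x3 = 1 - x1 are assumed. The optimum satisfies
    U'(x1) = U'(x2) + U'(x3) with U'(x) = x^(-alpha); solve it by bisection.
    """
    def slope(x1):
        return x1 ** (-alpha) - 2.0 * (1.0 - x1) ** (-alpha)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if slope(mid) > 0:   # objective still increasing in x1
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for alpha in (1, 2, 10, 50):
    print(alpha, round(alpha_fair_x1(alpha), 4))
```

Here $\alpha = 1$ corresponds to the logarithmic (proportionally fair) case and $\alpha = 2$ to minimum potential delay; as $\alpha$ grows, $x_1$ approaches the max-min value 1/2.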

Solving the network problem
  Results so far: existence - solution exists with given properties
  How to compute solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

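$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$

where  $p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$

  $\frac{d}{dt} x_s(t)$: change in bandwidth allocation at $s$
  $k\,w_s$: linear increase
  $k\,x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease
  $p_l(t)$: congestion "signal", function of aggregate rate at link $l$, fed back to $s$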

Solving the network problem

$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$

  Results
    converges to solution of relaxation of network problem
    $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$

  Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
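The convergence claim can be tried out with a small Euler-step simulation. The sketch below assumes the two-link, three-source topology used earlier (source 0 crosses both links, sources 1 and 2 use one link each) and a hypothetical smooth price function $g_l(y) = (y / c_l)^\beta$. The lecture does not specify $g_l$, so this choice, the step size, and the function name are illustrative assumptions.

```python
def primal_algorithm(w=(1.0, 1.0, 1.0), c=(1.0, 1.0), step=0.01, iters=20000, beta=10):
    """Euler steps of dx_s/dt = k (w_s - x_s * sum of p_l on s's path).

    Topology assumed: source 0 uses both links, source 1 link 0, source 2 link 1.
    Price function g_l(y) = (y / c_l)**beta is a hypothetical smooth penalty.
    """
    paths = ((0, 1), (0,), (1,))
    x = [0.1, 0.1, 0.1]
    p = [0.0, 0.0]
    for _ in range(iters):
        # aggregate rate on each link, then the price fed back to the sources
        load = [sum(x[s] for s in range(3) if l in paths[s]) for l in range(2)]
        p = [(load[l] / c[l]) ** beta for l in range(2)]
        x = [x[s] + step * (w[s] - x[s] * sum(p[l] for l in paths[s])) for s in range(3)]
    return x, p

x, p = primal_algorithm()
print([round(v, 3) for v in x], [round(v, 3) for v in p])
```

At the fixed point each source satisfies $x_s \sum_l p_l = w_s$, and the two-link source ends up with half the rate of the single-link sources, as in the proportionally fair allocation. Because the price function is a soft penalty, the link loads slightly exceed capacity, which is what "relaxation of the network problem" refers to.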

Source Algorithm

$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$

  Source needs only its path price $q_r$

  $k_r(\cdot)$: nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is $U_r(x_r) = w_r \log x_r$

then a controller that implements it is given by $\dot{x}_r = k_r(x_r) \left( \frac{w_r}{x_r} - q_r \right)$

Pricing interpretation

  Can network choose pricing scheme to achieve fair resource allocation?

  Suppose network charges price q_r ($/bit), where q_r = sum of p_l over the links l on the path

  User's strategy: spend w_r ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno

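  suppose  $x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$

  then, since $U'(x) = p \approx \frac{2}{T^2 x^2}$ at equilibrium,  $U(x) = -\frac{2}{T^2 x}$

  interpretation: minimize (weighted) delay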
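A quick numerical check of the rate formula and its small-loss approximation, taking $T$ to be the round-trip time (an assumption, since the slide does not define it). The function names and the sample values of $p$ and $T$ are illustrative.

```python
import math

def reno_rate(p, T):
    """Rate x = sqrt(2 (1 - p)) / (T sqrt(p)) for loss probability p and RTT T."""
    return math.sqrt(2.0 * (1.0 - p)) / (T * math.sqrt(p))

def reno_rate_small_p(p, T):
    """Small-loss approximation x ~ sqrt(2) / (T sqrt(p))."""
    return math.sqrt(2.0) / (T * math.sqrt(p))

def implied_loss(x, T):
    """Invert the approximation: p = 2 / (T^2 x^2)."""
    return 2.0 / (T * x) ** 2

p, T = 0.01, 0.1   # illustrative values: 1% loss, 100 ms round-trip time
print(reno_rate(p, T), reno_rate_small_p(p, T), implied_loss(reno_rate_small_p(p, T), T))
```

For small $p$ the two rates differ by well under one percent, and inverting the approximation recovers the loss probability, which is the quantity a utility-based interpretation treats as the marginal utility at equilibrium.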

Is AIMD special?

  Consider a window control as follows:
    cwnd += a * cwnd^n when no loss
    cwnd -= b * cwnd^m when loss
    where n < m

  Expected change in congestion window

  Expected change in rate per unit time
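One way to work with this family of controllers is to compute the expected window change and the window at which it vanishes. The sketch below assumes each packet is lost independently with probability $p$ and that the update is applied once per packet. These modelling assumptions and the function names are not from the slides. The default parameters (a = 1, b = 1/2, n = -1, m = 1) correspond to AIMD.

```python
def expected_window_change(w, p, a=1.0, b=0.5, n=-1.0, m=1.0):
    """Expected per-packet change: grow a*w^n w.p. (1 - p), shrink b*w^m w.p. p."""
    return (1.0 - p) * a * w ** n - p * b * w ** m

def equilibrium_window(p, a=1.0, b=0.5, n=-1.0, m=1.0):
    """Window at which the expected change is zero (requires n < m)."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))

w_star = equilibrium_window(p=0.01)
print(w_star, expected_window_change(w_star, p=0.01))
```

With the AIMD defaults the equilibrium window is $\sqrt{2(1-p)/p}$, which matches the simplified TCP-Reno rate formula once the window is divided by $T$.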

MIMD (n, m)

  Consider the controller

  where

  Then at equilibrium

  Where α = m - n. For stability

Motivation

Congestion Control: maximize user utility
  Given routing $R_{li}$, how to adapt end rate $x_i$?

Traffic Engineering: minimize network congestion
  Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

$\max \sum_i U_i(x_i)$  s.t.  $\sum_i R_{li} x_i \le c_l$,  var. $x$

  $\sum_i U_i(x_i)$: aggregate utility
  $\sum_i R_{li} x_i \le c_l$: capacity constraints
  Users are indexed by $i$

[Figure: utility $U_i(x_i)$ versus source rate $x_i$]

Congestion control provides fair rate allocation amongst users

Traffic Engineering Model

$\min \sum_l f(u_l)$  s.t.  $u_l = \sum_i R_{li} x_i / c_l$,  var. $R$

  $\sum_l f(u_l)$: aggregate cost
  Links are indexed by $l$

[Figure: cost $f(u_l)$ versus link utilization $u_l$, with $u_l = 1$ marked]

Traffic engineering avoids bottlenecks in the network
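A small sketch of how the traffic engineering objective compares two routings for fixed traffic. The quadratic cost $f(u) = u^2$ is only a placeholder for some convex cost (the slides do not give $f$), and the two-user, two-link instance and function names are made up for illustration.

```python
def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for each link l."""
    return [sum(R_l[i] * x[i] for i in range(len(x))) / c_l for R_l, c_l in zip(R, c)]

def te_cost(R, x, c, f=lambda u: u * u):
    """Aggregate cost sum_l f(u_l); the quadratic f is a placeholder."""
    return sum(f(u) for u in link_utilization(R, x, c))

x = [0.6, 0.6]                # two users' rates
c = [1.0, 1.0]                # two parallel links
R_shared = [[1, 1], [0, 0]]   # both users routed over link 0
R_split = [[1, 0], [0, 1]]    # one user per link
print(link_utilization(R_shared, x, c), te_cost(R_shared, x, c))
print(link_utilization(R_split, x, c), te_cost(R_split, x, c))
```

Routing both users over one link pushes its utilization above 1, while splitting them keeps both links at 0.6 and gives a lower aggregate cost, which is the sense in which traffic engineering avoids bottlenecks.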

Model of Internet Reality

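[Figure: feedback loop in which congestion control supplies rates $x_i$ to traffic engineering and traffic engineering supplies routing $R_{li}$ to congestion control]

Congestion Control: $\max \sum_i U_i(x_i)$  s.t.  $\sum_i R_{li} x_i \le c_l$

Traffic Engineering: $\min \sum_l f(u_l)$  s.t.  $u_l = \sum_i R_{li} x_i / c_l$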

System Properties

  Convergence
  Does it achieve some objective?
  Benchmark
    Utility gap between the joint system and benchmark

$\max \sum_i U_i(x_i)$  s.t.  $Rx \le c$,  var. $x, R$

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller





Optimization-based approach towards congestion control

Resource allocation as optimization problem   How to allocate resources (eg bandwidth) to

optimize some objective function   Maybe not possible to obtain exact optimality but

 optimization framework as means to explicitly steer network towards desirable operating point

 practical congestion control as distributed asynchronous implementations of optimization algorithm

  systematic approach towards protocol design

c1 c2

Model   Network Links l each of capacity cl   Sources s (L(s) Us(xs))

L(s) - links used by source s Us(xs) - utility if source rate = xs

x1

x2 x3

121 cxx le+ 231 cxx le+

Us(xs)

xs

example utility function for elastic application

Q What are possible allocations with say unit capacity links

Optimization Problem

  maximize system utility (note all sources ldquoequalrdquo)   constraint bandwidth used less than capacity   centralized solution to optimization impractical

 must know all utility functions   impractical for large number of sources  can we view congestion control as distributed

asynchronous algorithms to solve this problem

Llcx

xU

lSsls

sss

xs

isinforalllesum

sum

isin

ge

subject to

)( max

)(

0 ldquosystemrdquo problem

The user view

  User can choose amount to pay per unit time ws

  Would like allocated bandwidth xs in proportion to ws

euro

max Usw s

ps

⎝ ⎜

⎠ ⎟ minus ws

subject to ws ge 0

  ps could be viewed as charge per unit flow for user s s

ss pwx =

userrsquos utility cost

user problem

The network view
  Suppose network knows vector $\{w_s\}$ chosen by users
  Network wants to maximize logarithmic utility function

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

  (network problem)
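The structure of the network problem's solution can be read off its Lagrangian. The sketch below uses standard KKT conditions; interpreting the multipliers $p_l$ as link prices is an interpretation, not something the slide states at this point.

```latex
\begin{align*}
\mathcal{L}(x, p) &= \sum_s w_s \log x_s
  - \sum_l p_l \Big( \sum_{s \in S(l)} x_s - c_l \Big), \qquad p_l \ge 0 \\
\frac{\partial \mathcal{L}}{\partial x_s} &= \frac{w_s}{x_s} - \sum_{l \in L(s)} p_l = 0
  \;\Longrightarrow\; x_s = \frac{w_s}{\sum_{l \in L(s)} p_l} \\
& p_l \Big( \sum_{s \in S(l)} x_s - c_l \Big) = 0 \quad \text{for every link } l
\end{align*}
```

Each source's rate is its payment divided by the total price along its path, and only fully used links carry a positive price.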

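Solution existence
  There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
    $\{w_s\}$ solves the user problem
    $\{x_s\}$ solves the network problem
    $\{x_s\}$ is the unique solution to the system problem

  Network problem:

$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$

  User problem:

$$\max_{w_s}\; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$

  System problem:

$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \quad \forall l \in L$$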

Proportional Fairness
  Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,

$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$

  Result: if $w_r = 1$, then $\{x_s\}$ solves the network problem if and only if it is proportionally fair
  Similar result exists for the case that $w_r \ne 1$
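The proportional-fairness condition can be spot-checked numerically. The sketch below assumes the two-link example with unit capacities (rate x1 crosses both links) and tests the candidate (1/3, 2/3, 2/3) against randomly drawn feasible vectors. A random test is evidence, not a proof, and the function names are illustrative.

```python
import random

def proportional_fairness_gap(x, y):
    """Sum over sources of (y_s - x_s) / x_s for a candidate x and an alternative y."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))

def random_feasible(rng):
    """Draw a feasible vector for x1 + x2 <= 1 and x1 + x3 <= 1."""
    y1 = rng.uniform(0.0, 1.0)
    y2 = rng.uniform(0.0, 1.0 - y1)
    y3 = rng.uniform(0.0, 1.0 - y1)
    return (y1, y2, y3)

rng = random.Random(0)                      # fixed seed for repeatability
candidate = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)

# Largest gap found over many random alternatives; should not exceed zero.
worst_gap = max(
    proportional_fairness_gap(candidate, random_feasible(rng))
    for _ in range(10000))

# The equal-share allocation fails the test against the candidate above.
equal_share_gap = proportional_fairness_gap((0.5, 0.5, 0.5), candidate)
```

For this topology the gap reduces to 3*y1 + 1.5*y2 + 1.5*y3 - 3, which the two capacity constraints keep at or below zero. The equal-share allocation gives a positive gap of 1/3, so it is not proportionally fair.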

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
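A max-min fair allocation can be computed by progressive filling, a standard technique that the slides do not describe: raise all rates together, and freeze the sources that cross a link once it saturates. The sketch below assumes every source uses at least one link. The function name and data layout are illustrative.

```python
def max_min_rates(capacity, routes):
    """Progressive filling.

    capacity: list of link capacities.
    routes:   routes[s] is the list of link indices used by source s
              (assumed non-empty for every source).
    """
    n = len(routes)
    rates = [0.0] * n
    frozen = [False] * n
    residual = list(capacity)
    links = range(len(capacity))
    while not all(frozen):
        # Number of unfrozen sources crossing each link.
        active = [sum(1 for s in range(n) if not frozen[s] and l in routes[s])
                  for l in links]
        # Largest equal increment that no link with active sources would reject.
        step = min(residual[l] / active[l] for l in links if active[l] > 0)
        for s in range(n):
            if not frozen[s]:
                rates[s] += step
        for l in links:
            residual[l] -= step * active[l]
        # Freeze every source that crosses a saturated link.
        for s in range(n):
            if not frozen[s] and any(residual[l] <= 1e-12 for l in routes[s]):
                frozen[s] = True
    return rates

equal_links = max_min_rates([1.0, 1.0], [[0, 1], [0], [1]])
unequal_links = max_min_rates([1.0, 2.0], [[0, 1], [0], [1]])
```

With two unit-capacity links every source gets 0.5. With capacities 1 and 2, the first link saturates at 0.5 per source and the source that uses only the second link then rises to 1.5.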

Minimum potential delay fairness
  Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

Interpretation: if $w_r$ is the file size, then $w_r / x_r$ is the transfer time; the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$

What is the corresponding utility function?

$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
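The limit above is one end of the family $x^{1-\alpha}/(1-\alpha)$. The sketch below evaluates that family and, as an illustration, the rate of the two-link source in the unit-capacity example as a function of $\alpha$. The closed form for that rate comes from the KKT conditions of this particular example (rate x1 on both links, x2 = x3 = 1 - x1) and is an assumption of the sketch, as is using log x at $\alpha = 1$.

```python
import math

def alpha_fair_utility(x, alpha):
    """Utility x^(1 - alpha) / (1 - alpha); log x is used at alpha = 1."""
    if x <= 0:
        raise ValueError("rate must be positive")
    if alpha == 1:
        return math.log(x)
    return x ** (1 - alpha) / (1 - alpha)

def two_link_x1(alpha):
    """Rate of the two-link source in the unit-capacity example.

    KKT for this example gives x1^(-alpha) = 2 * (1 - x1)^(-alpha),
    so x1 / (1 - x1) = 2 ** (-1 / alpha).
    """
    ratio = 2.0 ** (-1.0 / alpha)
    return ratio / (1.0 + ratio)

x1_prop_fair = two_link_x1(1)       # 1/3: proportional fairness
x1_min_delay = two_link_x1(2)       # about 0.414: utility -1/x
x1_large_alpha = two_link_x1(1000)  # approaches 0.5, the max-min share
```

As $\alpha$ grows, the allocation moves from the proportionally fair point toward equal shares, which matches the max-min limit stated on the slide.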

Solving the network problem
  Results so far: existence - solution exists with given properties
  How to compute solution?
    Ideally: distributed solution easily embodied in protocol
    Should reveal insight into existing protocol

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  left-hand side: change in bandwidth allocation at s
  $k\, w_s$: linear increase
  $-k\, x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease
  $p_l(t)$: congestion "signal", a function of the aggregate rate at link l, fed back to s

where

$$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$
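The rate-update equation can be simulated with a simple Euler discretization. The sketch below assumes the two-link topology with unit capacities, unit weights, and a hypothetical price function g_l(y) = (y / c_l)^8. The slides leave g_l unspecified, so the price function, the gain, and the step size are all assumptions.

```python
ROUTES = [[0, 1], [0], [1]]    # L(s): links used by each source
CAPACITY = [1.0, 1.0]          # c_l
WEIGHTS = [1.0, 1.0, 1.0]      # w_s
K = 0.5                        # gain k (assumed value)
BETA = 8                       # steepness of the assumed price function

def link_price(load, capacity):
    """Assumed g_l: a smooth penalty that rises sharply near capacity."""
    return (load / capacity) ** BETA

def path_prices(rates):
    """Sum of link prices along each source's path."""
    loads = [0.0] * len(CAPACITY)
    for rate, links in zip(rates, ROUTES):
        for l in links:
            loads[l] += rate
    prices = [link_price(load, c) for load, c in zip(loads, CAPACITY)]
    return [sum(prices[l] for l in links) for links in ROUTES]

def simulate(steps=20000, dt=0.01):
    """Euler steps of dx_s/dt = k * (w_s - x_s * sum of p_l over L(s))."""
    rates = [0.1, 0.1, 0.1]
    for _ in range(steps):
        q = path_prices(rates)
        rates = [x + dt * K * (w - x * qs)
                 for x, w, qs in zip(rates, WEIGHTS, q)]
    return rates

rates = simulate()
# At a fixed point, x_s times its path price equals w_s.
residuals = [x * q - w for x, q, w in zip(rates, path_prices(rates), WEIGHTS)]
```

With this smooth price function the link loads settle slightly above capacity, so the fixed point solves a relaxed version of the network problem rather than the hard-constrained one.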

Solving the network problem

$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$

  Results
    converges to solution of "relaxation" of network problem
    $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
  Interpretation: TCP-like algorithm that iteratively solves the optimal rate allocation

Source Algorithm
  Source needs only its path price:

$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$

  $k_r(\cdot)$: nonnegative, nondecreasing function
  Above algorithm converges to unique solution for any initial condition
  $q_r$ interpreted as loss/marking probability

Proportionally-Fair Controller

If utility function is

then a controller that implements it is given by
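The two equations of this slide did not survive extraction. One consistent instance is sketched below, assuming the weighted log utility of the network problem and a gain proportional to the rate. It also reads the source algorithm as $\dot{x}_r = k_r(x_r)(U_r'(x_r) - q_r)$, that is, with the marginal utility. These choices are assumptions and may differ from what the slide showed.

```latex
% Assumptions (not recoverable from the slide):
%   U_r(x_r) = w_r \log x_r   and   k_r(x_r) = \kappa_r x_r  with \kappa_r > 0.
\begin{align*}
U_r'(x_r) &= \frac{w_r}{x_r} \\
\dot{x}_r &= k_r(x_r)\big( U_r'(x_r) - q_r \big)
           = \kappa_r x_r \Big( \frac{w_r}{x_r} - q_r \Big)
           = \kappa_r \big( w_r - x_r q_r \big)
\end{align*}
```

Under these assumptions the controller has a constant increase term $\kappa_r w_r$ and a decrease term proportional to $x_r q_r$, and its fixed point satisfies $x_r q_r = w_r$.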

Pricing interpretation
  Can the network choose a pricing scheme to achieve fair resource allocation?
  Suppose the network charges price $q_r$ ($/bit), where $q_r = \sum p_l$ (summed over the links on r's path)
  User's strategy: spend $w_r$ ($/sec) to maximize

Optimal User Strategy

  equivalently

Simplified TCP-Reno
  suppose

$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$

  then

$$U(x) = -\frac{1}{Tx}$$

  interpretation: minimize (weighted) delay

Is AIMD special?
  Consider a window control as follows:
    cwnd += a*cwnd^n when no loss
    cwnd -= b*cwnd^m when loss
    where n < m
  Expected change in congestion window
  Expected change in rate per unit time
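The window rule above can be written as a small update function. The sketch below assumes the rule is applied once per acknowledged or lost packet, that each packet is lost independently with probability p, and that the window is floored at one packet. None of these details are stated on the slide, and the function names are illustrative.

```python
def update_cwnd(cwnd, loss, a, b, n, m, min_cwnd=1.0):
    """One step of the generalized rule: +a*cwnd^n without loss, -b*cwnd^m on loss."""
    if loss:
        cwnd -= b * cwnd ** m
    else:
        cwnd += a * cwnd ** n
    # Assumed floor so the window never drops below one packet.
    return max(cwnd, min_cwnd)

def expected_change(cwnd, p, a, b, n, m):
    """Expected per-packet window change under independent loss probability p."""
    return (1 - p) * a * cwnd ** n - p * b * cwnd ** m

# AIMD-like parameters: n = -1 gives +a/cwnd per ACK, m = 1 gives a multiplicative cut.
after_ack = update_cwnd(10.0, False, a=1.0, b=0.5, n=-1, m=1)
after_loss = update_cwnd(10.0, True, a=1.0, b=0.5, n=-1, m=1)

# Loss probability at which a window of 10 is in balance for these parameters.
balance_p = 0.1 / (0.1 + 5.0)
drift_at_balance = expected_change(10.0, balance_p, a=1.0, b=0.5, n=-1, m=1)
```

With these parameters one ACK adds 1/cwnd and one loss halves the window. The expected change is zero when the increase and decrease terms, weighted by their probabilities, cancel.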

MIMD (n, m)
  Consider the controller
  where
  Then at equilibrium
  where α = m - n
  For stability

Motivation
  Congestion Control: maximize user utility
    Given routing $R_{li}$, how to adapt end rate $x_i$?
  Traffic Engineering: minimize network congestion
    Given traffic $x_i$, how to perform routing $R_{li}$?

Congestion Control Model

$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$

  objective: aggregate utility
  constraints: capacity constraints
  variable: x
  [Figure: utility $U_i(x_i)$ versus source rate $x_i$]
  Users are indexed by i
  Congestion control provides fair rate allocation amongst users

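Traffic Engineering Model

$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l$$

  objective: aggregate cost
  variable: R
  [Figure: cost $f(u_l)$ versus link utilization $u_l$, with $u_l = 1$ marked]
  Links are indexed by l
  Traffic engineering avoids bottlenecks in the network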
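The traffic engineering objective can be evaluated directly for a given routing matrix. The sketch below assumes an exponential cost f(u) = exp(u), a hypothetical choice since the slide only names f. It compares sending one unit of traffic over a single link with splitting it evenly across two parallel unit-capacity links.

```python
import math

def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [sum(R[l][i] * x[i] for i in range(len(x))) / c[l]
            for l in range(len(c))]

def total_cost(utilization, f=math.exp):
    """Aggregate cost sum_l f(u_l); f defaults to an assumed exponential cost."""
    return sum(f(u) for u in utilization)

# Three sources on two links; R[l][i] = 1 if source i uses link l.
u_example = link_utilization([[1, 1, 0], [1, 0, 1]], [0.2, 0.5, 0.6], [1.0, 2.0])

# One source, two parallel links: all traffic on one link versus an even split.
capacity = [1.0, 1.0]
single_path = link_utilization([[1.0], [0.0]], [1.0], capacity)
even_split = link_utilization([[0.5], [0.5]], [1.0], capacity)
cost_single = total_cost(single_path)
cost_split = total_cost(even_split)
```

Because the assumed cost is convex, the even split has the lower aggregate cost, which is the sense in which traffic engineering steers load away from bottlenecks.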

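Model of Internet Reality
  [Figure: feedback loop; congestion control passes rates $x_i$ to traffic engineering, which passes routing $R_{li}$ back]
  Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
  Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$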

System Properties
  Convergence
  Does it achieve some objective?
  Benchmark
    Utility gap between the joint system and benchmark

$$\max_{x,\, R} \sum_i U_i(x_i) \quad \text{s.t.} \quad R x \le c$$

Multipath TCP

Joint routing and congestion control

  Multipath TCP controller

Minimum potential delay fairness

  Rates $x_r$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$

  Interpretation: if $w_r$ is the file size, then $w_r / x_r$ is the transfer time, so the optimization problem is to minimize the sum of transfer delays

Max-min Fairness

  Rates $x_r$ are max-min fair if, for any other feasible rates $y_r$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$

  What is the corresponding utility function?

  $U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
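The max-min definition above can be made concrete with progressive filling, a standard way to compute max-min fair rates that is not taken from these slides: raise all rates together and freeze every flow that crosses a link once that link fills. The sketch below assumes each flow uses at least one link; the names are hypothetical.

```python
from typing import List


def max_min_fair(routes: List[List[int]], capacity: List[float]) -> List[float]:
    """Progressive filling. routes[i] lists the link indices used by flow i."""
    rate = [0.0] * len(routes)
    frozen = [False] * len(routes)
    residual = list(capacity)
    while not all(frozen):
        # Count the unfrozen flows on each link.
        active = [0] * len(capacity)
        for i, links in enumerate(routes):
            if not frozen[i]:
                for l in links:
                    active[l] += 1
        # Largest equal increment that every unfrozen flow can still take.
        inc = min(residual[l] / active[l] for l in range(len(capacity)) if active[l] > 0)
        for i, links in enumerate(routes):
            if not frozen[i]:
                rate[i] += inc
                for l in links:
                    residual[l] -= inc
        # Freeze flows that now cross a saturated link.
        for i, links in enumerate(routes):
            if not frozen[i] and any(residual[l] <= 1e-12 for l in links):
                frozen[i] = True
    return rate


if __name__ == "__main__":
    # Flow 0 uses links 0 and 1; flow 1 uses link 0; flow 2 uses link 1.
    print(max_min_fair([[0, 1], [0], [1]], [1.0, 2.0]))
```

In this example link 0 saturates first and caps flows 0 and 1, after which flow 2 takes the spare capacity on link 1. Raising flow 0 would then require lowering flow 1, whose rate is no larger, which is the condition in the definition.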

Solving the network problem

  Results so far: existence, i.e. a solution exists with the given properties

  How to compute the solution?
    Ideally: a distributed solution easily embodied in a protocol
    Should reveal insight into existing protocols

Solving the network problem

  $\frac{d}{dt} x_s(t) = k_s \Bigl( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \Bigr)$

  Left-hand side: change in bandwidth allocation at s

  $k_s w_s$: linear increase

  $k_s\, x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease

  where $p_l(t) = g_l\Bigl( \sum_{s:\, l \in L(s)} x_s(t) \Bigr)$ is the congestion "signal": a function of the aggregate rate at link l, fed back to s

Solving the network problem

  $\frac{d}{dt} x_s(t) = k_s \Bigl( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \Bigr)$

  Results:
    converges to the solution of a relaxation of the network problem
    $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$

  Interpretation: a TCP-like algorithm that iteratively solves for the optimal rate allocation
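The rate law on this slide can be simulated with a simple forward-Euler loop. This is a minimal sketch, not the lecture's implementation: the slides leave the link price function g_l unspecified, so the choice g_l(y) = (y / c_l)^power is a hypothetical penalty, and the step size, gain, and starting rates are illustrative.

```python
from typing import List


def path_prices(x: List[float], routes: List[List[int]],
                capacity: List[float], power: float = 4.0) -> List[float]:
    """Sum of link prices along each source's path, with p_l = (y_l / c_l) ** power."""
    y = [0.0] * len(capacity)            # aggregate rate on each link
    for s, links in enumerate(routes):
        for l in links:
            y[l] += x[s]
    p = [(y[l] / capacity[l]) ** power for l in range(len(capacity))]
    return [sum(p[l] for l in links) for links in routes]


def primal_rates(routes: List[List[int]], weights: List[float], capacity: List[float],
                 k: float = 1.0, dt: float = 0.01, steps: int = 20000) -> List[float]:
    """Euler-integrate dx_s/dt = k * (w_s - x_s * sum of p_l over l in L(s))."""
    x = [0.1] * len(routes)
    for _ in range(steps):
        q = path_prices(x, routes, capacity)
        x = [xs + dt * k * (w - xs * qs) for xs, w, qs in zip(x, weights, q)]
    return x


if __name__ == "__main__":
    routes = [[0], [0]]                  # two sources sharing one link
    weights = [1.0, 2.0]
    x = primal_rates(routes, weights, capacity=[1.0])
    q = path_prices(x, routes, [1.0])
    print(x, [xs * qs for xs, qs in zip(x, q)])
```

At the fixed point each product of rate and path price settles at the source's weight, matching the stated result, and the two sources on the shared link end up with rates in the ratio of their weights.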
