Congestion control - UMass Amherst, arun/653/lectures/L6.pdf
Congestion control
Lecture 6 CS 653
Why congestion control
Causes/costs of congestion: scenario 1
two senders, two receivers
one router, infinite buffers
no retransmission
large delays when congested
throughput saturates
[Figure: Host A and Host B each send original data at rate λin through one router with unlimited shared output link buffers; λout is the delivered rate.]
Causes/costs of congestion: scenario 2
one router, finite buffers
sender retransmission of lost packets
[Figure: Host A and Host B share a router with finite shared output link buffers. λin: original data; λ'in: original data plus retransmitted data; λout: delivered rate.]
Causes/costs of congestion: scenario 2
always: λin = λout (goodput)
"perfect" retransmission, only when loss: λ'in > λout
retransmission of delayed (not lost) packets makes λ'in larger (than the perfect case) for the same λout
"costs" of congestion:
  more work (retransmission) for a given "goodput"
  unneeded retransmissions: link carries multiple copies of a packet
[Figure: three plots (a), (b), (c) of λout against λin, each axis marked at R/2; levels R/3 and R/4 are also marked.]
Causes/costs of congestion: scenario 3
four senders
multihop paths
timeout/retransmit
Q: what happens as λin and λ'in increase?
[Figure: Host A and Host B behind routers with finite shared output link buffers. λin: original data; λ'in: original data plus retransmitted data; λout: delivered rate.]
Causes/costs of congestion: scenario 3
Another "cost" of congestion: when a packet is dropped, any upstream transmission capacity used for that packet was wasted.
[Figure: Host A, Host B, λout.]
Two broad approaches towards congestion control
End-end congestion control:
  no explicit feedback from network
  congestion inferred from end-system observed loss, delay
  approach taken by TCP
Network-assisted congestion control:
  routers provide feedback to end hosts
    single bit indicating congestion (SNA, DECbit, ATM, TCP/IP ECN)
    explicit rate at which sender should send
  recent proposals [XCP], [RCP] revisit ATM ideas
TCP congestion control
Components of TCP congestion control
Slow start: multiplicatively increase (double) the window each RTT
Congestion avoidance: additively increase the window (by 1 MSS per RTT)
Loss: multiplicatively decrease (halve) the window
Timeout: set cwnd to 1 MSS; multiplicatively increase (double) the retransmission timeout upon each further consecutive loss
Retransmission timeout estimation
Calculate EstimatedRTT using a moving average
Calculate the deviation w.r.t. the moving average
Timeout = EstimatedRTT + 4·DevRTT
EstimatedRTT_i = (1 - α)·EstimatedRTT_{i-1} + α·SampleRTT_i
DevRTT_i = (1 - β)·DevRTT_{i-1} + β·|SampleRTT_i - EstimatedRTT_{i-1}|
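The estimator above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the class name, the default gains alpha = 0.125 and beta = 0.25, and the initial deviation are assumptions that do not appear on the slide.

```python
class RtoEstimator:
    """Moving-average retransmission timeout estimator (illustrative sketch)."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        # alpha, beta and the initial deviation are assumed values, not from the slide.
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt):
        # The deviation uses the previous EstimatedRTT, as in the slide's formula.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout()

    def timeout(self):
        # Timeout = EstimatedRTT + 4 * DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt


def backed_off_timeout(base_timeout, consecutive_losses):
    # Double the timeout for each further consecutive loss.
    return base_timeout * (2 ** consecutive_losses)
```

The deviation is updated before the average so that it sees EstimatedRTT_{i-1}, matching the index in the formula.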
TCP Throughput
TCP throughput: a very, very simple model
What is the average throughput of TCP as a function of window size W and RTT T?
  Ignore slow start
  Let W be the window size when loss occurs
When the window is W, throughput is W/T
Just after loss, the window drops to W/2 and throughput to W/(2T)
Average throughput: 3W/(4T)
TCP throughput: a very simple model
But what is W when loss occurs?
When the window is w and the queue has q packets, TCP is sending at rate w/(T + q/C)
For maintaining utilization and steady state:
  Just before loss: rate = W/(T + Q/C) = C
  Just after loss: rate = W/(2T) = C
For Q = CT (a common rule of thumb for setting router buffer sizes), both conditions give W = 2CT, so Q = W/2, and a loss occurs every (3/4)W · Q = 3W^2/8 packets
Q = queue capacity in number of packets
C = link capacity in packets/sec
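Deriving TCP throughput/loss relationship
[Figure: TCP window size versus time (in RTTs); a sawtooth between W/2 and W, with one sawtooth marked as a "period".]
packets sent per "period" =
$$\frac{W}{2} + \left(\frac{W}{2}+1\right) + \dots + W = \sum_{n=0}^{W/2}\left(\frac{W}{2}+n\right)$$
$$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$$
$$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \frac{\frac{W}{2}\left(\frac{W}{2}+1\right)}{2}$$
$$= \frac{3}{8}W^2 + \frac{3}{4}W \approx \frac{3}{8}W^2$$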
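Deriving TCP throughput/loss relationship
[Figure: TCP window size versus time (in RTTs); a sawtooth between W/2 and W, with one sawtooth marked as a "period".]
packets sent per "period" ≈ (3/8)W^2
1 packet lost per "period" implies:
$$p_{loss} \approx \frac{8}{3W^2} \quad \text{or} \quad W = \sqrt{\frac{8}{3\,p_{loss}}}$$
$$B = \text{avg\_thruput} = \frac{3}{4}W \ \text{packets/rtt}$$
$$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}} \ \text{packets/rtt}$$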
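As a quick numerical check of the derivation, the sketch below sums the window sizes over one period and evaluates the throughput formula. The function names are illustrative only, and the sum assumes an even W with one window of packets sent per RTT.

```python
import math


def packets_per_period(w):
    # Sum of window sizes W/2, W/2 + 1, ..., W (one window per RTT); w must be even.
    return sum(range(w // 2, w + 1))


def window_at_loss(p_loss):
    # W = sqrt(8 / (3 * p_loss)), from one loss per (3/8) W^2 packets.
    return math.sqrt(8.0 / (3.0 * p_loss))


def avg_throughput_pkts_per_rtt(p_loss):
    # B = (3/4) W, which is roughly 1.22 / sqrt(p_loss).
    return 0.75 * window_at_loss(p_loss)
```

For W = 100 the exact sum is 3825 = (3/8)·100^2 + (3/4)·100, so the (3/4)W term is already a small correction at moderate window sizes.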
Alternate fluid model
Rate of change of sending rate = (term inversely proportional to current rate, with probability 1-p) - (term proportional to current rate, with probability p)
In steady state
TCP throughput: a better loss-rate-based "simple" model [PFTK]
With many flows, loss rate and delay are not affected much by a single TCP flow
  TCP behavior is completely specified by the loss and delay pattern along the path (bounded by bottleneck capacity)
Given loss rate p and delay T, what is TCP's throughput B (packets/sec), taking timeouts into account?
What is PFTK modeling?
Independent loss probability p across rounds
Loss is indicated by triple duplicate ACKs
Bursty loss within a round: if some packet is lost, all following packets in that round are also lost
Timeout if fewer than three duplicate ACKs are received
PFTK empirical validation: low loss
PFTK empirical validation: high loss
Loss-based TCP
Evolution of loss-based TCP
  Tahoe (without fast retransmit)
  Reno (triple duplicate ACKs + fast retransmit)
  NewReno (Reno + better handling of multiple losses)
  SACK (selective acknowledgment), common today
Q: what if loss is not due to congestion?
Delay-based TCP: Vegas
Uses delay as a signal of congestion
Idea: try to keep a small constant number of packets at the bottleneck queue
  Expected = W/BaseRTT
  Actual = W/CurRTT
  Diff = Expected - Actual
  Try to keep Diff between fixed thresholds 1 and 3
More recent: FAST TCP, based on Vegas
Delay-based TCP is not widely used today
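The Vegas rule above can be sketched as a once-per-RTT window update. This is an illustration under stated assumptions: Diff is multiplied by BaseRTT so that the thresholds 1 and 3 count packets held in the queue, and the window moves by one packet per update. Neither detail is spelled out on the slide, and the function name is hypothetical.

```python
def vegas_update(window, base_rtt, cur_rtt, low=1.0, high=3.0):
    """Return the next window size under a Vegas-style rule (sketch)."""
    expected = window / base_rtt   # rate if nothing were queued
    actual = window / cur_rtt      # rate actually achieved
    # Assumed scaling: convert the rate difference into packets queued.
    diff = (expected - actual) * base_rtt
    if diff < low:
        return window + 1          # too few packets queued: speed up
    if diff > high:
        return window - 1          # too many packets queued: slow down
    return window                  # within the target band: hold
```

With a window of 10 and BaseRTT of 100 ms, a measured RTT of 200 ms means about 5 packets are queued, so the window shrinks.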
TCP-Friendliness
Can we try MyFavNew TCP? Well, is it TCP-friendly?
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
To coexist with TCP, it must impose the same long-term load on the network:
  No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
  Not significantly less long-term throughput, or it's not too useful
TCP friendly rate control (TFRC)
Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
  If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
  Otherwise, increase the transmission rate
E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
Q: how to measure/use loss rate and RTT?
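A minimal sketch of the rate-adjustment idea follows. It plugs in the simple 1.22/sqrt(p) model from earlier in the lecture rather than the more detailed model a real TFRC implementation would use; the function names, the fixed increase step, and the cap at the model rate are assumptions made for illustration.

```python
import math


def tcp_model_rate(loss_rate, rtt, mss_bytes=1500):
    # Simple model: B = 1.22 * MSS / (RTT * sqrt(p)), in bytes per second.
    return 1.22 * mss_bytes / (rtt * math.sqrt(loss_rate))


def tfrc_adjust(current_rate, loss_rate, rtt, increase_step):
    """One adjustment step: never exceed what the TCP model allows."""
    target = tcp_model_rate(loss_rate, rtt)
    if current_rate > target:
        return target                       # reduce to the model's rate
    # Assumed policy: additive increase, capped at the model's rate.
    return min(current_rate + increase_step, target)
```

The open question on the slide (how loss rate and RTT are measured) is left outside this sketch: both are taken as given inputs.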
High speed TCP
TCP in high speed networks
Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate (using B = 1.22/sqrt(p) packets per RTT): p = 2·10^-10, or equivalently at most one drop every couple of hours
New versions of TCP for high-speed networks needed
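The numbers in this example can be reproduced with a few lines of arithmetic. The sketch below assumes the 1.22/sqrt(p) packets-per-RTT relationship derived earlier; the function names are illustrative.

```python
def required_window(throughput_bps, rtt_s, segment_bytes):
    # W = throughput * RTT / segment size, in segments.
    return throughput_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(window):
    # Invert W = 1.22 / sqrt(p) to get the loss rate that sustains W.
    return (1.22 / window) ** 2


def seconds_between_drops(throughput_bps, segment_bytes, loss_rate):
    # One drop every 1/p packets, at throughput/segment_size packets per second.
    packets_per_sec = throughput_bps / (segment_bytes * 8)
    return 1.0 / (loss_rate * packets_per_sec)


w = required_window(10e9, 0.1, 1500)          # about 83,333 segments
p = required_loss_rate(w)                     # about 2e-10
gap = seconds_between_drops(10e9, 1500, p)    # on the order of an hour or two
```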
TCP's long recovery delay
More than an hour to recover from a loss or timeout
[Figure: window recovery after a loss; annotations: ~41,000 packets, ~60,000 RTTs, ~100 minutes.]
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use a superlinear window increase
Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
  additive increase, multiplicative decrease
  increments and decrements depend on window size
Scalable TCP (STCP) [T. Kelly]
  multiplicative increase, multiplicative decrease
  W ← W + a per ACK
  W ← W - b·W per window with loss
STCP dynamics
[Figure: STCP dynamics. From 1st PFLDnet Workshop, Tom Kelly.]
Active Queue Management
Router Queue Management
normally, packets are dropped only when the queue overflows: "drop-tail" queueing
[Figure: packets P1 to P6 queued at a router's FCFS scheduler, between router and Internet.]
The case against drop-tail queue management
Large queues in routers are "a bad thing"
  Delay: end-to-end latency is dominated by the length of queues at switches in the network
Allowing queues to overflow is "a bad thing"
  Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  Utilization: connections can synchronize their response to congestion
[Figure: packets P1 to P4 in the FCFS scheduler queue, P5 and P6 arriving.]
Idea: early random packet drop
When the queue length exceeds a threshold, drop packets with a queue-length-dependent probability
  probabilistic packet drop: flows see the same loss rate
  problem: bursty traffic (a burst arrives when the queue is near the threshold) can be over-penalized
[Figure: packets P1 to P6 in the FCFS scheduler queue.]
Random early detection (RED) packet drop
Use an exponential average of queue length to determine when to drop
  avoid overly penalizing short-term bursts
  react to longer-term trends
Tie drop probability to the weighted average queue length
  avoids over-reaction to mild overload conditions
[Figure: average queue length versus time, up to the max queue length, with min and max thresholds; regions: no drop (below min threshold), probabilistic early drop (between thresholds), forced drop (above max threshold).]
Random early detection (RED) packet drop
[Figure: average queue length versus time with min and max thresholds (no drop, probabilistic early drop, forced drop), next to a plot of drop probability versus weighted average queue length: zero below min, rising to maxp at max, then 100% beyond max.]
RED summary: why random drop?
Provide a gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide the same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
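The RED drop rule described on the preceding slides can be sketched as follows. This is a simplified illustration: the class name, the averaging weight, and the linear ramp between the thresholds are assumptions, and refinements found in deployed RED variants are omitted.

```python
import random


class RedQueue:
    """Drop decision based on an exponentially weighted average queue length."""

    def __init__(self, min_th, max_th, max_p, weight=0.002):
        # weight is an assumed default for the exponential average.
        self.min_th = min_th
        self.max_th = max_th
        self.max_p = max_p
        self.weight = weight
        self.avg = 0.0

    def update_average(self, queue_len):
        # The average reacts to long-term trends, not short bursts.
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0                      # no drop
        if self.avg >= self.max_th:
            return 1.0                      # forced drop
        # Probabilistic early drop: ramp linearly from 0 up to max_p.
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def should_drop(self, queue_len, rng=random):
        self.update_average(queue_len)
        return rng.random() < self.drop_probability()
```

Because the decision uses the average rather than the instantaneous queue length, a short burst arriving at a nearly empty queue is unlikely to be penalized.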
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source: Floyd, Jacobson 1994
Router update operation
[Figure: state diagram of a router's update cycle, with these states and transitions:]
  prepare own routing update (time TC)
  receive update from neighbor, process (time TC2)
  wait; receive update from neighbor, process
  <ready>: send update (time Td to arrive at dest), start_timer (uniform Tp +- Tr)
  transitions on timeout or link fail; update
time spent in state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update sent, relative to start of round
By t=100000 all router rounds are of length 120
synchronization, or lack thereof, depends on system parameters
Avoiding synchronization
Choose random timer component Tr large (e.g., several multiples of TC)
Add enough randomization to avoid synchronization
[Figure: router update state diagram: prepare own routing update (time TC); receive update from neighbor, process (time TC2); wait; <ready>: send update (time Td to arrive), start_timer (uniform Tp +- Tr).]
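The randomized timer in the diagram amounts to drawing each update interval uniformly from Tp +- Tr. A minimal sketch, with a hypothetical function name and illustrative parameter values:

```python
import random


def next_update_delay(tp, tr, rng=random):
    # start_timer(uniform Tp +- Tr): a larger tr spreads routers further apart.
    return rng.uniform(tp - tr, tp + tr)


# Example: nominal period of 30 time units, random component of 5.
delay = next_update_delay(30.0, 5.0)
```

Passing a seeded `random.Random` instance as `rng` makes a simulation repeatable.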
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport TCP Nice
What are background transfers?
Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
Examples
  Prefetched traffic on the Web
  File system backup
  Large-scale data distribution services
  Background software updates
  Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
  Self-interference: applications hurt their own performance
  Cross-interference: applications hurt other applications' performance
TCP Nice
Goal: abstraction of free infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP?
TCP does network resource management
  Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal
  Congestion → increased queue lengths → increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm
  1. Early detection: threshold queue length → increase in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)
per-ack operation:
  if (curRTT > minRTT + threshold*(maxRTT - minRTT)) numCong++
per-round operation:
  if (numCong > f*W) W ← W/2
  else ... AIMD congestion control
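The per-ack and per-round operations above can be sketched as a small class. This is an illustration only: the values of threshold and f, the tracking of minRTT and maxRTT from observed samples, and the "+1" stand-in for the regular AIMD branch are assumptions not given on the slide.

```python
class NiceWindow:
    """Early-congestion detection on top of a window-based sender (sketch)."""

    def __init__(self, window=2.0, threshold=0.2, fraction=0.5):
        # threshold and fraction (f on the slide) are assumed values.
        self.window = window
        self.threshold = threshold
        self.fraction = fraction
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        # Assumed: minRTT and maxRTT are the extremes seen so far.
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        if self.num_cong > self.fraction * self.window:
            self.window /= 2          # may fall below 1.0, as Nice allows
        else:
            self.window += 1          # stand-in for regular AIMD behaviour
        self.num_cong = 0
        return self.window
```

A window below 1.0 would mean sending less than one packet per RTT, which is what lets many background flows share a link without crowding out foreground traffic.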
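Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = τ, maxRTT = τ + B/μ
[Figure: bottleneck queue (pkts) of capacity B with early-detection threshold t·B and service rate μ; labels compare the additive (Add) and multiplicative (Mul) phases of Reno and Nice.]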
Network Conditions
[Figure: foreground document latency (sec, log scale from 0.1 to 1e3) versus spare capacity (1 to 100); curves: Reno, Vegas, V0, Nice, Router Prio.]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec, log scale from 0.1 to 1e3) versus number of BG flows (1 to 100); curves: Vegas, V0, Nice, Router Prio, Reno.]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: BG throughput (KB, 0 to 8e4) versus number of BG flows (1 to 100); curves: Router Prio, Vegas, V0, Reno, Nice.]
Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
  Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem:
  How to allocate resources (e.g., bandwidth) to optimize some objective function
  Maybe not possible to obtain exact optimality, but
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
    systematic approach towards protocol design
Model
Network: links l, each of capacity c_l
Sources s: (L(s), U_s(x_s))
  L(s): links used by source s
  U_s(x_s): utility if source rate = x_s
[Figure: two links with capacities c1 and c2 carrying three sources with rates x1, x2, x3, so that x1 + x2 ≤ c1 and x1 + x3 ≤ c2.]
[Figure: U_s(x_s) versus x_s, an example utility function for an elastic application.]
Q: What are possible allocations with, say, unit capacity links?
Optimization Problem
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$
("system" problem)
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
The user view
User can choose amount to pay per unit time, w_s
Would like allocated bandwidth x_s in proportion to w_s:
$$x_s = \frac{w_s}{p_s}$$
p_s could be viewed as charge per unit flow for user s
$$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
(user's utility minus cost: the "user" problem)
The network view
Suppose network knows vector {w_s}, chosen by users
Network wants to maximize logarithmic utility function
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
("network" problem)
Solution existence
There exist prices p_s, source rates x_s, and amount-to-pay-per-unit-time w_s = p_s x_s such that
  {w_s} solves the user problem
  {x_s} solves the network problem
  {x_s} is the unique solution to the system problem
Network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
User problem:
$$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
System problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$
Proportional Fairness
Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s^*},
$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$
Result: if w_r = 1, then {x_s} solves the network problem IFF it is proportionally fair
Similar result exists for the case that w_r is not equal to 1
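The definition can be checked numerically on the two-link example from the model slide. The sketch below assumes unit capacities, with source 0 crossing both links and sources 1 and 2 using one link each; under those assumptions the allocation (1/3, 2/3, 2/3) is the proportionally fair one, and no randomly drawn feasible alternative gives a positive aggregate proportional change. Function names are illustrative.

```python
import random


def aggregate_proportional_change(x, x_other):
    # Sum over sources of (x*_s - x_s) / x_s; at most 0 if x is proportionally fair.
    return sum((xo - xs) / xs for xs, xo in zip(x, x_other))


def random_feasible(rng):
    # Assumed topology: x0 + x1 <= 1 on link 1, x0 + x2 <= 1 on link 2.
    x0 = rng.uniform(0.0, 1.0)
    return [x0, rng.uniform(0.0, 1.0 - x0), rng.uniform(0.0, 1.0 - x0)]


fair = [1 / 3, 2 / 3, 2 / 3]
rng = random.Random(0)
worst = max(aggregate_proportional_change(fair, random_feasible(rng))
            for _ in range(1000))
```

Here `worst` stays at or below zero: any gain for one source is outweighed, in proportional terms, by the losses of the others.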
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
Minimum potential delay fairness
Rates {x_r} are minimum potential delay fair if U_r(x_r) = -w_r/x_r
Interpretation: if w_r is file size, then w_r/x_r is transfer time; the optimization problem is to minimize the sum of transfer delays
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
What is the corresponding utility function?
$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
Solving the network problem
Results so far: existence; a solution exists with given properties
How to compute the solution?
  Ideally: a distributed solution easily embodied in a protocol
  Should reveal insight into existing protocols
Solving the network problem
$$\frac{d}{dt} x_s(t) = k\left(w_s - x_s(t) \sum_{l \in L(s)} p_l(t)\right)$$
where
$$p_l(t) = g_l\left(\sum_{s:\, l \in L(s)} x_s(t)\right)$$
d/dt x_s(t): change in bandwidth allocation at s
k·w_s: linear increase
k·x_s(t)·Σ p_l(t): multiplicative decrease
p_l(t): congestion "signal", a function of the aggregate rate at link l, fed back to s
Solving the network problem
$$\frac{d}{dt} x_s(t) = k\left(w_s - x_s(t) \sum_{l \in L(s)} p_l(t)\right)$$
Results:
  converges to the solution of a relaxation of the network problem
  x_s(t)·Σ p_l(t) converges to w_s
Interpretation: a TCP-like algorithm that iteratively solves for the optimal rate allocation
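To make the dynamics concrete, here is a small Euler-integration sketch of the rate update. The price function g_l(y) = (y/c_l)^beta, the step size, the gain k, and the three-source, two-link topology are all assumptions chosen for illustration; the slides leave g_l unspecified.

```python
def kelly_primal(routes, capacity, w, k=1.0, beta=8, dt=0.01, steps=20000):
    """Integrate dx_s/dt = k * (w_s - x_s * sum of p_l over the links of s)."""
    n_links = len(capacity)
    x = [0.1] * len(routes)
    price = [0.0] * n_links
    for _ in range(steps):
        # Aggregate rate on each link.
        load = [0.0] * n_links
        for s, links in enumerate(routes):
            for l in links:
                load[l] += x[s]
        # Assumed price function: a steep penalty as load approaches capacity.
        price = [(load[l] / capacity[l]) ** beta for l in range(n_links)]
        # Every source updates from its own path price only.
        for s, links in enumerate(routes):
            q = sum(price[l] for l in links)
            x[s] = max(x[s] + dt * k * (w[s] - x[s] * q), 1e-9)
    return x, price


# Source 0 uses both links; sources 1 and 2 use one link each.
routes = [[0, 1], [0], [1]]
rates, prices = kelly_primal(routes, [1.0, 1.0], [1.0, 1.0, 1.0])
```

At the fixed point each source satisfies x_s·Σ p_l = w_s, so with equal weights the two-hop source settles at about half the rate of the one-hop sources. Because the price function only approximates the hard capacity constraint, the rates land near, not exactly on, (1/3, 2/3, 2/3).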
Source Algorithm
$$\dot{x}_r = k_r(x_r)\left(U_r'(x_r) - q_r\right)$$
Source needs only its path price
k_r(·): nonnegative, nondecreasing function
The above algorithm converges to a unique solution for any initial condition
q_r interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price q_r ($/bit), where q_r = Σ p_l
User's strategy: spend w_r ($/sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$
then
$$U(x) = -\frac{1}{Tx}$$
interpretation: minimize (weighted) delay
Is AIMD special?
Consider a window control as follows:
  cwnd += a·cwnd^n when no loss
  cwnd -= b·cwnd^m when loss
  where n < m
Expected change in congestion window
Expected change in rate per unit time
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n
For stability
Motivation
Congestion control: maximize user utility
  Given routing R_li, how to adapt end rate x_i?
Traffic engineering: minimize network congestion
  Given traffic x_i, how to perform routing R_li?
Congestion Control Model
$$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l \quad \text{var. } x$$
aggregate utility: Σ_i U_i(x_i), with source rate x_i and utility U_i(x_i)
capacity constraints: Σ_i R_li x_i ≤ c_l
Users are indexed by i
Congestion control provides fair rate allocation amongst users
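Traffic Engineering Model
$$\min \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l \quad \text{var. } R$$
aggregate cost: Σ_l f(u_l), with link utilization u_l and cost f(u_l)
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
[Figure: cost f(u_l) versus link utilization u_l, with u_l = 1 marked.]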
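The link utilization in the traffic engineering model is a matrix-vector product scaled by capacity. A minimal sketch follows; the quadratic cost is only a placeholder, since the slide does not specify f, and the function names and example values are illustrative.

```python
def link_utilization(routing, rates, capacity):
    # u_l = sum_i R_li * x_i / c_l, with routing[l][i] = R_li.
    return [sum(r_li * x_i for r_li, x_i in zip(row, rates)) / c_l
            for row, c_l in zip(routing, capacity)]


def total_cost(utilization, cost=lambda u: u * u):
    # Aggregate cost sum_l f(u_l); the quadratic f is an assumed placeholder.
    return sum(cost(u) for u in utilization)


# Two links, three users: user 0 crosses both links.
routing = [[1, 1, 0],
           [1, 0, 1]]
u = link_utilization(routing, [0.25, 0.5, 0.25], [1.0, 1.0])
```

In this example u is [0.75, 0.5]: both links stay below u_l = 1, so the capacity constraints of the congestion control model hold as well.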
Model of Internet Reality
Congestion control: max Σ_i U_i(x_i) s.t. Σ_i R_li x_i ≤ c_l (passes x_i to traffic engineering)
Traffic engineering: min Σ_l f(u_l) s.t. u_l = Σ_i R_li x_i / c_l (passes R_li to congestion control)
System Properties
Convergence
Does it achieve some objective?
Benchmark:
$$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad Rx \le c \quad \text{var. } x, R$$
Utility gap between the joint system and benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Why congestion control
Causescosts of congestion scenario 1
two senders two receivers
one router infinite buffers
no retransmission
large delays when congested
throughput staurates
unlimited shared output link buffers
Host A λin original data
Host B
λout
Causescosts of congestion scenario 2
one router finite buffers sender retransmission of lost
packet
finite shared output link buffers
Host A λin original data
Host B
λout
λin original data plus retransmitted data
Causescosts of congestion scenario 2 always (goodput) ldquoperfectrdquo retransmission when only loss
retransmission of delayed (not lost) packet makes larger (than perfect case) for same
λ13in
λ13out = λ13
in λ13out gt
λ13in
λ13out
ldquocostsrdquo of congestion more work (retransmission) for given ldquogoodputrdquo unneeded retransmissions link carries multiple copies of pkt
R2
R2 λin
λ out
b
R2
R2 λin
λ out
a
R2
R2 λin
λ out
c
R4
R3
Causescosts of congestion scenario 3 four senders multihop paths timeoutretransmit
λ13in
Q what happens as and increase λ13
in
finite shared output link buffers
Host A λin original data
Host B
λout
λin original data plus retransmitted data
Causescosts of congestion scenario 3
Another ldquocostrdquo of congestion when packet dropped any ldquoupstream
transmission capacity used for that packet was wasted
Host A
Host B
λout
Two broad approaches towards congestion control
End-end congestion control
no explicit feedback from network
congestion inferred from end-system observed loss delay
approach taken by TCP
Network-assisted congestion control
routers provide feedback to endhosts single bit indicating
congestion (SNA DECbit ATM TCPIP ECN)
explicit rate sender should send
recent proposals [XCP] [RCP] revisit ATM ideas
TCP congestion control
Components of TCP congestion control
Slow start Multiplicatively increase (double) window
Congestion avoidance Additively increase (by 1 MSS) window
Loss Multiplicatively decrease (halve) window
Timeout Set cwnd to 1 MSS Multiplicatively increase (double) retransmission
timeout upon each further consecutive loss
Retransmission timeout estimation
Calculate EstimatedRTT using moving average
Calculate deviation wrt moving average
Timeout = EstimatedRTT + 4DevRTT
EstimatedRTTi = (1- α)EstimatedRTTi-1 + αSampleRTTi
DevRTTi = (1-β)DevRTTi-1 + β|SampleRTTi-EstimatedRTTi-1|
TCP Throughput
TCP throughput A very very simple model
Whatrsquos the average throughout of TCP as a function of window size and RTT T Ignore slow start Let W be the window size when loss occurs
When window is W throughput is WT Just after loss window drops to W2
throughput to W2T Average throughput 3W4T
TCP throughput A very simple model
But what is W when loss occurs
When window is w and queue has q packets TCP is
sending at rate w(T+qC) For maintaining utilization and steady state
Just before loss rate = W(T+QC) = C Just after loss rate = W2T = C For Q = CT (a common thumbrule to set router buffer
sizes) a loss occurs every frac14 (34W)Q = 3W28 packets
Q = queue capacity in number of packets
C = link capacity in packetssec
Deriving TCP throughputloss relationship
TCP window
size
time (rtt)
W2
W
period
sum=
+=++⎟⎠
⎞⎜⎝
⎛ ++2
0)
2(1
22
W
nnWWWW
sum=
+⎟⎠
⎞⎜⎝
⎛ +=2
021
2
W
nnWW
2)12(2
21
2+
+⎟⎠
⎞⎜⎝
⎛ +=WWWW
WW43
83 2 +=
packets sent per ldquoperiodrdquo =
2
83Wasymp
Deriving TCP throughputloss relationship
TCP window
size
time (rtt)
W2
W
period
packets sent per ldquoperiodrdquo 2
83Wasymp
1 packet lost per ldquoperiodrdquo implies ploss 23
8W
asymp or lossp
W38
=
rttpackets
43utavg_thrup WB ==
rttpackets221utavg_thrup
losspB ==
Alternate fluid model
Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p
In steady state
TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]
With many flows loss rate and delay are not affected much by a single TCP flow TCP behavior completely specified by loss
and delay pattern along path (bounded by bottleneck capacity)
Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account
What is PFTK modeling
Independent loss probability p across rounds Loss acute triple duplicate acks Bursty loss in a round if some packet lost
all following packets in that round also lost Timeout if lt three duplicate acks received
PFTK empirical validation Low loss
PFTK empirical validation High loss
Loss-based TCP
Evolution of loss-based TCP Tahoe (without fast retransmit) Reno (triple duplicate acks + fast
retransmit) NewReno (Reno + handling multiple losses
better) SACK (selective acknowledgment) common
today Q what if loss not due to congestion
Delay-based TCP Vegas
Uses delay as a signal of congestion Idea try to keep a small constant number of
packets at bottleneck queue Expected = WBaseRTT Actual = WCurRTT Diff = Expected - Actual Try to keep Diff between fixed 1 and 3
More recent FAST TCP based on Vegas Delay-based TCP not widely used today
TCP-Friendliness
Can we try MyFavNew TCP Well is it TCP-friendly
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP
To co-exist with TCP it must impose the same long-term load on the network No greater long-term throughput as a function of
packet loss and delay so TCP doesnt suffer Not significantly less long-term throughput or its
not too useful
TCP friendly rate control (TFRC)
Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm
If transmission rate is higher than that given by the model reduce the transmission rate to the models rate
Otherwise increase the transmission rate Eg DCCP (Datagram Congestion Control
Protocol) for unreliable congestion control Q how to measureuse loss rate and RTT
High speed TCP
TCP in high speed networks
Example 1500 byte segments 100ms RTT want 10 Gbps throughput
Requires window size W = 83333 in-flight segments Throughput in terms of loss rate
13 p = 210-10 or equivalently at most one drop every couple hours
New versions of TCP for high-speed networks needed
TCPrsquos long recovery delay
More than an hour to recover from a loss or timeout
~41000 packets
~60000 RTTs ~100 minutes
High-speed TCP
Proposals Scalable TCP HSTCP FAST CUBIC General idea is to use superlinear window
increase Particularly useful in high bandwidth-delay
product regimes
Alternate choices of response functions
Scalable TCP - S = 015p
Q Whatever happened to TCP-friendly
High speed TCP [Floyd]
additive increase multiplicative decrease
increments decrements depend on window size
Scalable TCP (STCP) [T Kelly]
multiplicative increase multiplicative decrease
W larr W + a per ACK W larr W ndash b W per window with loss
STCP dynamics
From 1st PFLDnet Workshop Tom Kelly13
Active Queue Management
Router Queue Management
normally packets dropped only when queue overflows ldquodrop-tailrdquo queueing
router Internet
P113P213P313P413P513P613FCFS13
Scheduler13
router
The case against drop-tail queue management
Large queues in routers are ldquoa bad thingrdquo Delay end-to-end latency dominated by length
of queues at switches in network Allowing queues to overflow is ldquoa bad thingrdquo
Fairness connections transmitting at high rates can starve connections transmitting at low rates
Utilization connections can synchronize their response to congestion
P113P213P313P413FCFS
Scheduler P513P613
Idea early random packet drop
When queue length exceeds threshold drop packets with queue length dependent probability probabilistic packet drop flows see same loss
rate problem bursty traffic (burst arrives when
queue is near threshold) can be over penalized
P113P213P313P413P513P613FCFS
Scheduler
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts react to longer term trends
Tie drop prob to weighted avg queue length avoids over-reaction to mild overload conditions
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
Random early detection (RED) packet drop
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
10013
Drop probability
maxp13
Weighted AverageQueue Length
min13 max13
RED summary why random drop
Provide gentle transition from no-drop to all-drop Provide ldquogentlerdquo early warning Avoid synchronized loss bursts among
sources Provide same loss rate to all sessions
With tail-drop low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed hellip
Why randomization important
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source Floyd Jacobson 1994
Router update operation
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail
update
time spent in state depends on msgs
received from others (weak coupling
between routers processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization Choose random
timer component Tr large (eg several multiples of TC)
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough
randomization to avoid
synchronization
Randomization
Takeaway message randomization makes a system simple and
robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP
TCP does network resource management Need flow prioritization
Alternative router prioritization + More responsive simple one bit priority Hard to deploy
Question Can end-to-end congestion control achieve non-
interference and utilization
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal Congestion incr queue lengths incr RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control Receiver and network unchanged TCP friendly
TCP Nice
Basic algorithm 1 Early Detection thresh queue length incr in RTT 2 Multiplicative decrease on early congestion 3 Allow cwnd lt 10 (despite no loss)
per-ack operation if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++
per-round operation if(numCong gt fW) W W2 else hellip AIMD congestion control
Nice the works
Non-interference getting out of the way in time Utilization maintaining a small queue
pkts
minRTT = τ13 maxRTT = τ+Βmicro13
B
tB Add Mul +
micro
Reno
Nice Add Add Add
Mul +
Mul +
Network Conditions
01
1
10
100
1e3
1 10 100 Fore
grou
nd D
ocum
ent L
aten
cy (s
ec)
Spare Capacity
Reno
Vegas
V0
Nice
Router Prio
Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity
Scalability
01
1
10
100
1e3
1 10 100
Doc
umen
t Lat
ency
(sec
)
Num BG flows
Vegas
V0
Nice
Router Prio
Reno
W lt 1 allows Nice to scale to any number of background flows
Utilization
0
2e4
4e4
6e4
8e4
1 10 100
BG
Thr
ough
put (
KB
)
Num BG flows
Router Prio
Vegas
V0
Reno
Nice
Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation
How to model the interaction between TCP and the network Recall PFTK like models assumed network
conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem How to allocate resources (eg bandwidth) to
optimize some objective function Maybe not possible to obtain exact optimality but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
Model
Network: links $l$, each of capacity $c_l$
Sources $s$: $(L(s), U_s(x_s))$
  $L(s)$: links used by source $s$
  $U_s(x_s)$: utility if source rate = $x_s$
[Figure: two links with capacities $c_1$ and $c_2$ carrying rates $x_1$, $x_2$, $x_3$]
$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$
[Figure: $U_s(x_s)$ vs. $x_s$, example utility function for an elastic application]
Q: What are possible allocations with, say, unit-capacity links?
Optimization Problem
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
"system" problem:
$$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$
The user view
User can choose amount to pay per unit time, $w_s$
Would like allocated bandwidth $x_s$ in proportion to $w_s$
user problem:
$$\max \ U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
  first term: user's utility; second term: cost
$p_s$ could be viewed as charge per unit flow for user $s$: $x_s = w_s / p_s$
The network view
Suppose network knows vector $w_s$ chosen by users
Network wants to maximize logarithmic utility function
network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
Solution existence
There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
  $w_s$ solves the user problem
  $x_s$ solves the network problem
  $x_s$ is the unique solution to the system problem
network problem: $\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
user problem: $\max \ U_s(w_s / p_s) - w_s$ subject to $w_s \ge 0$
system problem: $\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$
Proportional Fairness
Vector of rates $x_s$ is proportionally fair if it is feasible and, for any other feasible vector $x_s^*$,
$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$
Result: if $w_r = 1$, then $x_s$ solves the network problem if and only if it is proportionally fair
A similar result exists for the case that $w_r$ is not equal to 1
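As a numerical illustration of the definition, consider the two-link model from the earlier slide ($x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$) with unit capacities. Maximizing $\log x_1 + \log x_2 + \log x_3$ there gives $x = (1/3, 2/3, 2/3)$. The sketch below samples random feasible vectors and checks the proportional-fairness sum against that allocation; the helper names `pf_sum` and `random_feasible` are hypothetical.

```python
import random


def pf_sum(x, y):
    """Sum over sources of (y_s - x_s) / x_s for candidate x and alternative y."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))


def random_feasible(rng, c1=1.0, c2=1.0):
    """Sample rates satisfying y1 + y2 <= c1 and y1 + y3 <= c2."""
    y1 = rng.uniform(0.0, min(c1, c2))
    y2 = rng.uniform(0.0, c1 - y1)
    y3 = rng.uniform(0.0, c2 - y1)
    return (y1, y2, y3)


x_pf = (1 / 3, 2 / 3, 2 / 3)          # proportionally fair allocation
rng = random.Random(0)
worst = max(pf_sum(x_pf, random_feasible(rng)) for _ in range(10_000))

# An equal split is feasible but not proportionally fair: moving to x_pf
# produces a positive aggregate proportional change.
gain_from_equal_split = pf_sum((0.5, 0.5, 0.5), x_pf)
```

For this topology the sum equals $1.5(y_1 + y_2) + 1.5(y_1 + y_3) - 3$, which cannot exceed zero for any feasible $y$, so `worst` stays non-positive up to rounding.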
Max-min Fairness
Rates $x_r$ are max-min fair if, for any other feasible rates $y_r$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$
Minimum potential delay fairness
Rates $x_r$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
Interpretation: if $w_r$ is the file size, then $w_r / x_r$ is the transfer time; the optimization problem is to minimize the sum of transfer delays
Max-min Fairness
Rates $x_r$ are max-min fair if, for any other feasible rates $y_r$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$
What is the corresponding utility function?
$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
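The limit above is one member of a one-parameter family of utilities, and the other fairness notions in this part of the lecture correspond to other values of $\alpha$. A short summary, using the standard $\alpha$-fair form with the weights $w_r$ dropped (an assumption of this sketch, not something stated on the slides):

```latex
U_r^{(\alpha)}(x_r) =
\begin{cases}
  \dfrac{x_r^{1-\alpha}}{1-\alpha}, & \alpha \ge 0,\ \alpha \ne 1, \\[2ex]
  \log x_r, & \alpha = 1,
\end{cases}
\qquad
\begin{array}{ll}
  \alpha = 1: & \text{proportional fairness} \\
  \alpha = 2: & U_r(x_r) = -1/x_r \ \text{(minimum potential delay fairness)} \\
  \alpha \to \infty: & \text{max-min fairness}
\end{array}
```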
Solving the network problem
Results so far: existence, i.e., a solution exists with the given properties
How to compute the solution?
  Ideally: distributed solution easily embodied in a protocol
  Should reveal insight into existing protocol
Solving the network problem
$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$
  left-hand side: change in bandwidth allocation at $s$
  $w_s$ term: linear increase
  $x_s(t) \sum_{l \in L(s)} p_l(t)$ term: multiplicative decrease
where
$$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$
  congestion "signal": function of aggregate rate at link $l$, fed back to $s$
Solving the network problem
$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$
Results
  converges to solution of a relaxation of the network problem
  $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
Interpretation: TCP-like algorithm that iteratively solves the optimal rate allocation
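To see the convergence claim in action, the rate equation can be integrated numerically. The sketch below is illustrative only: the function name `simulate_primal`, the Euler step, the gain `k`, and in particular the price function (zero below capacity, then a steep linear penalty with slope `1/eps`) are all assumptions, since the slides do not specify $g_l$. The topology is the two-link, three-source example with unit capacities and $w_s = 1$.

```python
def simulate_primal(routes, capacity, w, k=0.1, eps=0.01, dt=0.01, steps=20_000):
    """Euler-integrate dx_s/dt = k * (w_s - x_s * sum of prices on s's links)."""
    x = [0.1] * len(routes)
    for _ in range(steps):
        # Aggregate rate carried by each link.
        load = [0.0] * len(capacity)
        for s, links in enumerate(routes):
            for l in links:
                load[l] += x[s]
        # Assumed price function g_l: zero below capacity, steep penalty above.
        price = [max(0.0, load[l] - capacity[l]) / eps for l in range(len(capacity))]
        # Linear increase (w_s) minus multiplicative decrease (x_s * path price).
        x = [
            xs + dt * k * (w[s] - xs * sum(price[l] for l in routes[s]))
            for s, xs in enumerate(x)
        ]
    return x


# Source 0 crosses both links; sources 1 and 2 use one link each.
rates = simulate_primal(routes=[[0, 1], [0], [1]], capacity=[1.0, 1.0], w=[1.0, 1.0, 1.0])
```

With these settings the rates settle close to the proportionally fair point $(1/3, 2/3, 2/3)$, slightly above it because the penalty-style price solves a relaxation of the network problem rather than the hard-constrained one.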
Source Algorithm
$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$
Source needs only its path price
$k_r(\cdot)$: nonnegative, nondecreasing function
Above algorithm converges to unique solution for any initial condition
$q_r$ interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
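One reading that is consistent with the logarithmic utility of the network problem and with the source algorithm, offered as an assumption rather than as the slide's own formulas: take $U_r(x_r) = w_r \log x_r$ and the gain $k_r(x_r) = \kappa x_r$, where $\kappa$ is a hypothetical constant introduced here.

```latex
U_r(x_r) = w_r \log x_r
\;\Longrightarrow\;
U_r'(x_r) = \frac{w_r}{x_r},
\qquad
\dot{x}_r = \kappa\, x_r \left( \frac{w_r}{x_r} - q_r \right)
          = \kappa \left( w_r - x_r q_r \right)
```

Under these assumptions the controller has the same linear-increase, multiplicative-decrease form as the rate equation used to solve the network problem, with $q_r$ playing the role of $\sum_{l \in L(s)} p_l(t)$.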
Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price $q_r$ ($/bit), where $q_r = \sum_l p_l$
User's strategy: spend $w_r$ ($/sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$
then
$$U(x) = -\frac{1}{Tx}$$
interpretation: minimize (weighted) delay
Is AIMD special?
Consider a window control as follows:
  cwnd += a * cwnd^n when no loss
  cwnd -= b * cwnd^m when loss, where n < m
Expected change in congestion window
Expected change in rate per unit time
MIMD (n, m)
Consider the controller
where
Then at equilibrium
Where α = m - n
For stability
Motivation
Congestion control: maximize user utility
  Given routing $R_{li}$, how to adapt end rate $x_i$?
Traffic engineering: minimize network congestion
  Given traffic $x_i$, how to perform routing $R_{li}$?
Congestion Control Model
$$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l, \quad \text{var. } x$$
  objective: aggregate utility
  constraints: capacity constraints
[Figure: utility $U_i(x_i)$ vs. source rate $x_i$]
Users are indexed by $i$
Congestion control provides fair rate allocation amongst users
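Traffic Engineering Model
$$\min \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l, \quad \text{var. } R$$
  objective: aggregate cost
[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]
Links are indexed by $l$
Traffic engineering avoids bottlenecks in the network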
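Model of Internet Reality
Congestion control and traffic engineering interact through $x_i$ and $R_{li}$
Congestion control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
Traffic engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$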
System Properties
Convergence
Does it achieve some objective?
Benchmark: $\max \sum_i U_i(x_i)$ s.t. $Rx \le c$, var. $x$, $R$
Utility gap between the joint system and the benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Components of TCP congestion control
Slow start: multiplicatively increase (double) window each RTT
Congestion avoidance: additively increase window (by 1 MSS) each RTT
Loss: multiplicatively decrease (halve) window
Timeout: set cwnd to 1 MSS; multiplicatively increase (double) the retransmission timeout upon each further consecutive loss
Retransmission timeout estimation
Calculate EstimatedRTT using a moving average
Calculate deviation w.r.t. the moving average
Timeout = EstimatedRTT + 4 * DevRTT
$\text{EstimatedRTT}_i = (1-\alpha)\,\text{EstimatedRTT}_{i-1} + \alpha\,\text{SampleRTT}_i$
$\text{DevRTT}_i = (1-\beta)\,\text{DevRTT}_{i-1} + \beta\,|\text{SampleRTT}_i - \text{EstimatedRTT}_{i-1}|$
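The two moving averages and the timeout rule can be sketched as a small estimator. The class name `RtoEstimator` is hypothetical, the gains `alpha=0.125` and `beta=0.25` are commonly used values assumed here (the slides leave them symbolic), and seeding the estimate with the first sample and a zero deviation is a simplification.

```python
class RtoEstimator:
    """Exponentially weighted RTT estimate, deviation, and timeout."""

    def __init__(self, first_sample: float, alpha: float = 0.125, beta: float = 0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample   # assumed initialisation
        self.dev_rtt = 0.0                  # assumed initialisation

    def update(self, sample_rtt: float) -> float:
        """Fold in one RTT sample and return the new timeout."""
        # The deviation uses the previous estimate, matching EstimatedRTT_{i-1}.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout()

    def timeout(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt
```

Starting from a 100 ms estimate, a single 200 ms sample moves the estimate only to 112.5 ms but raises the deviation to 25 ms, so the timeout grows mainly through the `4 * dev_rtt` safety margin.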
TCP Throughput
TCP throughput: a very, very simple model
What's the average throughput of TCP as a function of window size and RTT T?
Ignore slow start
Let W be the window size when loss occurs
When window is W, throughput is W/T
Just after loss, window drops to W/2, throughput to W/(2T)
Average throughput: 3W/(4T)
TCP throughput: a very simple model
But what is W when loss occurs?
When window is w and queue has q packets, TCP is sending at rate w/(T + q/C)
For maintaining utilization and steady state:
  Just before loss: rate = W/(T + Q/C) = C
  Just after loss: rate = W/(2T) = C
For Q = CT (a common rule of thumb for sizing router buffers), a loss occurs every (3/4 W) * Q = 3W^2/8 packets
Q = queue capacity in number of packets
C = link capacity in packets/sec
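Deriving TCP throughput/loss relationship
[Figure: TCP window size vs. time (rtt); sawtooth between W/2 and W, one "period" per tooth]
packets sent per "period" =
$$\frac{W}{2} + \left(\frac{W}{2} + 1\right) + \dots + W = \sum_{n=0}^{W/2} \left(\frac{W}{2} + n\right)$$
$$= \left(\frac{W}{2} + 1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$$
$$= \left(\frac{W}{2} + 1\right)\frac{W}{2} + \frac{(W/2)(W/2 + 1)}{2}$$
$$= \frac{3}{8}W^2 + \frac{3}{4}W \approx \frac{3}{8}W^2$$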
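Deriving TCP throughput/loss relationship
[Figure: TCP window size vs. time (rtt); sawtooth between W/2 and W, one "period" per tooth]
packets sent per "period" $\approx \frac{3}{8}W^2$
1 packet lost per "period" implies $p_{loss} \approx \frac{8}{3W^2}$, or $W = \sqrt{\frac{8}{3\,p_{loss}}}$
$B = \text{avg\_thruput} = \frac{3}{4}W$ packets/rtt
$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}}$ packets/rtt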
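The arithmetic in this derivation is easy to check numerically. The sketch below, with hypothetical helper names, computes the exact sawtooth sum for an even W and the throughput implied by a given loss rate; the constant 1.22 is just $\sqrt{3/2}$.

```python
import math


def packets_per_period(W: int) -> int:
    """Exact sawtooth sum W/2 + (W/2 + 1) + ... + W, for even W."""
    half = W // 2
    return sum(half + n for n in range(half + 1))


def throughput_pkts_per_rtt(p_loss: float) -> float:
    """B = (3/4) * W with W = sqrt(8 / (3 * p_loss))."""
    return 0.75 * math.sqrt(8.0 / (3.0 * p_loss))


exact = packets_per_period(100)            # (3/8) W^2 + (3/4) W
approx = 3 * 100 ** 2 / 8                  # leading term only
b = throughput_pkts_per_rtt(0.01)          # about 1.22 / sqrt(0.01)
```

For W = 100 the exact count and the $\frac{3}{8}W^2$ approximation differ by the $\frac{3}{4}W$ term, which is 2% here and shrinks as W grows.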
Alternate fluid model
Rate of change of sending rate = (term inversely proportional to current rate, with probability 1 - p) - (term proportional to current rate, with probability p)
In steady state
TCP throughput: a better loss-rate-based "simple" model [PFTK]
With many flows, loss rate and delay are not affected much by a single TCP flow
TCP behavior completely specified by loss and delay pattern along path (bounded by bottleneck capacity)
Given loss rate p and delay T, what is TCP's throughput B in packets/sec, taking timeouts into account?
What is PFTK modeling?
Independent loss probability p across rounds
Loss indication: triple duplicate ACKs
Bursty loss in a round: if some packet is lost, all following packets in that round are also lost
Timeout if fewer than three duplicate ACKs are received
PFTK empirical validation: low loss
PFTK empirical validation: high loss
Loss-based TCP
Evolution of loss-based TCP
  Tahoe (without fast recovery)
  Reno (triple duplicate ACKs + fast retransmit and fast recovery)
  NewReno (Reno + better handling of multiple losses)
  SACK (selective acknowledgment), common today
Q: what if loss is not due to congestion?
Delay-based TCP: Vegas
Uses delay as a signal of congestion
Idea: try to keep a small constant number of packets at the bottleneck queue
  Expected = W/BaseRTT
  Actual = W/CurRTT
  Diff = Expected - Actual
  Try to keep Diff between fixed thresholds 1 and 3
More recent FAST TCP is based on Vegas
Delay-based TCP is not widely used today
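A minimal sketch of the Vegas adjustment follows. The function name `vegas_adjust` is hypothetical, and two details are assumptions not stated on the slide: the rate difference is multiplied by BaseRTT so that it is measured in packets (which is what makes thresholds of 1 and 3 meaningful), and the window moves by one segment per adjustment.

```python
def vegas_adjust(cwnd: float, base_rtt: float, cur_rtt: float,
                 low: float = 1.0, high: float = 3.0) -> float:
    """Return the next window given the smallest and the current RTT."""
    expected = cwnd / base_rtt              # rate with an empty queue
    actual = cwnd / cur_rtt                 # rate actually achieved
    diff = (expected - actual) * base_rtt   # estimated packets held in the queue
    if diff < low:
        return cwnd + 1                     # too little queued: speed up
    if diff > high:
        return cwnd - 1                     # too much queued: back off
    return cwnd                             # within the target band
```

For example, with a 10-segment window and a 100 ms base RTT, a current RTT of 200 ms implies about 5 queued packets, so the window is reduced, while an RTT equal to the base RTT implies an empty queue and the window grows.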
TCP-Friendliness
Can we try MyFavNew TCP? Well, is it TCP-friendly?
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
To coexist with TCP, it must impose the same long-term load on the network
  No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
  Not significantly less long-term throughput, or it's not too useful
TCP-friendly rate control (TFRC)
Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
  If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
  Otherwise, increase the transmission rate
E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
Q: how to measure/use loss rate and RTT?
High speed TCP
TCP in high speed networks
Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate (the 1.22/sqrt(p) packets-per-RTT model) requires p = 2 * 10^-10, or equivalently at most one drop every couple of hours
New versions of TCP for high-speed networks needed
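The numbers in this example follow from the simple throughput model. The sketch below reproduces them; the helper names are hypothetical, and it assumes the in-flight window W is the average window, so that $W = 1.22/\sqrt{p}$ and $p = 1.5/W^2$.

```python
def required_window(rate_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """Segments that must be in flight to sustain rate_bps over one RTT."""
    return rate_bps * rtt_s / (segment_bytes * 8)


def tolerable_loss(avg_window: float) -> float:
    """Invert B = sqrt(3/2) / sqrt(p) packets per RTT for the loss rate p."""
    return 1.5 / (avg_window ** 2)


W = required_window(10e9, 0.100, 1500)       # about 83,333 segments
p = tolerable_loss(W)                        # about 2e-10
packets_per_second = W / 0.100
seconds_between_drops = (1 / p) / packets_per_second
```

Under these assumptions one loss is tolerated roughly every 4.6 billion packets, which at this sending rate is on the order of an hour and a half between drops.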
TCP's long recovery delay
More than an hour to recover from a loss or timeout
[Figure: window recovery after a loss; labels: ~41,000 packets, ~60,000 RTTs, ~100 minutes]
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use superlinear window increase
Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
  Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
  additive increase, multiplicative decrease
  increments and decrements depend on window size
Scalable TCP (STCP) [T. Kelly]
  multiplicative increase, multiplicative decrease
  W ← W + a per ACK
  W ← W - b * W per window with loss
STCP dynamics
[Figure: STCP window dynamics, from the 1st PFLDnet Workshop, Tom Kelly]
Active Queue Management
Router Queue Management
normally, packets are dropped only when the queue overflows: "drop-tail" queueing
[Figure: router with an FCFS scheduler serving queued packets P1 to P6, between a router and the Internet]
The case against drop-tail queue management
Large queues in routers are "a bad thing"
  Delay: end-to-end latency dominated by length of queues at switches in network
Allowing queues to overflow is "a bad thing"
  Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  Utilization: connections can synchronize their response to congestion
[Figure: FCFS scheduler queue holding P1 to P4, with P5 and P6 arriving]
Idea: early random packet drop
When queue length exceeds a threshold, drop packets with queue-length-dependent probability
  probabilistic packet drop: flows see same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
[Figure: FCFS scheduler queue holding P1 to P6]
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop
  avoids overly penalizing short-term bursts
  reacts to longer-term trends
Tie drop probability to weighted average queue length
  avoids over-reaction to mild overload conditions
[Figure: average queue length over time against min threshold, max threshold, and max queue length; regions: no drop below min threshold, probabilistic early drop between the thresholds, forced drop above max threshold]
[Figure: drop probability vs. weighted average queue length; zero below min, rising to max_p at max, then 100%]
RED summary: why random drop?
Provide gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed …
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-to-end Internet traffic
source: Floyd, Jacobson 1994
Router update operation
[State diagram: prepare own routing update (time T_C); receive update from neighbor and process it (time T_C2); <ready> send update (time T_d to arrive at dest); start_timer (uniform T_p ± T_r); wait; on timeout or link fail, update again]
time spent in a state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update is sent, relative to start of round
By t = 100,000, all router rounds are of length 120
synchronization, or lack thereof, depends on system parameters
Avoiding synchronization
Choose random timer component T_r large (e.g., several multiples of T_C)
[State diagram: prepare own routing update (time T_C); receive update from neighbor and process it (time T_C2); <ready> send update (time T_d to arrive); start_timer (uniform T_p ± T_r); wait]
Add enough randomization to avoid synchronization
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport: TCP Nice
What are background transfers?
Data that humans are not waiting for
Non-deadline-critical
Unlimited demand
Examples
  Prefetched traffic on the Web
  File system backup
  Large-scale data distribution services
  Background software updates
  Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
  Self-interference
    • applications hurt their own performance
  Cross-interference
    • applications hurt other applications' performance
TCP Nice
Goal: abstraction of free infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP?
TCP does network resource management
Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal
  Congestion → increased queue lengths → increased RTT
Aggressive responsiveness to congestion
Causescosts of congestion scenario 2
one router finite buffers sender retransmission of lost
packet
finite shared output link buffers
Host A λin original data
Host B
λout
λin original data plus retransmitted data
Causescosts of congestion scenario 2 always (goodput) ldquoperfectrdquo retransmission when only loss
retransmission of delayed (not lost) packet makes larger (than perfect case) for same
λ13in
λ13out = λ13
in λ13out gt
λ13in
λ13out
ldquocostsrdquo of congestion more work (retransmission) for given ldquogoodputrdquo unneeded retransmissions link carries multiple copies of pkt
R2
R2 λin
λ out
b
R2
R2 λin
λ out
a
R2
R2 λin
λ out
c
R4
R3
Causescosts of congestion scenario 3 four senders multihop paths timeoutretransmit
λ13in
Q what happens as and increase λ13
in
finite shared output link buffers
Host A λin original data
Host B
λout
λin original data plus retransmitted data
Causescosts of congestion scenario 3
Another ldquocostrdquo of congestion when packet dropped any ldquoupstream
transmission capacity used for that packet was wasted
Host A
Host B
λout
Two broad approaches towards congestion control
End-end congestion control
no explicit feedback from network
congestion inferred from end-system observed loss delay
approach taken by TCP
Network-assisted congestion control
routers provide feedback to endhosts single bit indicating
congestion (SNA DECbit ATM TCPIP ECN)
explicit rate sender should send
recent proposals [XCP] [RCP] revisit ATM ideas
TCP congestion control
Components of TCP congestion control
Slow start Multiplicatively increase (double) window
Congestion avoidance Additively increase (by 1 MSS) window
Loss Multiplicatively decrease (halve) window
Timeout Set cwnd to 1 MSS Multiplicatively increase (double) retransmission
timeout upon each further consecutive loss
Retransmission timeout estimation
Calculate EstimatedRTT using moving average
Calculate deviation wrt moving average
Timeout = EstimatedRTT + 4DevRTT
EstimatedRTTi = (1- α)EstimatedRTTi-1 + αSampleRTTi
DevRTTi = (1-β)DevRTTi-1 + β|SampleRTTi-EstimatedRTTi-1|
TCP Throughput
TCP throughput A very very simple model
Whatrsquos the average throughout of TCP as a function of window size and RTT T Ignore slow start Let W be the window size when loss occurs
When window is W throughput is WT Just after loss window drops to W2
throughput to W2T Average throughput 3W4T
TCP throughput A very simple model
But what is W when loss occurs
When window is w and queue has q packets TCP is
sending at rate w(T+qC) For maintaining utilization and steady state
Just before loss rate = W(T+QC) = C Just after loss rate = W2T = C For Q = CT (a common thumbrule to set router buffer
sizes) a loss occurs every frac14 (34W)Q = 3W28 packets
Q = queue capacity in number of packets
C = link capacity in packetssec
Deriving TCP throughputloss relationship
TCP window
size
time (rtt)
W2
W
period
sum=
+=++⎟⎠
⎞⎜⎝
⎛ ++2
0)
2(1
22
W
nnWWWW
sum=
+⎟⎠
⎞⎜⎝
⎛ +=2
021
2
W
nnWW
2)12(2
21
2+
+⎟⎠
⎞⎜⎝
⎛ +=WWWW
WW43
83 2 +=
packets sent per ldquoperiodrdquo =
2
83Wasymp
Deriving TCP throughputloss relationship
TCP window
size
time (rtt)
W2
W
period
packets sent per ldquoperiodrdquo 2
83Wasymp
1 packet lost per ldquoperiodrdquo implies ploss 23
8W
asymp or lossp
W38
=
rttpackets
43utavg_thrup WB ==
rttpackets221utavg_thrup
losspB ==
Alternate fluid model
Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p
In steady state
TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]
With many flows loss rate and delay are not affected much by a single TCP flow TCP behavior completely specified by loss
and delay pattern along path (bounded by bottleneck capacity)
Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account
What is PFTK modeling
Independent loss probability p across rounds Loss acute triple duplicate acks Bursty loss in a round if some packet lost
all following packets in that round also lost Timeout if lt three duplicate acks received
PFTK empirical validation Low loss
PFTK empirical validation High loss
Loss-based TCP
Evolution of loss-based TCP Tahoe (without fast retransmit) Reno (triple duplicate acks + fast
retransmit) NewReno (Reno + handling multiple losses
better) SACK (selective acknowledgment) common
today Q what if loss not due to congestion
Delay-based TCP Vegas
Uses delay as a signal of congestion Idea try to keep a small constant number of
packets at bottleneck queue Expected = WBaseRTT Actual = WCurRTT Diff = Expected - Actual Try to keep Diff between fixed 1 and 3
More recent FAST TCP based on Vegas Delay-based TCP not widely used today
TCP-Friendliness
Can we try MyFavNew TCP Well is it TCP-friendly
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP
To co-exist with TCP it must impose the same long-term load on the network No greater long-term throughput as a function of
packet loss and delay so TCP doesnt suffer Not significantly less long-term throughput or its
not too useful
TCP friendly rate control (TFRC)
Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm
If transmission rate is higher than that given by the model reduce the transmission rate to the models rate
Otherwise increase the transmission rate Eg DCCP (Datagram Congestion Control
Protocol) for unreliable congestion control Q how to measureuse loss rate and RTT
High speed TCP
TCP in high speed networks
Example 1500 byte segments 100ms RTT want 10 Gbps throughput
Requires window size W = 83333 in-flight segments Throughput in terms of loss rate
13 p = 210-10 or equivalently at most one drop every couple hours
New versions of TCP for high-speed networks needed
TCPrsquos long recovery delay
More than an hour to recover from a loss or timeout
~41000 packets
~60000 RTTs ~100 minutes
High-speed TCP
Proposals Scalable TCP HSTCP FAST CUBIC General idea is to use superlinear window
increase Particularly useful in high bandwidth-delay
product regimes
Alternate choices of response functions
Scalable TCP - S = 015p
Q Whatever happened to TCP-friendly
High speed TCP [Floyd]
additive increase multiplicative decrease
increments decrements depend on window size
Scalable TCP (STCP) [T Kelly]
multiplicative increase multiplicative decrease
W larr W + a per ACK W larr W ndash b W per window with loss
STCP dynamics
From 1st PFLDnet Workshop Tom Kelly13
Active Queue Management
Router Queue Management
normally packets dropped only when queue overflows ldquodrop-tailrdquo queueing
router Internet
P113P213P313P413P513P613FCFS13
Scheduler13
router
The case against drop-tail queue management
Large queues in routers are ldquoa bad thingrdquo Delay end-to-end latency dominated by length
of queues at switches in network Allowing queues to overflow is ldquoa bad thingrdquo
Fairness connections transmitting at high rates can starve connections transmitting at low rates
Utilization connections can synchronize their response to congestion
P113P213P313P413FCFS
Scheduler P513P613
Idea early random packet drop
When queue length exceeds threshold drop packets with queue length dependent probability probabilistic packet drop flows see same loss
rate problem bursty traffic (burst arrives when
queue is near threshold) can be over penalized
P113P213P313P413P513P613FCFS
Scheduler
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts react to longer term trends
Tie drop prob to weighted avg queue length avoids over-reaction to mild overload conditions
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
Random early detection (RED) packet drop
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
10013
Drop probability
maxp13
Weighted AverageQueue Length
min13 max13
RED summary why random drop
Provide gentle transition from no-drop to all-drop Provide ldquogentlerdquo early warning Avoid synchronized loss bursts among
sources Provide same loss rate to all sessions
With tail-drop low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed …
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
[State diagram]
prepare own routing update (time TC)
receive update from neighbor: process (time TC2)
wait
receive update from neighbor: process
<ready>: send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail: update
time spent in a state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update sent, relative to start of round
By t=100000, all router rounds are of length 120
Synchronization, or lack thereof, depends on system parameters
Avoiding synchronization
Choose random timer component Tr large (e.g., several multiples of TC)
[State diagram repeated: prepare own routing update (time TC); receive update from neighbor: process (time TC2); wait; <ready>: send update (time Td to arrive); start_timer (uniform Tp +- Tr)]
Add enough randomization to avoid synchronization
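The randomized timer step, start_timer (uniform Tp +- Tr), might look like the following sketch. The function name `next_update_delay` and the numeric values are hypothetical and only serve the illustration.

```python
import random


def next_update_delay(tp, tr, rng=random):
    """Draw the next routing-update timer uniformly from [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)


# Hypothetical values: a 120-unit period with a 10-unit random component.
rng = random.Random(1)
delays = [next_update_delay(120.0, 10.0, rng) for _ in range(5)]
```

Each router draws its own delay, so two routers that happen to fire together in one round are unlikely to fire together again in the next; a larger `tr` breaks up clusters faster.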
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport: TCP Nice
What are background transfers?
Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
Examples
  Prefetched traffic on the Web
  File system backup
  Large-scale data distribution services
  Background software updates
  Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
  Self-interference: applications hurt their own performance
  Cross-interference: applications hurt other applications’ performance
TCP Nice
Goal: abstraction of free infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP?
TCP does network resource management
  Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  − Hard to deploy
Question: can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
  Uses increasing RTT as congestion signal
  Congestion → increased queue lengths → increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm
1. Early detection: threshold queue length → increase in RTT
2. Multiplicative decrease on early congestion
3. Allow cwnd < 1.0 (despite no loss)
per-ack operation: if (curRTT > minRTT + threshold*(maxRTT − minRTT)) numCong++
per-round operation: if (numCong > f*W) W ← W/2, else … AIMD congestion control
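The per-ack and per-round operations can be sketched as below. This is an illustration under assumptions, not the TCP Nice implementation: the class name `NiceWindow`, the default values of `threshold` and `fraction`, and the plain +1 additive increase in the else branch are all chosen for the example, and loss handling is omitted.

```python
class NiceWindow:
    """Sketch of the Nice early-congestion test and window halving."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold      # position between minRTT and maxRTT
        self.fraction = fraction        # f: fraction of the window that must look congested
        self.cwnd = cwnd                # W, allowed to fall below 1
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack operation: count acks whose RTT signals a building queue."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round operation: halve on early congestion, else grow additively."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0            # multiplicative decrease, no lower bound of 1
        else:
            self.cwnd += 1.0            # assumed additive increase (loss handling omitted)
        self.num_cong = 0
        return self.cwnd
```

A window below one packet would be realized by sending one packet every several RTTs; the sketch only tracks the window value.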
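Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = τ, maxRTT = τ + B/μ
[Figure: queue length (pkts) vs. time for Reno and Nice, with buffer size B, threshold level t·B, service rate μ, and the additive-increase (Add) and multiplicative-decrease (Mul) phases marked]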
Network Conditions
[Figure: foreground document latency (sec) vs. spare capacity, log-scale axes; curves for Reno, Vegas, V0, Nice, and Router Prio]
Nice causes low interference to foreground Web traffic even when there isn’t much spare capacity
Scalability
[Figure: document latency (sec) vs. number of BG flows, log-scale axes; curves for Vegas, V0, Nice, Router Prio, and Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: BG throughput (KB) vs. number of BG flows; curves for Router Prio, Vegas, V0, Reno, and Nice]
Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
  Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function?
  Maybe not possible to obtain exact optimality, but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
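Model
Network: links l, each of capacity $c_l$
Sources s: $(L(s), U_s(x_s))$
  $L(s)$: links used by source s
  $U_s(x_s)$: utility if source rate = $x_s$
[Figure: two links with capacities $c_1$ and $c_2$; source 1 crosses both links, sources 2 and 3 cross one link each]
$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$
[Figure: $U_s(x_s)$ vs. $x_s$, example utility function for elastic application]
Q: What are possible allocations with, say, unit-capacity links?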
Optimization Problem
maximize system utility (note: all sources “equal”)
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
“system” problem:
$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$
The user view
User can choose amount to pay per unit time, $w_s$
Would like allocated bandwidth $x_s$ in proportion to $w_s$
user problem:
$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$ subject to $w_s \ge 0$
  first term: user’s utility; second term: cost
$p_s$ could be viewed as charge per unit flow for user s: $x_s = \frac{w_s}{p_s}$
The network view
Suppose network knows vector $\{w_s\}$ chosen by users
Network wants to maximize logarithmic utility function
network problem:
$\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
Solution existence
There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
  $\{w_s\}$ solves the user problem
  $\{x_s\}$ solves the network problem
  $\{x_s\}$ is the unique solution to the system problem
network problem: $\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
user problem: $\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$ subject to $w_s \ge 0$
system problem: $\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$
Proportional Fairness
Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,
$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$
Result: if $w_r = 1$, then $\{x_s\}$ solves the network problem iff it is proportionally fair
A similar result exists for the case that $w_r \ne 1$
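The definition can be checked numerically on the two-link example from the model slide (source 1 on both links, sources 2 and 3 on one link each), assuming unit capacities. The candidate allocation (1/3, 2/3, 2/3) and the helper names are assumptions introduced for this sketch; random sampling only illustrates the inequality, it does not prove it.

```python
import random


def proportional_fairness_gap(x, y):
    """Sum over sources of (y_s - x_s) / x_s; at most 0 for every feasible y
    when x is proportionally fair."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))


def random_feasible(rng):
    """Random rates obeying x1 + x2 <= 1 and x1 + x3 <= 1 (unit capacities)."""
    y1 = rng.uniform(0.0, 1.0)
    y2 = rng.uniform(0.0, 1.0 - y1)
    y3 = rng.uniform(0.0, 1.0 - y1)
    return (y1, y2, y3)


candidate = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)
rng = random.Random(0)
worst_gap = max(
    proportional_fairness_gap(candidate, random_feasible(rng))
    for _ in range(10000)
)
# The equal split fails the test: moving to `candidate` gives a positive gap.
equal_split_gap = proportional_fairness_gap((0.5, 0.5, 0.5), candidate)
```

For this candidate the gap equals 1.5(y1 + y2) + 1.5(y1 + y3) − 3, which the two capacity constraints keep at or below zero, so `worst_gap` never becomes positive.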
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
Minimum potential delay fairness
Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; the optimization problem is to minimize the sum of transfer delays
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
What is the corresponding utility function?
$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
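One way to compute a max-min fair allocation directly is progressive filling, a standard technique that the slides do not mention and that is swapped in here for illustration: raise all rates together and freeze the sources on any link that fills up. The function name, the dictionary-based inputs, and the link labels "a" and "b" are hypothetical; every source is assumed to use at least one link.

```python
def max_min_fair(routes, capacity):
    """Progressive filling.

    routes:   {source: set of links it uses} (each set non-empty)
    capacity: {link: capacity}
    """
    rate = {s: 0.0 for s in routes}
    active = set(routes)                 # sources whose rate can still grow
    remaining = dict(capacity)           # unused capacity per link
    while active:
        # Number of still-growing sources crossing each link.
        load = {l: sum(1 for s in active if l in routes[s]) for l in remaining}
        # Largest equal increment before some link saturates.
        step = min(remaining[l] / n for l, n in load.items() if n > 0)
        for s in active:
            rate[s] += step
        for l, n in load.items():
            remaining[l] -= step * n
        saturated = {l for l, n in load.items() if n > 0 and remaining[l] <= 1e-12}
        # Sources crossing a saturated link are frozen at their current rate.
        active = {s for s in active if not (routes[s] & saturated)}
    return rate


routes = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}}
unit = max_min_fair(routes, {"a": 1.0, "b": 1.0})
uneven = max_min_fair(routes, {"a": 1.0, "b": 2.0})
```

With unit capacities every source ends at 1/2; when link "b" has capacity 2, source 3 keeps growing after link "a" saturates and ends at 3/2.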
Solving the network problem
Results so far: existence; a solution exists with the given properties
How to compute the solution?
  Ideally: a distributed solution, easily embodied in a protocol
  Should reveal insight into existing protocols
Solving the network problem
$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$
where $p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$
  left-hand side: change in bandwidth allocation at s
  $k\,w_s$: linear increase
  $k\,x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease
  $p_l(t)$: congestion “signal”, a function of the aggregate rate at link l, fed back to s
Solving the network problem
$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$
Results
  converges to the solution of a relaxation of the network problem
  $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
Interpretation: TCP-like algorithm to iteratively solve the optimal rate allocation
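A small Euler simulation can illustrate the convergence claim. Everything beyond the differential equation itself is an assumption made for this sketch: the price function g_l(y) = (y / c_l)^beta, the gain k, the step size, the starting rates, and the two-link topology with unit capacities borrowed from the model slide.

```python
def link_prices(x, routes, capacity, beta):
    """Assumed price function: p_l = (aggregate rate at l / capacity of l) ** beta."""
    prices = {}
    for l, c in capacity.items():
        load = sum(x[s] for s in routes if l in routes[s])
        prices[l] = (load / c) ** beta
    return prices


def primal_algorithm(routes, capacity, w, k=1.0, dt=0.01, steps=5000, beta=8):
    """Euler integration of dx_s/dt = k * (w_s - x_s * sum of p_l over L(s))."""
    x = {s: 0.1 for s in routes}         # arbitrary positive starting rates
    for _ in range(steps):
        p = link_prices(x, routes, capacity, beta)
        x = {
            s: x[s] + dt * k * (w[s] - x[s] * sum(p[l] for l in routes[s]))
            for s in routes
        }
    return x


routes = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}}
capacity = {"a": 1.0, "b": 1.0}
w = {1: 1.0, 2: 1.0, 3: 1.0}
x = primal_algorithm(routes, capacity, w)
p = link_prices(x, routes, capacity, beta=8)
# Rate times path price for each source; should settle at w_s.
spend = {s: x[s] * sum(p[l] for l in routes[s]) for s in routes}
```

With this price function the rates settle close to, but not exactly at, the proportionally fair point (1/3, 2/3, 2/3), which is what "solution of a relaxation" means here: the capacity constraint is enforced through a steep penalty rather than as a hard limit.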
Source Algorithm
$\dot{x}_r = k_r(x_r)\,\big(U_r'(x_r) - q_r\big)$
Source needs only its path price $q_r$
$k_r(\cdot)$: nonnegative, nondecreasing function
Above algorithm converges to the unique solution for any initial condition
$q_r$ interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price qr ($/bit), where qr = Σ pl
User’s strategy: spend wr ($/sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$
then
$U(x) = -\frac{1}{T x}$
interpretation: minimize (weighted) delay
Is AIMD special?
Consider a window control as follows:
  cwnd += a·cwnd^n when no loss
  cwnd −= b·cwnd^m when loss
  where n < m
Expected change in congestion window
Expected change in rate per unit time
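As a rough sketch of the expected change in window, assume each update sees a loss independently with probability p (this per-update loss model and the function names are assumptions). The expected change is then (1 − p)·a·cwnd^n − p·b·cwnd^m, and setting it to zero gives an equilibrium window.

```python
def expected_window_change(w, p, a, b, n, m):
    """Expected change per update: increase with prob. 1 - p, decrease with prob. p."""
    return (1.0 - p) * a * w ** n - p * b * w ** m


def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero; requires n < m."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))


# AIMD as a special case: +1/cwnd per update (n = -1), -cwnd/2 on loss (m = 1).
w_aimd = equilibrium_window(p=0.02, a=1.0, b=0.5, n=-1, m=1)
```

For the AIMD parameters the equilibrium is sqrt(2(1 − p)/p), the same square-root dependence on loss that appears in the simplified TCP-Reno model.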
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m − n
For stability
Motivation
Congestion Control: maximize user utility
  Given routing Rli, how to adapt end rate xi?
Traffic Engineering: minimize network congestion
  Given traffic xi, how to perform routing Rli?
Congestion Control Model
$\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$, var. $x$
  objective: aggregate utility
  $x_i$: source rate
  $U_i(x_i)$: utility
  constraints: capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
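$\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$, var. $R$
  objective: aggregate cost
  $u_l$: link utilization
  $f(u_l)$: cost
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
[Figure: cost f(ul) vs. link utilization ul, with ul = 1 marked]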
Model of Internet Reality
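Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$
[Figure: loop between the two problems, with arrows labeled xi and Rli]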
System Properties
Convergence
Does it achieve some objective?
Benchmark: $\max \sum_i U_i(x_i)$ s.t. $Rx \le c$, var. $x, R$
Utility gap between the joint system and benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
TCP congestion control
Components of TCP congestion control
Slow start: multiplicatively increase (double) window each RTT
Congestion avoidance: additively increase (by 1 MSS) window each RTT
Loss: multiplicatively decrease (halve) window
Timeout: set cwnd to 1 MSS; multiplicatively increase (double) retransmission timeout upon each further consecutive loss
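The four components can be put together in a per-RTT sketch. The slow-start threshold `ssthresh` is the standard TCP variable that separates the two increase phases but is not named on the slide, so it is an assumption here, as are the class name, the floor of 2 on `ssthresh`, and the window being counted in MSS units.

```python
class RenoLikeSender:
    """Per-RTT view of the window rules; window is in units of MSS."""

    def __init__(self, ssthresh=64.0):
        self.cwnd = 1.0
        self.ssthresh = ssthresh        # assumed switch point between phases
        self.rto_backoff = 1            # multiplier applied to the base timeout

    def on_rtt_without_loss(self):
        if self.cwnd < self.ssthresh:
            # Slow start: double the window, up to the threshold.
            self.cwnd = min(2.0 * self.cwnd, self.ssthresh)
        else:
            # Congestion avoidance: add one MSS.
            self.cwnd += 1.0
        self.rto_backoff = 1            # progress resets the timeout backoff

    def on_loss(self):
        """Loss detected without a timeout: halve the window."""
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = self.ssthresh

    def on_timeout(self):
        """Timeout: restart from 1 MSS and double the retransmission timeout."""
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = 1.0
        self.rto_backoff *= 2
```

For example, starting with `ssthresh=8`, three loss-free RTTs take the window through 2, 4, 8, and a fourth moves it to 9.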
Retransmission timeout estimation
Calculate EstimatedRTT using moving average
Calculate deviation w.r.t. moving average
Timeout = EstimatedRTT + 4·DevRTT
$\text{EstimatedRTT}_i = (1-\alpha)\,\text{EstimatedRTT}_{i-1} + \alpha\,\text{SampleRTT}_i$
$\text{DevRTT}_i = (1-\beta)\,\text{DevRTT}_{i-1} + \beta\,|\text{SampleRTT}_i - \text{EstimatedRTT}_{i-1}|$
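The two moving averages translate directly into code. The slide does not give values for α and β or say how to initialize the estimator; the sketch below assumes α = 1/8, β = 1/4, and an initial deviation of half the first sample, which are the commonly recommended choices, and the class name is hypothetical.

```python
class RttEstimator:
    """Moving-average RTT and deviation, giving Timeout = Est + 4 * Dev."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha                    # assumed gain for EstimatedRTT
        self.beta = beta                      # assumed gain for DevRTT
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2.0     # assumed initial deviation

    def update(self, sample_rtt):
        """Fold in one RTT sample and return the new timeout."""
        # The deviation uses the previous estimate, as in the DevRTT formula.
        error = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1.0 - self.beta) * self.dev_rtt + self.beta * error
        self.estimated_rtt = (
            (1.0 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        )
        return self.timeout()

    def timeout(self):
        return self.estimated_rtt + 4.0 * self.dev_rtt
```

Starting from a 100 ms sample, a single 200 ms sample moves the estimate only to 112.5 ms but raises the timeout from 300 ms to 362.5 ms, because the deviation term reacts to the jump.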
TCP Throughput
TCP throughput: a very, very simple model
What’s the average throughput of TCP as a function of window size and RTT, T?
  Ignore slow start
  Let W be the window size when loss occurs
When window is W, throughput is W/T
Just after loss, window drops to W/2, throughput to W/(2T)
Average throughput: 3W/(4T)
TCP throughput: a very simple model
But what is W when loss occurs?
When window is w and queue has q packets, TCP is sending at rate w/(T + q/C)
For maintaining utilization and steady state
  Just before loss, rate = W/(T + Q/C) = C
  Just after loss, rate = W/(2T) = C
For Q = CT (a common rule of thumb to set router buffer sizes), a loss occurs every (3/4)W·Q = 3W²/8 packets
Q = queue capacity in number of packets
C = link capacity in packets/sec
Deriving TCP throughputloss relationship
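[Figure: TCP window size vs. time (RTTs): a sawtooth rising from W/2 to W over each “period”]
packets sent per “period”:
$\frac{W}{2} + \left(\frac{W}{2}+1\right) + \dots + W = \sum_{n=0}^{W/2} \left(\frac{W}{2} + n\right)$
$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$
$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \frac{\frac{W}{2}\left(\frac{W}{2}+1\right)}{2}$
$= \frac{3}{8}W^2 + \frac{3}{4}W \approx \frac{3}{8}W^2$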
Deriving TCP throughputloss relationship
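[Figure: TCP window size vs. time (RTTs), sawtooth between W/2 and W]
packets sent per “period” $\approx \frac{3}{8}W^2$
1 packet lost per “period” implies $p_{loss} \approx \frac{8}{3W^2}$, or $W = \sqrt{\frac{8}{3\,p_{loss}}}$
$B = \text{avg\_thruput} = \frac{3}{4}W$ packets/rtt
$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}}$ packets/rtt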
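The derivation can be checked numerically by summing the sawtooth exactly. The function names are hypothetical, and W is assumed even so that W/2 is an integer.

```python
import math


def packets_per_period(W):
    """Exact sum W/2 + (W/2 + 1) + ... + W for an even window W."""
    return sum(W // 2 + n for n in range(W // 2 + 1))


def simple_model(W):
    """Return (loss rate, average window) for one loss per period."""
    packets = packets_per_period(W)
    p_loss = 1.0 / packets
    rtts = W // 2 + 1                   # number of RTTs in one period
    avg_window = packets / rtts         # packets per RTT, averaged over the period
    return p_loss, avg_window


p_loss, avg_window = simple_model(1000)
constant = avg_window * math.sqrt(p_loss)   # should be close to 1.22
```

For W = 1000 the exact sum is 375,750 = (3/8)W² + (3/4)W, the average window is exactly (3/4)W = 750, and the product of average window and sqrt(p_loss) comes out near 1.22, approaching sqrt(3/2) as W grows.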
Alternate fluid model
Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p
In steady state
TCP throughput: a better loss-rate-based “simple” model [PFTK]
With many flows, loss rate and delay are not affected much by a single TCP flow
  TCP behavior completely specified by loss and delay pattern along path (bounded by bottleneck capacity)
Given loss rate p and delay T, what is TCP’s throughput B packets/sec, taking timeouts into account?
What is PFTK modeling?
Independent loss probability p across rounds
Loss is signaled by triple duplicate acks
Bursty loss in a round: if some packet is lost, all following packets in that round are also lost
Timeout if < three duplicate acks received
PFTK empirical validation: low loss
PFTK empirical validation: high loss
Loss-based TCP
Evolution of loss-based TCP
  Tahoe (without fast retransmit)
  Reno (triple duplicate acks + fast retransmit)
  NewReno (Reno + handling multiple losses better)
  SACK (selective acknowledgment): common today
Q: what if loss is not due to congestion?
Delay-based TCP: Vegas
Uses delay as a signal of congestion
Idea: try to keep a small constant number of packets at the bottleneck queue
  Expected = W/BaseRTT
  Actual = W/CurRTT
  Diff = Expected − Actual
  Try to keep Diff between fixed 1 and 3
More recent: FAST TCP, based on Vegas
Delay-based TCP not widely used today
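A once-per-RTT Vegas-style adjustment could be sketched as follows. Scaling Diff by BaseRTT, so that the bounds 1 and 3 are measured in packets queued at the bottleneck, is an assumption of this sketch, as are the function name and the ±1 window steps.

```python
def vegas_update(cwnd, base_rtt, cur_rtt, low=1.0, high=3.0):
    """Return the next window given the minimum and current RTT."""
    expected = cwnd / base_rtt              # rate if nothing were queued
    actual = cwnd / cur_rtt                 # rate actually achieved
    # Assumed scaling: rate difference times BaseRTT = packets sitting in the queue.
    diff = (expected - actual) * base_rtt
    if diff < low:
        return cwnd + 1                     # too little queued: speed up
    if diff > high:
        return cwnd - 1                     # too much queued: back off
    return cwnd                             # inside the target band: hold
```

With a window of 10 and a base RTT of 100 ms, a current RTT of 100 ms grows the window, 125 ms (about 2 packets queued) holds it, and 200 ms (about 5 packets queued) shrinks it.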
TCP-Friendliness
Can we try MyFavNew TCP? Well, is it TCP-friendly?
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
To coexist with TCP, it must impose the same long-term load on the network
  No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
  Not significantly less long-term throughput, or it's not too useful
TCP friendly rate control (TFRC)
Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
  If transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
  Otherwise, increase the transmission rate
E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
Q: how to measure/use loss rate and RTT?
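The rate rule can be sketched with the simple 1.22/sqrt(p) throughput model standing in for the TCP model. This is a simplification under assumptions: a real TFRC sender would use a model that also accounts for timeouts, and the function names, the fixed `increase` step, and the cap at the model rate are choices made for this example.

```python
import math


def tcp_model_rate(p, rtt):
    """Simple-model TCP throughput in packets/sec: 1.22 / (RTT * sqrt(p))."""
    return 1.22 / (rtt * math.sqrt(p))


def tfrc_adjust(current_rate, p, rtt, increase=1.0):
    """One adjustment step of a model-driven sender (rates in packets/sec)."""
    target = tcp_model_rate(p, rtt)
    if current_rate > target:
        return target                           # drop to the model's rate
    return min(current_rate + increase, target)  # otherwise creep upward
```

With 1% loss and a 100 ms RTT the model rate is 122 packets/sec, so a sender at 200 packets/sec is cut to 122 while a sender at 50 packets/sec moves up to 51.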
High speed TCP
TCP in high speed networks
Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate:
  $p = 2 \cdot 10^{-10}$, or equivalently at most one drop every couple of hours
New versions of TCP for high-speed networks needed
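The numbers in this example can be reproduced with a few lines of arithmetic, assuming the simple model in which the average window is 1.22/sqrt(p) packets per RTT. The function names are hypothetical.

```python
def required_window(rate_bps, rtt_s, segment_bytes):
    """Segments that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_s / (segment_bytes * 8)


def loss_rate_for_window(window, constant=1.22):
    """Invert window = constant / sqrt(p) for the loss rate p."""
    return (constant / window) ** 2


window = required_window(10e9, 0.100, 1500)      # about 83,333 segments
p = loss_rate_for_window(window)                 # about 2e-10
packets_per_second = window / 0.100
hours_between_drops = (1.0 / p) / packets_per_second / 3600.0
```

The result is a loss rate of roughly 2·10^-10, i.e. one drop per several billion packets, which at this sending rate is about one drop every hour and a half.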
TCP’s long recovery delay
More than an hour to recover from a loss or timeout
[Figure annotations: ~41000 packets; ~60000 RTTs, ~100 minutes]
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use superlinear window increase
Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
  Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
  additive increase, multiplicative decrease
  increments, decrements depend on window size
Scalable TCP (STCP) [T Kelly]
  multiplicative increase, multiplicative decrease
  W ← W + a per ACK; W ← W − b·W per window with loss
Fairness connections transmitting at high rates can starve connections transmitting at low rates
Utilization connections can synchronize their response to congestion
P113P213P313P413FCFS
Scheduler P513P613
Idea early random packet drop
When queue length exceeds threshold drop packets with queue length dependent probability probabilistic packet drop flows see same loss
rate problem bursty traffic (burst arrives when
queue is near threshold) can be over penalized
P113P213P313P413P513P613FCFS
Scheduler
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts react to longer term trends
Tie drop prob to weighted avg queue length avoids over-reaction to mild overload conditions
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
Random early detection (RED) packet drop
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
10013
Drop probability
maxp13
Weighted AverageQueue Length
min13 max13
RED summary why random drop
Provide gentle transition from no-drop to all-drop Provide ldquogentlerdquo early warning Avoid synchronized loss bursts among
sources Provide same loss rate to all sessions
With tail-drop low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed hellip
Why randomization important
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source Floyd Jacobson 1994
Router update operation
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail
update
time spent in state depends on msgs
received from others (weak coupling
between routers processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization Choose random
timer component Tr large (eg several multiples of TC)
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough
randomization to avoid
synchronization
Randomization
Takeaway message randomization makes a system simple and
robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP
TCP does network resource management Need flow prioritization
Alternative router prioritization + More responsive simple one bit priority Hard to deploy
Question Can end-to-end congestion control achieve non-
interference and utilization
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal Congestion incr queue lengths incr RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control Receiver and network unchanged TCP friendly
TCP Nice
Basic algorithm 1 Early Detection thresh queue length incr in RTT 2 Multiplicative decrease on early congestion 3 Allow cwnd lt 10 (despite no loss)
per-ack operation if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++
per-round operation if(numCong gt fW) W W2 else hellip AIMD congestion control
Nice the works
Non-interference getting out of the way in time Utilization maintaining a small queue
pkts
minRTT = τ13 maxRTT = τ+Βmicro13
B
tB Add Mul +
micro
Reno
Nice Add Add Add
Mul +
Mul +
Network Conditions
01
1
10
100
1e3
1 10 100 Fore
grou
nd D
ocum
ent L
aten
cy (s
ec)
Spare Capacity
Reno
Vegas
V0
Nice
Router Prio
Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity
Scalability
01
1
10
100
1e3
1 10 100
Doc
umen
t Lat
ency
(sec
)
Num BG flows
Vegas
V0
Nice
Router Prio
Reno
W lt 1 allows Nice to scale to any number of background flows
Utilization
0
2e4
4e4
6e4
8e4
1 10 100
BG
Thr
ough
put (
KB
)
Num BG flows
Router Prio
Vegas
V0
Reno
Nice
Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation
How to model the interaction between TCP and the network Recall PFTK like models assumed network
conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem How to allocate resources (eg bandwidth) to
optimize some objective function Maybe not possible to obtain exact optimality but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
c1 c2
Model Network Links l each of capacity cl Sources s (L(s) Us(xs))
L(s) - links used by source s Us(xs) - utility if source rate = xs
x1
x2 x3
121 cxx le+ 231 cxx le+
Us(xs)
xs
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose network knows vector ws chosen by users Network wants to maximize logarithmic utility function
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
network problem
Solution existence
There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that Ws solves user
problem Xs solves the
network problem Xs is the unique
solution to the system problem
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
0 wsubject to
w Umax
s
ss
ge
minus⎟⎟⎠
⎞⎜⎜⎝
⎛s
s
wp
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0
Proportional Fairness
Vector of rates xs proportionally fair if feasible and for any other feasible vector xs
0
leminus
sumisinSs s
ss
xxx
Result if wr=1 then Xs solves the network problem IFF it is proportionally fair
Similar result exists for the case that wr not equal 1
Max-min Fairness
Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
Minimum potential delay fairness
Rates xr are minimum potential delay fair if Ur (xr) = -wrxr
Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays
Max-min Fairness
rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
What is corresponding utility function
α
α
α minus=
minus
infinrarr 1lim)(
1r
rrxxU
Solving the network problem Results so far existence - solution exists
with given properties How to compute solution
Ideally distributed solution easily embodied in protocol
Should reveal insight into existing protocol
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
congestion ldquosignalrdquo function of aggregate rate at link l fed back to s
change in bandwidth
allocation at s
linear increase
multiplicative decrease
⎟⎟⎠
⎞⎜⎜⎝
⎛= sum
isin
)()()(txgtp
sLlsllwhere
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
Results converges to solution of relaxation of network
problem xs(t)Σpl(t) converges to ws
Interpretation TCP-like algorithm to iteratively solves optimal rate allocation
Source Algorithm
Source needs only its path price
kr() nonnegative nondecreasing function Above algorithm converges to unique
solution for any initial condition qr interpreted as lossmarking probability euro
˙ x r = kr (xr )(Ur (xr ) minus qr)
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can network choose pricing scheme to achieve fair resource allocation
Suppose network charges price qr ($bit) where qr=sum pl
Userrsquos strategy spend wr ($sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
then
interpretation minimize (weighted) delay
pTpTp
x 2)1(2asymp
minus=
TxxU 1)( minus=
Is AIMD special
Consider a window control as follows cwnd += acwnd^n when no loss cwnd -= bcwnd^m when loss where nltm
Expected change in congestion window
Expected change in rate per unit time
MIMD (nm)
Consider the controller
where
Then at equilibrium
Where α = m-n For stability
Motivation
Congestion Control maximize user
utility
Traffic Engineering minimize network
congestion Given routing Rli how to adapt end rate xi
Given traffic xi how to perform routing Rli
Congestion Control Model
max sum i Ui(xi) st sumi Rlixi le cl var x
aggregate utility
Source rate xi
Utility Ui(xi)
capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
min suml f(ul) st ul =sumi Rlixicl var R Link Utilization ul
Cost f(ul)
aggregate cost
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
ul = 1
Model of Internet Reality
xi Rli
Congestion Control max sumi Ui(xi) st sumi Rlixi le cl
Traffic Engineering min suml f(ul)
st ul =sumi Rlixicl
System Properties
Convergence Does it achieve some objective Benchmark
Utility gap between the joint system and benchmark
max sumi Ui(xi) st Rx le c Var x R
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Causescosts of congestion scenario 3
Another ldquocostrdquo of congestion when packet dropped any ldquoupstream
transmission capacity used for that packet was wasted
Host A
Host B
λout
Two broad approaches towards congestion control
End-end congestion control
no explicit feedback from network
congestion inferred from end-system observed loss delay
approach taken by TCP
Network-assisted congestion control
routers provide feedback to endhosts single bit indicating
congestion (SNA DECbit ATM TCPIP ECN)
explicit rate sender should send
recent proposals [XCP] [RCP] revisit ATM ideas
TCP congestion control
Components of TCP congestion control
Slow start: multiplicatively increase (double) the window every RTT
Congestion avoidance: additively increase the window (by 1 MSS per RTT)
Loss: multiplicatively decrease (halve) the window
Timeout: set cwnd to 1 MSS; multiplicatively increase (double) the retransmission timeout upon each further consecutive loss
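The four components can be sketched as a per-RTT update of the congestion window. This is only an illustrative sketch: the function name `next_cwnd`, the event labels, and the use of a slow-start threshold `ssthresh` to switch between slow start and congestion avoidance are assumptions not stated on the slide.

```python
MSS = 1  # assumption: count the window in segments for readability


def next_cwnd(cwnd, ssthresh, event):
    """Return (cwnd, ssthresh) after one RTT that ended with `event`.

    event: "ack" (whole window acknowledged), "loss" (triple duplicate ACK)
    or "timeout". These labels are hypothetical.
    """
    if event == "ack":
        if cwnd < ssthresh:
            # Slow start: double the window each RTT (capped at ssthresh).
            return min(2 * cwnd, ssthresh), ssthresh
        # Congestion avoidance: add 1 MSS per RTT.
        return cwnd + MSS, ssthresh
    if event == "loss":
        # Multiplicative decrease: halve the window.
        half = max(cwnd / 2, MSS)
        return half, half
    if event == "timeout":
        # Restart from 1 MSS and remember half the old window.
        return MSS, max(cwnd / 2, MSS)
    raise ValueError(f"unknown event: {event}")
```

Doubling of the retransmission timeout on consecutive timeouts is left out here because it belongs to the timer, not to the window.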
Retransmission timeout estimation
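Calculate EstimatedRTT using a moving average
Calculate the deviation with respect to the moving average
$\mathrm{Timeout} = \mathrm{EstimatedRTT} + 4\,\mathrm{DevRTT}$
$\mathrm{EstimatedRTT}_i = (1-\alpha)\,\mathrm{EstimatedRTT}_{i-1} + \alpha\,\mathrm{SampleRTT}_i$
$\mathrm{DevRTT}_i = (1-\beta)\,\mathrm{DevRTT}_{i-1} + \beta\,|\mathrm{SampleRTT}_i - \mathrm{EstimatedRTT}_{i-1}|$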
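A minimal sketch of the estimator defined by the two moving averages. The class name, the gains α = 1/8 and β = 1/4, and the initial deviation of half the first sample are assumptions (they are commonly used values, not taken from the slide).

```python
class RtoEstimator:
    """Tracks EstimatedRTT and DevRTT and derives the retransmission timeout."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha                # assumed gain for EstimatedRTT
        self.beta = beta                  # assumed gain for DevRTT
        self.est_rtt = first_sample
        self.dev_rtt = first_sample / 2   # assumed initial deviation

    def update(self, sample_rtt):
        # DevRTT_i uses EstimatedRTT_{i-1}, so update it before est_rtt.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.est_rtt))
        self.est_rtt = (1 - self.alpha) * self.est_rtt + self.alpha * sample_rtt
        return self.timeout()

    def timeout(self):
        return self.est_rtt + 4 * self.dev_rtt
```

With steady samples the deviation term decays, so the timeout shrinks towards EstimatedRTT; a single late sample inflates it again through DevRTT.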
TCP Throughput
TCP throughput: a very, very simple model
What's the average throughput of TCP as a function of window size and RTT T? Ignore slow start. Let W be the window size when loss occurs.
When the window is W, throughput is W/T. Just after a loss, the window drops to W/2 and throughput to W/(2T).
Average throughput: 3W/(4T)
TCP throughput: a very simple model
But what is W when loss occurs?
When the window is w and the queue has q packets, TCP is sending at rate w/(T + q/C)
For maintaining utilization and steady state:
Just before loss: rate = W/(T + Q/C) = C
Just after loss: rate = (W/2)/T = C
For Q = CT (a common rule of thumb for setting router buffer sizes), these give W = 2CT = 2Q, so the window climbs from W/2 to W over Q = W/2 round trips at an average of 3W/4 packets per round trip: a loss occurs every (3W/4)·Q = 3W²/8 packets
Q = queue capacity in number of packets
C = link capacity in packets/sec
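Deriving TCP throughput/loss relationship
[Figure: TCP window size vs. time (in RTTs): a sawtooth that climbs from W/2 to W over one "period"]
packets sent per "period" =
$$\frac{W}{2} + \left(\frac{W}{2}+1\right) + \dots + W = \sum_{n=0}^{W/2}\left(\frac{W}{2}+n\right)$$
$$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$$
$$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \frac{\frac{W}{2}\left(\frac{W}{2}+1\right)}{2}$$
$$= \frac{3}{8}W^2 + \frac{3}{4}W \approx \frac{3}{8}W^2$$
Deriving TCP throughput/loss relationship
packets sent per "period" $\approx \frac{3}{8}W^2$
1 packet lost per "period" implies $p_{loss} \approx \frac{8}{3W^2}$, or $W = \sqrt{\frac{8}{3\,p_{loss}}}$
$B = \text{avg\_thruput} = \frac{3}{4}W$ packets/rtt
$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}}$ packets/rtt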
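The arithmetic of the derivation can be checked numerically. The sketch below assumes an even window W = 100 (an arbitrary illustrative value) and recomputes the packets per period, the implied loss rate, and the constant 1.22.

```python
import math


def packets_per_period(W):
    # One sawtooth period: windows W/2, W/2 + 1, ..., W, one window per RTT.
    return sum(W // 2 + n for n in range(W // 2 + 1))


W = 100                                   # illustrative even window size
exact = packets_per_period(W)             # equals 3/8 W^2 + 3/4 W
p_loss = 8 / (3 * W ** 2)                 # one loss per ~3/8 W^2 packets
constant = 0.75 * math.sqrt(8 / 3)        # 3/4 * sqrt(8/3), the "1.22"
avg_window = constant / math.sqrt(p_loss) # should match 3/4 W
```

For large W the 3W/4 term is negligible next to 3W²/8, which is why the slide drops it.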
Alternate fluid model
Rate of change of sending rate = (term inversely proportional to current rate, with probability 1-p) - (term proportional to current rate, with probability p)
In steady state
TCP throughput: a better loss-rate-based "simple" model [PFTK]
With many flows, loss rate and delay are not affected much by a single TCP flow
TCP behavior is completely specified by the loss and delay pattern along the path (bounded by bottleneck capacity)
Given loss rate p and delay T, what is TCP's throughput B (packets/sec), taking timeouts into account?
What is PFTK modeling?
Independent loss probability p across rounds
Loss indication: triple duplicate ACKs
Bursty loss within a round: if some packet is lost, all following packets in that round are also lost
Timeout if fewer than three duplicate ACKs are received
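The throughput formula itself did not survive in this transcript. As a hedged sketch, the function below implements the widely quoted approximate PFTK expression from Padhye et al.; the parameter names `t0` (initial retransmission timeout) and `b` (packets acknowledged per ACK) are assumptions, and the receiver-window cap is omitted.

```python
import math


def pftk_throughput(p, rtt, t0, b=1):
    """Approximate steady-state TCP throughput in packets/sec.

    p: loss probability, rtt: round-trip time T in seconds,
    t0: initial retransmission timeout in seconds (assumed parameter),
    b: packets acknowledged per ACK (assumed parameter).
    """
    # Time cost attributable to losses recovered by triple duplicate ACKs.
    fast_retransmit = rtt * math.sqrt(2 * b * p / 3)
    # Time cost attributable to losses that end in timeouts.
    timeouts = t0 * min(1.0, 3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2)
    return 1.0 / (fast_retransmit + timeouts)
```

At low loss the timeout term vanishes and the expression reduces to the 1.22/(T·√p) model; at high loss the timeout term dominates and throughput falls much faster.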
PFTK empirical validation Low loss
PFTK empirical validation High loss
Loss-based TCP
Evolution of loss-based TCP
Tahoe (fast retransmit, but no fast recovery)
Reno (triple duplicate ACKs trigger fast retransmit + fast recovery)
NewReno (Reno + better handling of multiple losses)
SACK (selective acknowledgment): common today
Q: what if loss is not due to congestion?
Delay-based TCP Vegas
Uses delay as a signal of congestion
Idea: try to keep a small constant number of packets at the bottleneck queue
Expected = W/BaseRTT
Actual = W/CurRTT
Diff = Expected - Actual
Try to keep Diff between fixed thresholds 1 and 3
More recent: FAST TCP, based on Vegas
Delay-based TCP is not widely used today
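A minimal sketch of the Vegas window adjustment. Two assumptions go beyond the slide: Diff is multiplied by BaseRTT so that it is expressed in packets queued at the bottleneck, and the window moves by one packet per RTT. The function name is hypothetical.

```python
def vegas_update(cwnd, base_rtt, cur_rtt, low=1.0, high=3.0):
    """Return the next window given the minimum and current RTT (seconds)."""
    expected = cwnd / base_rtt               # rate if no queueing occurred
    actual = cwnd / cur_rtt                  # rate actually achieved
    diff = (expected - actual) * base_rtt    # assumed scaling: packets in queue
    if diff < low:
        return cwnd + 1                      # too few packets queued: speed up
    if diff > high:
        return cwnd - 1                      # too many packets queued: slow down
    return cwnd                              # within the target band: hold
```

For example, a window of 10 packets with BaseRTT 100 ms and CurRTT 200 ms gives Diff = 5 packets, so the window is decreased.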
TCP-Friendliness
Can we try MyFavNew TCP? Well, is it TCP-friendly?
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
To coexist with TCP, it must impose the same long-term load on the network:
No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
Not significantly less long-term throughput, or it's not too useful
TCP friendly rate control (TFRC)
Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
Otherwise, increase the transmission rate
E.g., DCCP (Datagram Congestion Control Protocol), congestion control for unreliable (datagram) flows
Q: how to measure/use loss rate and RTT?
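The rule can be sketched with the simple 1.22/(T·√p) model standing in for the TCP throughput model. This is an illustration, not the TFRC specification: the function names, the fixed increase `step`, and capping the increase at the model rate are assumptions.

```python
import math


def tcp_model_rate(p, rtt):
    """Simple TCP throughput model in packets/sec: 1.22 / (T * sqrt(p))."""
    return 1.22 / (rtt * math.sqrt(p))


def tfrc_update(rate, p, rtt, step=1.0):
    """Return the next sending rate (packets/sec) for measured p and RTT."""
    target = tcp_model_rate(p, rtt)
    if rate > target:
        return target                    # above the model: drop to the model's rate
    return min(rate + step, target)      # otherwise increase (assumed fixed step)
```

How p and RTT are measured and smoothed is the open question raised on the slide and is not addressed here.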
High speed TCP
TCP in high speed networks
Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate: $B = 1.22/\sqrt{p}$ packets per RTT
$\Rightarrow p = 2 \cdot 10^{-10}$, or equivalently at most one drop every couple of hours
New versions of TCP for high-speed networks needed
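The numbers in this example follow from the bandwidth-delay product and the 1.22/√p relationship; a short check, with variable names chosen here for illustration:

```python
segment_bits = 1500 * 8          # 1500-byte segments
rtt = 0.100                      # 100 ms
target_bps = 10e9                # 10 Gbps

# Window needed to fill the pipe, in segments.
W = target_bps * rtt / segment_bits

# Invert W = 1.22 / sqrt(p) to get the tolerable loss rate.
p = (1.22 / W) ** 2

# Packets between drops divided by packets per second.
seconds_between_drops = (1 / p) / (W / rtt)
hours_between_drops = seconds_between_drops / 3600
```

This gives roughly 83,333 segments, a loss rate of about 2·10^-10, and about an hour and a half between drops.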
TCP's long recovery delay
More than an hour to recover from a loss or timeout
[Figure annotations: ~41,000 packets; ~60,000 RTTs, ~100 minutes]
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use superlinear window increase
Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
additive increase, multiplicative decrease
increments and decrements depend on window size
Scalable TCP (STCP) [T. Kelly]
multiplicative increase, multiplicative decrease
W ← W + a per ACK
W ← W - b·W per window with loss
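Because roughly W ACKs arrive per RTT, adding a per ACK multiplies the window by about (1 + a) each RTT, so the time to regain the pre-loss window does not depend on W. The sketch below illustrates this; the values a = 0.01 and b = 0.125 are assumed for illustration and the function name is hypothetical.

```python
def rtts_to_recover(W, a=0.01, b=0.125):
    """Count RTTs for STCP to climb back to W after one loss event."""
    w = (1 - b) * W          # W <- W - b*W on a window with loss
    rtts = 0
    while w < W:
        w *= 1 + a           # about W ACKs per RTT, each adding a
        rtts += 1
    return rtts
```

Contrast this with standard TCP, where recovering from W/2 to W takes W/2 round trips and therefore grows with the window.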
STCP dynamics
From 1st PFLDnet Workshop, Tom Kelly
Active Queue Management
Router Queue Management
Normally, packets are dropped only when the queue overflows: "drop-tail" queueing
[Figure: router connected to the Internet, with packets P1 to P6 queued at its FCFS scheduler]
The case against drop-tail queue management
Large queues in routers are "a bad thing"
Delay: end-to-end latency dominated by length of queues at switches in network
Allowing queues to overflow is "a bad thing"
Fairness: connections transmitting at high rates can starve connections transmitting at low rates
Utilization: connections can synchronize their response to congestion
[Figure: FCFS scheduler queue with packets P1 to P6]
Idea: early random packet drop
When queue length exceeds threshold, drop packets with queue-length-dependent probability
probabilistic packet drop: flows see same loss rate
problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
[Figure: FCFS scheduler queue with packets P1 to P6]
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop
avoid overly penalizing short-term bursts
react to longer-term trends
Tie drop probability to weighted average queue length
avoids over-reaction to mild overload conditions
[Figure: average queue length vs. time, against the min threshold, max threshold and max queue length; regions: no drop (below min threshold), probabilistic early drop (between the thresholds), forced drop (above max threshold)]
[Figure: drop probability vs. weighted average queue length, with min and max marked on the x-axis and maxp and 100% on the y-axis]
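The drop decision can be sketched as follows. This is a simplified illustration rather than the full RED algorithm: the class name, the default parameter values, and the linear ramp between the thresholds are assumptions, and refinements such as counting packets since the last drop are omitted.

```python
import random


class Red:
    """Simplified RED: EWMA of queue length plus a threshold-based drop rule."""

    def __init__(self, min_th=5.0, max_th=15.0, max_p=0.1, weight=0.002):
        self.min_th = min_th      # below this average: never drop
        self.max_th = max_th      # at or above this average: always drop
        self.max_p = max_p        # drop probability just below max_th
        self.weight = weight      # EWMA gain (assumed value)
        self.avg = 0.0

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        # Assumed linear ramp from 0 at min_th to max_p at max_th.
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def on_arrival(self, queue_len, rng=random.random):
        # Update the average first, then decide whether to drop this packet.
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return rng() < self.drop_probability()
```

A small `weight` makes the average track long-term trends, so a short burst that briefly fills the queue barely moves the drop probability.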
RED summary: why random drop?
Provide gentle transition from no-drop to all-drop
Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide same loss rate to all sessions
With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-to-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
[Figure: state diagram of the periodic update process]
prepare own routing update (time T_C)
receive update from neighbor and process it (time T_C2)
<ready>: send update (time T_d to arrive at destination)
start_timer (uniform T_p ± T_r)
wait; meanwhile, receive updates from neighbors and process them
on timeout or link failure, begin the next update
Time spent in each state depends on messages received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update sent, relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization
Choose random timer component T_r large (e.g., several multiples of T_C)
Add enough randomization to avoid synchronization
[Figure: the router update state diagram, with the timer set by start_timer (uniform T_p ± T_r)]
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport: TCP Nice
What are background transfers?
Data that humans are not waiting for
Non-deadline-critical
Unlimited demand
Examples:
Prefetched traffic on the Web
File system backup
Large-scale data distribution services
Background software updates
Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
Self-interference: applications hurt their own performance
Cross-interference: applications hurt other applications' performance
TCP Nice
Goal: abstraction of free, infinite bandwidth
Applications say what they want
OS manages resources and scheduling
Self-tuning transport layer
Reduces risk of interference with foreground traffic
Significant utilization of spare capacity by background traffic
Simplifies application design
Why change TCP?
TCP does network resource management
Need flow prioritization
Alternative: router prioritization
+ More responsive, simple one-bit priority
- Hard to deploy
Question: can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal
Congestion ⇒ increased queue lengths ⇒ increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
Receiver and network unchanged
TCP friendly
TCP Nice
Basic algorithm:
1. Early detection: threshold on queue length, seen as an increase in RTT
2. Multiplicative decrease on early congestion
3. Allow cwnd < 1.0 (despite no loss)
per-ack operation:
if (curRTT > minRTT + threshold*(maxRTT - minRTT)) numCong++
per-round operation:
if (numCong > f*W) W ← W/2
else ... AIMD congestion control
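The per-ack and per-round operations can be sketched as a small class. This is an illustration under assumptions: the class name, the default values threshold = 0.2 and f = 0.5, and the plain +1 increase standing in for the regular AIMD branch are not taken from the slide.

```python
class NiceSender:
    """Sketch of Nice's early congestion detection on top of AIMD."""

    def __init__(self, threshold=0.2, fraction=0.5):
        self.threshold = threshold    # assumed fraction of the RTT range
        self.fraction = fraction      # the slide's f (assumed value)
        self.cwnd = 1.0
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        # Track the RTT range seen so far, then test for early congestion.
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2            # early multiplicative decrease; may go below 1.0
        else:
            self.cwnd += 1            # stand-in for the regular AIMD increase
        self.num_cong = 0
```

A window below 1.0 would be realized by sending one packet every few RTTs, which is what lets many background flows share a small amount of spare capacity.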
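Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = τ, maxRTT = τ + B/μ
[Figure: queue occupancy (pkts, up to buffer size B, drained at rate μ) over time for Reno and Nice, with additive-increase ("Add") and multiplicative-decrease ("Mul") phases marked]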
Network Conditions
[Figure: foreground document latency (sec, log scale from 0.1 to 1e3) vs. spare capacity (1 to 100) for Reno, Vegas, V0, Nice and Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec, log scale from 0.1 to 1e3) vs. number of background flows (1 to 100) for Vegas, V0, Nice, Router Prio and Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: background throughput (KB, from 0 to 8e4) vs. number of background flows (1 to 100) for Router Prio, Vegas, V0, Reno and Nice]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem:
How to allocate resources (e.g., bandwidth) to optimize some objective function
Maybe not possible to obtain exact optimality, but...
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
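Model
Network: links l, each of capacity $c_l$
Sources s: $(L(s), U_s(x_s))$
$L(s)$: links used by source s
$U_s(x_s)$: utility if source rate = $x_s$
[Figure: two links with capacities c1 and c2; source 1 (rate x1) uses both links, source 2 (rate x2) uses only the first, source 3 (rate x3) uses only the second]
$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$
[Figure: $U_s(x_s)$ vs. $x_s$, an example utility function for an elastic application]
Q: What are possible allocations with, say, unit-capacity links?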
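One way to explore the question is to fix unit capacities, let sources 2 and 3 take whatever source 1 leaves (x2 = x3 = 1 - x1), and search over x1 for different objectives. The three objectives below (total rate, sum of logs, smallest rate) and all names are illustrative assumptions.

```python
import math


def best_x1(objective, steps=3000):
    """Grid-search x1 in (0, 1); x2 = x3 = 1 - x1 fill both unit-capacity links."""
    grid = (i / steps for i in range(1, steps))
    return max(grid, key=lambda x1: objective(x1, 1 - x1, 1 - x1))


def total_rate(x1, x2, x3):
    return x1 + x2 + x3


def sum_log(x1, x2, x3):
    return math.log(x1) + math.log(x2) + math.log(x3)


def smallest(x1, x2, x3):
    return min(x1, x2, x3)


x1_max_throughput = best_x1(total_rate)   # pushed towards 0: starve the 2-link flow
x1_sum_log = best_x1(sum_log)             # about 1/3, so x2 = x3 = 2/3
x1_max_min = best_x1(smallest)            # 1/2, so all rates are equal
```

The three answers differ only in how much the flow that consumes capacity on both links is penalized.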
Optimization Problem
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
must know all utility functions
impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \;\; \forall l \in L$$
"system" problem
The user view
User can choose amount to pay per unit time, $w_s$
Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$
$p_s$ could be viewed as charge per unit flow for user s
$$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$$
(first term: user's utility; second term: cost)
user problem
The network view
Suppose network knows vector $\{w_s\}$, chosen by users
Network wants to maximize logarithmic utility function
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
network problem
Solution existence
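There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
$\{w_s\}$ solves the user problem
$\{x_s\}$ solves the network problem
$\{x_s\}$ is the unique solution to the system problem
user problem:
$$\max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$$
network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
system problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \;\; \forall l \in L$$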
Proportional Fairness
Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,
$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$
Result: if $w_r = 1$, then $\{x_s\}$ solves the network problem IFF it is proportionally fair
A similar result exists for the case that $w_r$ is not equal to 1
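The definition can be tested empirically on the two-link network with constraints x1 + x2 ≤ c1 and x1 + x3 ≤ c2. The sketch assumes unit capacities, for which maximizing the sum of logs gives (1/3, 2/3, 2/3), and samples random feasible alternatives; the function and variable names are hypothetical.

```python
import random


def aggregate_change(x, y):
    """Sum over sources of the proportional change when moving from x to y."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))


x = (1 / 3, 2 / 3, 2 / 3)        # candidate proportionally fair allocation
rng = random.Random(0)           # fixed seed so the check is repeatable
worst = -float("inf")
for _ in range(10000):
    y1 = rng.uniform(0.001, 0.999)
    y2 = rng.uniform(0.001, 1 - y1)   # respects y1 + y2 <= 1
    y3 = rng.uniform(0.001, 1 - y1)   # respects y1 + y3 <= 1
    worst = max(worst, aggregate_change(x, (y1, y2, y3)))
```

No sampled alternative yields a positive aggregate proportional change, as the definition requires; this is a spot check, not a proof.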
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
Minimum potential delay fairness
Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; the optimization problem is to minimize the sum of transfer delays
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
What is the corresponding utility function?
$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
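Max-min fair rates can also be computed directly by progressive filling rather than through the limiting utility function. The sketch below is a standard water-filling procedure, offered as an illustration; the function name and the dictionary-based network description are assumptions, and every source is assumed to use at least one link.

```python
def max_min_rates(routes, capacity):
    """Progressive filling.

    routes: {source: set of links it uses}; capacity: {link: capacity}.
    Assumes every source uses at least one link.
    """
    rates = {}
    remaining = dict(capacity)
    active = set(routes)
    while active:
        # Equal share each link could still give its unfrozen sources.
        share = {
            l: remaining[l] / sum(1 for s in active if l in routes[s])
            for l in remaining
            if any(l in routes[s] for s in active)
        }
        bottleneck = min(share, key=share.get)
        # Freeze every active source crossing the tightest link.
        for s in [s for s in active if bottleneck in routes[s]]:
            rates[s] = share[bottleneck]
            for l in routes[s]:
                remaining[l] -= rates[s]
            active.remove(s)
    return rates
```

For the two-link example with unit capacities this yields a rate of 1/2 for all three sources; raising the second link's capacity only benefits the source that uses that link alone.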
Solving the network problem
Results so far: existence - solution exists with given properties
How to compute solution?
Ideally: distributed solution easily embodied in protocol
Should reveal insight into existing protocol
Solving the network problem
$$\frac{d}{dt} x_s(t) = k\left(w_s - x_s(t) \sum_{l \in L(s)} p_l(t)\right)$$
where
$$p_l(t) = g_l\!\left(\sum_{s:\, l \in L(s)} x_s(t)\right)$$
left-hand side: change in bandwidth allocation at s
$k\,w_s$: linear increase
$k\,x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease
$p_l(t)$: congestion "signal", a function of the aggregate rate at link l, fed back to s
Solving the network problem
$$\frac{d}{dt} x_s(t) = k\left(w_s - x_s(t) \sum_{l \in L(s)} p_l(t)\right)$$
Results:
converges to solution of "relaxation" of network problem
$x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
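The rate update dx_s/dt = k (w_s - x_s Σ p_l) can be integrated numerically to watch x_s Σ p_l approach w_s. The sketch below uses forward Euler steps on the three-source, two-link example; the price function g_l(y) = (y/c_l)^4, the gain k, the step size, and the starting rates are all assumptions chosen for illustration.

```python
def simulate(routes, w, capacity, k=0.1, dt=0.1, steps=5000):
    """Euler integration of dx_s/dt = k * (w_s - x_s * sum of link prices)."""
    x = {s: 0.1 for s in routes}                  # assumed starting rates
    price = {l: 0.0 for l in capacity}
    for _ in range(steps):
        # Each link computes its price from the aggregate rate through it.
        load = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
        price = {l: (load[l] / capacity[l]) ** 4 for l in capacity}  # assumed g_l
        # Each source reacts only to the sum of prices along its own path.
        x = {
            s: x[s] + dt * k * (w[s] - x[s] * sum(price[l] for l in routes[s]))
            for s in routes
        }
    return x, price


routes = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}}
rates, prices = simulate(routes, {1: 1.0, 2: 1.0, 3: 1.0}, {"a": 1.0, "b": 1.0})
```

With this smooth price function the equilibrium load slightly exceeds the link capacity, which is the sense in which the algorithm solves a relaxation of the network problem rather than the problem itself.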
Source Algorithm
$$\dot{x}_r = k_r(x_r)\left(U_r'(x_r) - q_r\right)$$
Source needs only its path price $q_r$
$k_r(\cdot)$: nonnegative, nondecreasing function
Above algorithm converges to unique solution for any initial condition
$q_r$ interpreted as loss/marking probability
Proportionally-Fair Controller
If the utility function is $U_r(x_r) = w_r \log x_r$
then a controller that implements it is given by $\dot{x}_r = k_r(x_r)\left(\frac{w_r}{x_r} - q_r\right)$
Pricing interpretation
Can network choose pricing scheme to achieve fair resource allocation?
Suppose network charges price $q_r$ ($/bit), where $q_r = \sum_l p_l$ over the links l on r's route
User's strategy: spend $w_r$ ($/sec) to maximize $U_r(w_r / q_r) - w_r$
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$
then
$$U(x) = -\frac{1}{Tx}$$
interpretation: minimize (weighted) delay
Is AIMD special?
Consider a window control as follows:
cwnd += a·cwnd^n when no loss
cwnd -= b·cwnd^m when loss, where n < m
Expected change in congestion window
Expected change in rate per unit time
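The slide's expressions are not preserved in this transcript. As a hedged sketch, assume each ACK independently signals a loss with probability p: the expected window change per ACK is then (1 - p)·a·w^n - p·b·w^m, and setting it to zero gives the equilibrium window computed below. The function name is hypothetical.

```python
import math


def equilibrium_window(p, a, b, n, m):
    """Window at which expected increase and decrease balance (requires n < m)."""
    # (1 - p) * a * w**n == p * b * w**m  =>  w**(m - n) == a * (1 - p) / (b * p)
    return (a * (1 - p) / (b * p)) ** (1 / (m - n))


# AIMD as a special case: +1/cwnd per ACK (a=1, n=-1), halve on loss (b=1/2, m=1).
w_aimd = equilibrium_window(p=0.01, a=1, b=0.5, n=-1, m=1)
```

For AIMD this reduces to w = sqrt(2(1 - p)/p), so the rate w/T matches the simplified TCP-Reno expression and its 1/√p dependence on loss.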
MIMD (n, m)
Consider the controller
where
Then at equilibrium
Where α = m-n For stability
Motivation
Congestion Control: maximize user utility
Given routing $R_{li}$, how to adapt end rate $x_i$?
Traffic Engineering: minimize network congestion
Given traffic $x_i$, how to perform routing $R_{li}$?
Congestion Control Model
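$$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l, \quad \text{var. } x$$
objective: aggregate utility; constraints: capacity constraints
[Figure: utility $U_i(x_i)$ vs. source rate $x_i$]
Users are indexed by i
Congestion control provides fair rate allocation amongst users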
Traffic Engineering Model
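$$\min \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l, \quad \text{var. } R$$
objective: aggregate cost
[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]
Links are indexed by l
Traffic engineering avoids bottlenecks in the network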
Model of Internet Reality
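Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$
[Figure: congestion control and traffic engineering coupled through the rates $x_i$ and the routing $R_{li}$]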
System Properties
Convergence
Does it achieve some objective?
Benchmark: utility gap between the joint system and the benchmark
$$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad Rx \le c, \quad \text{var. } x, R$$
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Two broad approaches towards congestion control
End-end congestion control
no explicit feedback from network
congestion inferred from end-system observed loss delay
approach taken by TCP
Network-assisted congestion control
routers provide feedback to endhosts single bit indicating
congestion (SNA DECbit ATM TCPIP ECN)
explicit rate sender should send
recent proposals [XCP] [RCP] revisit ATM ideas
TCP congestion control
Components of TCP congestion control
Slow start Multiplicatively increase (double) window
Congestion avoidance Additively increase (by 1 MSS) window
Loss Multiplicatively decrease (halve) window
Timeout Set cwnd to 1 MSS Multiplicatively increase (double) retransmission
timeout upon each further consecutive loss
Retransmission timeout estimation
Calculate EstimatedRTT using moving average
Calculate deviation wrt moving average
Timeout = EstimatedRTT + 4DevRTT
EstimatedRTTi = (1- α)EstimatedRTTi-1 + αSampleRTTi
DevRTTi = (1-β)DevRTTi-1 + β|SampleRTTi-EstimatedRTTi-1|
TCP Throughput
TCP throughput A very very simple model
Whatrsquos the average throughout of TCP as a function of window size and RTT T Ignore slow start Let W be the window size when loss occurs
When window is W throughput is WT Just after loss window drops to W2
throughput to W2T Average throughput 3W4T
TCP throughput: a very simple model
But what is W when loss occurs?
When the window is w and the queue holds q packets, TCP is sending at rate w/(T + q/C)
For maintaining utilization in steady state:
  Just before loss: rate = W/(T + Q/C) = C
  Just after loss: rate = (W/2)/T = C
For Q = CT (a common rule of thumb for sizing router buffers), a loss occurs every (3/4)W · Q = 3W²/8 packets
Q = queue capacity in number of packets
C = link capacity in packets/sec
Deriving TCP throughputloss relationship
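[Figure: TCP window size vs. time (in RTTs); a sawtooth between W/2 and W, one cycle marked as a "period"]
packets sent per "period":
$\frac{W}{2} + \left(\frac{W}{2}+1\right) + \cdots + W = \sum_{n=0}^{W/2} \left(\frac{W}{2} + n\right)$
$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$
$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \frac{(W/2)(W/2+1)}{2}$
$= \frac{3}{8}W^2 + \frac{3}{4}W$
$\approx \frac{3}{8}W^2$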
Deriving TCP throughputloss relationship
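[Figure: TCP window size vs. time (in RTTs); a sawtooth between W/2 and W, one cycle marked as a "period"]
packets sent per "period" $\approx \frac{3}{8}W^2$
1 packet lost per "period" implies $p_{loss} \approx \frac{8}{3W^2}$, or $W = \sqrt{\frac{8}{3\,p_{loss}}}$
$B = \text{avg\_thruput} = \frac{3}{4}W$ packets/rtt
$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}}$ packets/rtt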
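The derivation can be checked numerically. The sketch below counts the packets in one sawtooth period exactly and evaluates the resulting throughput formula; the function names are illustrative only.

```python
import math

def packets_per_period(w):
    """Exact count W/2 + (W/2 + 1) + ... + W for an even integer W."""
    return sum(range(w // 2, w + 1))

def window_at_loss(p_loss):
    """W = sqrt(8 / (3 p)), from one loss per (3/8) W^2 packets."""
    return math.sqrt(8.0 / (3.0 * p_loss))

def avg_throughput_pkts_per_rtt(p_loss):
    """B = (3/4) W = sqrt(3/2) / sqrt(p), roughly 1.22 / sqrt(p)."""
    return 0.75 * window_at_loss(p_loss)

w = 100
exact = packets_per_period(w)            # 3/8 W^2 + 3/4 W
approx = 3 * w * w / 8                   # leading term only
print(exact, approx)
print(avg_throughput_pkts_per_rtt(0.01))
```

For W = 100 the exact count exceeds the 3W²/8 approximation by the 3W/4 term, which becomes negligible as W grows.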
Alternate fluid model
Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p
In steady state
TCP throughput: a better loss-rate-based "simple" model [PFTK]
With many flows, loss rate and delay are not affected much by a single TCP flow
TCP behavior is completely specified by the loss and delay pattern along the path (bounded by bottleneck capacity)
Given loss rate p and delay T, what is TCP's throughput B (packets/sec), taking timeouts into account?
What is PFTK modeling?
Independent loss probability p across rounds
Loss is indicated by triple duplicate ACKs
Bursty loss within a round: if some packet is lost, all following packets in that round are also lost
Timeout if fewer than three duplicate ACKs are received
PFTK empirical validation: low loss
PFTK empirical validation: high loss
Loss-based TCP
Evolution of loss-based TCP
  Tahoe (without fast recovery)
  Reno (triple duplicate ACKs + fast retransmit and fast recovery)
  NewReno (Reno + better handling of multiple losses)
  SACK (selective acknowledgment): common today
Q: what if loss is not due to congestion?
Delay-based TCP: Vegas
Uses delay as a signal of congestion
Idea: try to keep a small constant number of packets at the bottleneck queue
  Expected = W/BaseRTT
  Actual = W/CurRTT
  Diff = Expected − Actual
  Try to keep Diff between fixed thresholds 1 and 3
More recent: FAST TCP, based on Vegas
Delay-based TCP is not widely used today
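A sketch of the Vegas adjustment described above. One assumption is made explicit: Diff is multiplied by BaseRTT so that the thresholds 1 and 3 are in packets queued at the bottleneck; the one-segment step size is also an assumption.

```python
def vegas_update(cwnd, base_rtt, cur_rtt, low=1.0, high=3.0):
    """Return the next window given the current RTT measurement.

    base_rtt is the smallest RTT seen (no queueing), cur_rtt the latest one.
    """
    expected = cwnd / base_rtt            # rate if nothing were queued
    actual = cwnd / cur_rtt               # rate actually achieved
    # Assumption: scale the rate difference by base_rtt to get an
    # estimate of this flow's packets sitting in the bottleneck queue.
    diff = (expected - actual) * base_rtt
    if diff < low:
        return cwnd + 1                   # too few packets queued: speed up
    if diff > high:
        return cwnd - 1                   # too many packets queued: slow down
    return cwnd                           # within the target band: hold
```

With `cwnd = 10` and `base_rtt = 0.1`, a current RTT of 0.1 s grows the window, 0.125 s holds it, and 0.2 s shrinks it.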
TCP-Friendliness
Can we try MyFavNew TCP? Well, is it TCP-friendly?
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
To coexist with TCP, it must impose the same long-term load on the network
  No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
  Not significantly less long-term throughput, or it's not too useful
TCP friendly rate control (TFRC)
Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
  If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
  Otherwise, increase the transmission rate
E.g., DCCP (Datagram Congestion Control Protocol) for congestion-controlled unreliable transport
Q: how to measure/use loss rate and RTT?
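A rough sketch of the rate rule above. It plugs in the simple square-root throughput model (about 1.22/(RTT·√p) packets/sec) rather than the full model with timeouts, and the additive increase of one packet per RTT, capped at the model rate, is an assumption made for illustration.

```python
import math

def model_rate(p_loss, rtt):
    """TCP-friendly rate in packets/sec from the simple sqrt(3/2)/(rtt*sqrt(p)) model."""
    return math.sqrt(1.5) / (rtt * math.sqrt(p_loss))

def tfrc_step(rate, p_loss, rtt, increase_pkts=1.0):
    """One control step: clamp down to the model rate, otherwise probe upward."""
    target = model_rate(p_loss, rtt)
    if rate > target:
        return target                         # above the model: reduce to its rate
    # Assumed increase rule: one extra packet per RTT, never past the model rate.
    return min(rate + increase_pkts / rtt, target)
```

For p = 0.01 and RTT = 100 ms the model rate is about 122 packets/sec, so a sender at 200 packets/sec is cut to that value while a sender at 50 packets/sec moves up to 60.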
High speed TCP
TCP in high speed networks
Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83333 in-flight segments
Throughput in terms of loss rate: requires $p = 2 \cdot 10^{-10}$, or equivalently at most one drop every couple of hours
New versions of TCP for high-speed networks needed
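The numbers in this example can be reproduced with a few lines of arithmetic. The sketch assumes the simple model in which the average window equals √(3/2)/√p, and treats the required 83333 segments as that average window.

```python
def required_window(rate_bps, rtt_s, segment_bytes):
    """Segments that must be in flight to sustain rate_bps over rtt_s."""
    return rate_bps * rtt_s / (segment_bytes * 8)

def required_loss_rate(avg_window):
    """Invert avg_window = sqrt(3/2) / sqrt(p) to get the tolerable loss rate."""
    return 1.5 / avg_window ** 2

def seconds_between_drops(p_loss, rate_bps, segment_bytes):
    """Average time between drops if one in every 1/p packets is lost."""
    packets_per_second = rate_bps / (segment_bytes * 8)
    return 1.0 / (p_loss * packets_per_second)

w = required_window(10e9, 0.1, 1500)
p = required_loss_rate(w)
print(int(w), p, seconds_between_drops(p, 10e9, 1500) / 3600)
```

The window comes out at about 83333 segments and the loss rate at roughly 2·10^-10, which corresponds to one drop every hour or two at 10 Gbps.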
TCP's long recovery delay
More than an hour to recover from a loss or timeout
[Figure: window recovery after a loss; annotations: ~41000 packets, ~60000 RTTs, ~100 minutes]
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use a superlinear window increase
Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
  Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
  additive increase, multiplicative decrease
  increments and decrements depend on window size
Scalable TCP (STCP) [T. Kelly]
  multiplicative increase, multiplicative decrease
  W ← W + a per ACK
  W ← W − b·W per window with loss
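A minimal sketch of the STCP update rules above. The constants `a = 0.01` and `b = 0.125` are assumed example values, and the recovery estimate relies on the approximation that about W ACKs arrive per RTT, so the window grows by a factor (1 + a) each RTT.

```python
import math

def stcp_on_ack(w, a=0.01):
    """W <- W + a for every ACK received."""
    return w + a

def stcp_on_loss(w, b=0.125):
    """W <- W - b*W, applied once per window in which a loss occurred."""
    return w - b * w

def rtts_to_recover(a=0.01, b=0.125):
    """RTTs needed to regain the pre-loss window.

    Approximation: W ACKs per RTT give per-RTT growth by a factor (1 + a),
    so the answer does not depend on the window size.
    """
    return math.log(1.0 / (1.0 - b)) / math.log(1.0 + a)
```

Because the increase per RTT is proportional to the window, recovery takes the same number of RTTs at any window size, unlike the W/2 RTTs needed by standard additive increase.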
STCP dynamics
[Figure: from 1st PFLDnet Workshop, Tom Kelly]
Active Queue Management
Router Queue Management
normally, packets are dropped only when the queue overflows: "drop-tail" queueing
[Figure: router connected to the Internet; packets P1-P6 queued at the router's FCFS scheduler]
The case against drop-tail queue management
Large queues in routers are "a bad thing"
  Delay: end-to-end latency dominated by length of queues at switches in network
Allowing queues to overflow is "a bad thing"
  Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  Utilization: connections can synchronize their response to congestion
[Figure: packets P1-P6 at a router's FCFS scheduler]
Idea: early random packet drop
When queue length exceeds a threshold, drop packets with a queue-length-dependent probability
  probabilistic packet drop: flows see the same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
[Figure: packets P1-P6 at a router's FCFS scheduler]
Random early detection (RED) packet drop
Use an exponential average of the queue length to determine when to drop
  avoids overly penalizing short-term bursts
  reacts to longer-term trends
Tie the drop probability to the weighted average queue length
  avoids over-reaction to mild overload conditions
[Figure: average queue length over time against the min threshold, max threshold and max queue length; regions labeled no drop, probabilistic early drop, and forced drop; second axis: drop probability]
Random early detection (RED) packet drop
[Figure: average queue length over time against the min and max thresholds (no drop, probabilistic early drop, forced drop), together with a plot of drop probability vs. weighted average queue length, with min, max, max_p and 100% marked on the axes]
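The RED drop decision in the figure can be sketched as two small functions. The averaging weight of 0.002 is an assumed example value, and the linear ramp from 0 at the min threshold to max_p at the max threshold is the usual reading of the plot rather than something the slide states in text.

```python
def update_avg(avg, queue_len, weight=0.002):
    """Exponential average of the instantaneous queue length (weight is assumed)."""
    return (1 - weight) * avg + weight * queue_len

def red_drop_probability(avg, min_th, max_th, max_p):
    """Drop probability as a function of the weighted average queue length."""
    if avg < min_th:
        return 0.0                       # no drop
    if avg >= max_th:
        return 1.0                       # forced drop
    # Probabilistic early drop: ramp linearly from 0 up to max_p.
    return max_p * (avg - min_th) / (max_th - min_th)
```

A router would call `update_avg` on every arrival and drop the arriving packet with the probability returned by `red_drop_probability`; because the average moves slowly, a short burst barely changes the drop probability.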
RED summary: why random drop?
Provide a gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide the same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
[Figure: router state diagram with the following states and transitions]
prepare own routing update (time T_C)
receive update from neighbor, process it (time T_C2)
wait
  receive update from neighbor, process it
<ready>: send update (time T_d to arrive at dest)
start_timer (uniform: T_p +/- T_r)
timeout, or link fail update
time spent in a state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization
Choose the random timer component T_r large (e.g. several multiples of T_C)
[Figure: router state diagram: prepare own routing update (time T_C); receive update from neighbor, process it (time T_C2); wait; <ready>: send update (time T_d to arrive); start_timer (uniform: T_p +/- T_r)]
Add enough randomization to avoid synchronization
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport: TCP Nice
What are background transfers?
  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
Examples
  Prefetched traffic on the Web
  File system backup
  Large-scale data distribution services
  Background software updates
  Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
  Self-interference: applications hurt their own performance
  Cross-interference: applications hurt other applications' performance
TCP Nice
Goal: abstraction of free infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP?
TCP does network resource management
  Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
  Uses increasing RTT as congestion signal
  Congestion → increased queue lengths → increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm
  1. Early detection: a threshold on queue length, detected through the increase in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)
per-ack operation: if (curRTT > minRTT + threshold·(maxRTT − minRTT)) numCong++
per-round operation: if (numCong > f·W) W ← W/2, else ... (AIMD congestion control)
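The per-ack and per-round operations can be sketched as a small class. The values `threshold = 0.2` and `fraction = 0.5` are assumed examples, tracking minRTT and maxRTT from the samples themselves is an assumption, and the `else` branch uses plain additive increase as a stand-in for the AIMD logic the slide elides.

```python
class NiceState:
    """Sender-side state for the early-congestion test of TCP Nice (sketch)."""

    def __init__(self, cwnd, min_rtt, max_rtt, threshold=0.2, fraction=0.5):
        self.cwnd = cwnd
        self.min_rtt = min_rtt
        self.max_rtt = max_rtt
        self.threshold = threshold   # fraction of the RTT range (assumed value)
        self.fraction = fraction     # f: fraction of the window (assumed value)
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        # Keep the observed RTT range up to date (assumption: learned from samples).
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        cutoff = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > cutoff:
            self.num_cong += 1       # this ACK saw a building queue

    def on_round_end(self):
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd = self.cwnd / 2    # early multiplicative decrease; may go below 1
        else:
            self.cwnd += 1               # stand-in for the regular AIMD branch
        self.num_cong = 0
```

Since the window is a float that is halved without a floor, repeated congested rounds push `cwnd` below 1, which corresponds to sending less than one packet per RTT.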
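Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = τ, maxRTT = τ + B/μ
[Figure: queue occupancy (pkts) over time for Reno and Nice; labels: buffer size B, threshold t·B, service rate μ, additive (Add) and multiplicative (Mul) phases]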
Network Conditions
[Figure: foreground document latency (sec, log scale from 0.1 to 1e3) vs. spare capacity (1 to 100) for Reno, Vegas, V0, Nice and Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec, log scale from 0.1 to 1e3) vs. number of background flows (1 to 100) for Vegas, V0, Nice, Router Prio and Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: background throughput (KB, from 0 to 8e4) vs. number of background flows (1 to 100) for Router Prio, Vegas, V0, Reno and Nice]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
  Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as an optimization problem
  How to allocate resources (e.g. bandwidth) to optimize some objective function
Maybe not possible to obtain exact optimality, but...
  optimization framework as a means to explicitly steer the network towards a desirable operating point
  practical congestion control as distributed asynchronous implementations of an optimization algorithm
  systematic approach towards protocol design
Model
Network: links l, each of capacity c_l
Sources s: (L(s), U_s(x_s))
  L(s): links used by source s
  U_s(x_s): utility if source rate = x_s
[Figure: two links with capacities c_1 and c_2 carrying three sources with rates x_1, x_2, x_3]
$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$
[Figure: example utility function U_s(x_s) vs. x_s for an elastic application]
Q: What are possible allocations with, say, unit-capacity links?
Optimization Problem
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
"system" problem:
$\max_{x_s \ge 0} \sum_{s} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$
The user view
User can choose amount to pay per unit time, w_s
Would like allocated bandwidth x_s in proportion to w_s: $x_s = w_s / p_s$
p_s could be viewed as charge per unit flow for user s
user problem:
$\max \ U_s\!\left(\frac{w_s}{p_s}\right) - w_s$ subject to $w_s \ge 0$
(first term: user's utility; second term: cost)
The network view
Suppose network knows vector {w_s}, chosen by users
Network wants to maximize logarithmic utility function
network problem:
$\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
Solution existence
There exist prices p_s, source rates x_s, and amounts to pay per unit time w_s = p_s x_s such that
  {w_s} solves the user problem: $\max \ U_s\!\left(\frac{w_s}{p_s}\right) - w_s$ subject to $w_s \ge 0$
  {x_s} solves the network problem: $\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
  {x_s} is the unique solution to the system problem: $\max_{x_s \ge 0} \sum_{s} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$
Proportional Fairness
Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s^*},
$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$
Result: if w_r = 1, then {x_s} solves the network problem iff it is proportionally fair
A similar result exists for the case where w_r is not equal to 1
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
Minimum potential delay fairness
Rates {x_r} are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
Interpretation: if w_r is the file size, then w_r/x_r is the transfer time; the optimization problem is to minimize the sum of transfer delays
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
What is the corresponding utility function?
$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
Solving the network problem
Results so far: existence; a solution exists with the given properties
How to compute the solution?
  Ideally: a distributed solution easily embodied in a protocol
  Should reveal insight into existing protocols
Solving the network problem
$\frac{d}{dt} x_s(t) = k\left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$
  left-hand side: change in bandwidth allocation at s
  $w_s$ term: linear increase
  $x_s(t) \sum_{l \in L(s)} p_l(t)$ term: multiplicative decrease
where $p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$ is the congestion "signal": a function of the aggregate rate at link l, fed back to s
Solving the network problem
$\frac{d}{dt} x_s(t) = k\left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$
Results:
  converges to the solution of a relaxation of the network problem
  $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to w_s
Interpretation: a TCP-like algorithm that iteratively solves for the optimal rate allocation
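A small Euler-integration sketch of this rate-update equation on the earlier three-source, two-link example (source 0 crosses both links, sources 1 and 2 use one link each). The price function g_l(y) = (y/c_l)^20, the gain, the step size and the starting rates are all assumptions chosen for illustration; the slide does not specify g_l.

```python
def primal_algorithm(routes, capacities, weights,
                     k=1.0, dt=0.005, steps=20000, sharpness=20):
    """Integrate dx_s/dt = k (w_s - x_s * sum of p_l over the links of s).

    routes[s] lists the link indices used by source s.
    p_l = (load_l / c_l) ** sharpness is an assumed penalty-style price function.
    """
    x = [0.1] * len(routes)
    for _ in range(steps):
        # Aggregate rate on every link.
        load = [0.0] * len(capacities)
        for s, links in enumerate(routes):
            for l in links:
                load[l] += x[s]
        # Congestion signal generated by each link.
        price = [(load[l] / capacities[l]) ** sharpness
                 for l in range(len(capacities))]
        # Linear increase w_s, multiplicative decrease x_s * path price.
        x = [x[s] + dt * k * (weights[s] - x[s] * sum(price[l] for l in routes[s]))
             for s in range(len(routes))]
    return x

rates = primal_algorithm(routes=[[0, 1], [0], [1]],
                         capacities=[1.0, 1.0],
                         weights=[1.0, 1.0, 1.0])
print(rates)
```

With unit weights and unit capacities the rates settle near (1/3, 2/3, 2/3), the proportionally fair allocation; they differ slightly from it because the smooth price function only solves a relaxation of the network problem.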
Source Algorithm
Source needs only its path price:
$\dot{x}_r = k_r(x_r)\left( U_r'(x_r) - q_r \right)$
k_r(·): nonnegative, nondecreasing function
Above algorithm converges to the unique solution for any initial condition
q_r interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can network choose pricing scheme to achieve fair resource allocation
Suppose the network charges price q_r ($/bit), where $q_r = \sum_l p_l$ over the links l on r's path
User's strategy: spend w_r ($/sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose $x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$
then $U(x) = -\frac{1}{T x}$
interpretation: minimize (weighted) delay
Is AIMD special
Consider a window control as follows: cwnd += a·cwnd^n when no loss; cwnd −= b·cwnd^m when loss, where n < m
Expected change in congestion window
Expected change in rate per unit time
MIMD (n, m)
Consider the controller
where
Then at equilibrium
Where α = m-n For stability
Motivation
Congestion control: maximize user utility
  Given routing R_li, how to adapt end rate x_i?
Traffic engineering: minimize network congestion
  Given traffic x_i, how to perform routing R_li?
Congestion Control Model
$\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$, var. x
  objective: aggregate utility
  x_i: source rate
  U_i(x_i): utility
  constraints: capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
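$\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$, var. R
  u_l: link utilization
  f(u_l): cost
  objective: aggregate cost
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
[Figure: cost f(u_l) vs. link utilization u_l, with u_l = 1 marked]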
Model of Internet Reality
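[Figure: the two blocks below connected in a loop, exchanging x_i and R_li]
Congestion control: $\max \sum_i U_i(x_i)$, s.t. $\sum_i R_{li} x_i \le c_l$
Traffic engineering: $\min \sum_l f(u_l)$, s.t. $u_l = \sum_i R_{li} x_i / c_l$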
System Properties
Convergence
Does it achieve some objective?
Benchmark: $\max \sum_i U_i(x_i)$ s.t. $Rx \le c$, var. x, R
Utility gap between the joint system and the benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Components of TCP congestion control
Slow start: multiplicatively increase (double) the window
Congestion avoidance: additively increase the window (by 1 MSS)
Loss: multiplicatively decrease (halve) the window
Timeout: set cwnd to 1 MSS; multiplicatively increase (double) the retransmission timeout upon each further consecutive loss
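The four components can be sketched as a small state machine. This is a minimal illustration, not a faithful TCP implementation: the class name, the initial `ssthresh` and `rto` values, and the choice to cap slow start at `ssthresh` are assumptions made for the example, and the window is counted in segments rather than bytes.

```python
from dataclasses import dataclass

MSS = 1.0  # window counted in segments, so 1 MSS == 1.0


@dataclass
class RenoLikeSender:
    cwnd: float = 1.0        # congestion window, in MSS
    ssthresh: float = 64.0   # slow-start threshold (hypothetical initial value)
    rto: float = 1.0         # retransmission timeout in seconds (hypothetical)

    def on_rtt_without_loss(self) -> None:
        """One loss-free round trip."""
        if self.cwnd < self.ssthresh:
            # Slow start: double the window, capped at ssthresh.
            self.cwnd = min(self.cwnd * 2, self.ssthresh)
        else:
            # Congestion avoidance: add 1 MSS per round trip.
            self.cwnd += MSS

    def on_triple_dup_ack(self) -> None:
        """Loss signalled by duplicate ACKs: halve the window."""
        self.ssthresh = max(self.cwnd / 2, 2 * MSS)
        self.cwnd = self.ssthresh

    def on_timeout(self) -> None:
        """Timeout: restart from 1 MSS and double the timer."""
        self.ssthresh = max(self.cwnd / 2, 2 * MSS)
        self.cwnd = MSS
        self.rto *= 2
```

Calling `on_timeout()` twice in a row leaves `cwnd` at 1 MSS and quadruples `rto`, which is the exponential timer backoff described in the last bullet.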
Retransmission timeout estimation
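Calculate EstimatedRTT using a moving average
Calculate the deviation (DevRTT) with respect to the moving average
$\text{Timeout} = \text{EstimatedRTT} + 4 \cdot \text{DevRTT}$
$\text{EstimatedRTT}_i = (1-\alpha)\,\text{EstimatedRTT}_{i-1} + \alpha\,\text{SampleRTT}_i$
$\text{DevRTT}_i = (1-\beta)\,\text{DevRTT}_{i-1} + \beta\,|\text{SampleRTT}_i - \text{EstimatedRTT}_{i-1}|$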
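The update rules translate almost line for line into code. The sketch below assumes the commonly used gains α = 1/8 and β = 1/4 and initializes the deviation to half of the first sample; neither choice is stated on the slide, and the class name is made up for the example.

```python
class RttEstimator:
    """Moving-average RTT estimator following the slide's two update rules."""

    def __init__(self, first_sample: float, alpha: float = 0.125, beta: float = 0.25):
        self.alpha = alpha                  # gain for EstimatedRTT (assumed value)
        self.beta = beta                    # gain for DevRTT (assumed value)
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2     # assumed initialization

    def update(self, sample_rtt: float) -> float:
        # DevRTT uses the previous EstimatedRTT, so it is updated first.
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout()

    def timeout(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt
```

A single outlier sample moves the estimate only by a fraction α, but it inflates the deviation term, so the timeout widens quickly when RTTs become variable.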
TCP Throughput
TCP throughput: a very, very simple model
What is the average throughput of TCP as a function of window size and RTT T? Ignore slow start.
Let W be the window size when loss occurs.
When the window is W, throughput is W/T.
Just after a loss, the window drops to W/2 and throughput to W/(2T).
Average throughput: 3W/(4T)
TCP throughput: a very simple model
But what is W when loss occurs?
When the window is w and the queue holds q packets, TCP is sending at rate w/(T + q/C).
For maintaining utilization in steady state:
  Just before loss: rate = W/(T + Q/C) = C
  Just after loss: rate = W/(2T) = C
For Q = CT (a common rule of thumb for setting router buffer sizes), a loss occurs every (3/4 W)·Q = 3W²/8 packets.
Q = queue capacity in number of packets
C = link capacity in packets/sec
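Deriving TCP throughput/loss relationship
[Figure: TCP window size vs. time (in RTTs); a sawtooth oscillating between W/2 and W, one "period" per tooth]
packets sent per "period" =
$$\frac{W}{2} + \left(\frac{W}{2}+1\right) + \cdots + W = \sum_{n=0}^{W/2}\left(\frac{W}{2}+n\right)$$
$$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$$
$$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \frac{(W/2)(W/2+1)}{2}$$
$$= \frac{3}{8}W^2 + \frac{3}{4}W \approx \frac{3}{8}W^2$$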
Deriving TCP throughput/loss relationship
packets sent per "period" $\approx \frac{3}{8}W^2$
1 packet lost per "period" implies $p_{loss} \approx \frac{8}{3W^2}$, or $W = \sqrt{\frac{8}{3\,p_{loss}}}$
$B = \text{avg\_thruput} = \frac{3}{4}W$ packets/rtt
$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}}$ packets/rtt
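The sawtooth arithmetic is easy to check numerically. The sketch below sums the packets in one period directly and evaluates the resulting throughput formula; the function names are illustrative, and W is assumed even so that W/2 is an integer.

```python
import math


def packets_per_period(W: int) -> int:
    """Packets sent while the window grows from W/2 to W, one step per RTT."""
    half = W // 2
    return sum(half + n for n in range(half + 1))


def sawtooth_throughput(p_loss: float) -> float:
    """Average throughput in packets per RTT: B = (3/4) W with W = sqrt(8 / (3 p))."""
    W = math.sqrt(8.0 / (3.0 * p_loss))
    return 0.75 * W


W = 100
exact = packets_per_period(W)        # 3/8 W^2 + 3/4 W
approx = 3 * W * W / 8               # leading term only
print(exact, approx)
print(sawtooth_throughput(0.01))     # about 1.22 / sqrt(0.01)
```

The constant 1.22 is (3/4)·sqrt(8/3) = sqrt(3/2), which is why the code can be checked against `math.sqrt(1.5)`.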
Alternate fluid model
Rate of change of sending rate = (term inversely proportional to current rate, with probability 1 - p) - (term proportional to current rate, with probability p)
In steady state
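A hedged sketch of where the steady state lands, under assumptions that are not spelled out on the slide: each ACK arrives at rate x = w/T, a successful ACK increases the window by 1/w (so the rate by 1/(xT²)), and a loss halves the rate.

```latex
\begin{align*}
\frac{dx}{dt} &= x\left[(1-p)\,\frac{1}{x T^2} - p\,\frac{x}{2}\right]
              = \frac{1-p}{T^2} - \frac{p\,x^2}{2} \\
\frac{dx}{dt} = 0 \;&\Rightarrow\;
x = \frac{1}{T}\sqrt{\frac{2(1-p)}{p}} \approx \frac{1}{T}\sqrt{\frac{2}{p}}
\end{align*}
```

Under these assumptions the fluid model reproduces the 1/(T·sqrt(p)) scaling of the sawtooth model, with a constant of sqrt(2) instead of 1.22.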
TCP throughput: a better loss-rate-based "simple" model [PFTK]
With many flows, loss rate and delay are not affected much by a single TCP flow
TCP behavior is completely specified by the loss and delay pattern along the path (bounded by bottleneck capacity)
Given loss rate p and delay T, what is TCP's throughput B (packets/sec), taking timeouts into account?
What is PFTK modeling?
Independent loss probability p across rounds
Loss indicated by triple duplicate ACKs
Bursty loss within a round: if some packet is lost, all following packets in that round are also lost
Timeout if fewer than three duplicate ACKs are received
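The slides pose the question without reproducing the answer. As a sketch, the function below evaluates the commonly cited closed-form approximation from the PFTK paper; the parameter names `t0` (base retransmission timeout), `b` (packets acknowledged per ACK) and `w_max` (receiver window cap) come from that formula, not from the slide.

```python
import math


def pftk_throughput(p: float, rtt: float, t0: float,
                    b: int = 1, w_max: float = float("inf")) -> float:
    """Approximate steady-state TCP throughput in packets/sec.

    p: loss probability, rtt: round-trip time (s), t0: base timeout (s).
    """
    if p <= 0:
        return w_max / rtt
    # Triple-duplicate-ACK term plus timeout term.
    denom = (rtt * math.sqrt(2 * b * p / 3)
             + t0 * min(1.0, 3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p * p))
    return min(w_max / rtt, 1.0 / denom)


low = pftk_throughput(p=1e-4, rtt=0.1, t0=1.0)
high = pftk_throughput(p=0.1, rtt=0.1, t0=1.0)
print(low, high)
```

At low loss the timeout term is negligible and the result approaches 1.22/(T·sqrt(p)); at high loss the timeout term dominates and throughput falls much faster than the simple model predicts.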
PFTK empirical validation: low loss
PFTK empirical validation: high loss
Loss-based TCP
Evolution of loss-based TCP
Tahoe (without fast recovery)
Reno (triple duplicate ACKs + fast retransmit and fast recovery)
NewReno (Reno + better handling of multiple losses)
SACK (selective acknowledgment): common today
Q: what if loss is not due to congestion?
Delay-based TCP: Vegas
Uses delay as a signal of congestion
Idea: try to keep a small constant number of packets at the bottleneck queue
  Expected = W/BaseRTT
  Actual = W/CurRTT
  Diff = Expected - Actual
  Try to keep Diff between fixed thresholds (1 and 3)
More recent: FAST TCP, based on Vegas
Delay-based TCP is not widely used today
TCP-Friendliness
Can we try MyFavNew TCP? Well, is it TCP-friendly?
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
To coexist with TCP, it must impose the same long-term load on the network:
  No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
  Not significantly less long-term throughput, or it's not too useful
TCP friendly rate control (TFRC)
Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
Otherwise, increase the transmission rate
E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
Q: how to measure/use loss rate and RTT?
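A minimal sketch of the rate rule, assuming the simple sawtooth model B = 1.22/(T·sqrt(p)) as the throughput model and an increase of one packet per RTT per step. Both choices are illustrative; the actual TFRC specification uses a more detailed equation and a different increase rule.

```python
import math


def model_rate(p: float, rtt: float) -> float:
    """Simple TCP throughput model in packets/sec: sqrt(3/2) / (rtt * sqrt(p))."""
    return math.sqrt(1.5) / (rtt * math.sqrt(p))


def tfrc_like_step(rate: float, p: float, rtt: float) -> float:
    """One control step: never exceed the model's rate, otherwise probe upward."""
    target = model_rate(p, rtt)
    if rate > target:
        return target                        # reduce to the model's rate
    return min(rate + 1.0 / rtt, target)     # hypothetical additive increase


print(tfrc_like_step(1000.0, p=0.01, rtt=0.1))   # clamped down to the model
print(tfrc_like_step(50.0, p=0.01, rtt=0.1))     # increased by one packet per RTT
```

The measured inputs `p` and `rtt` are taken as given here; how to estimate them smoothly is exactly the open question the slide raises.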
High speed TCP
TCP in high speed networks
Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83333 in-flight segments
Throughput in terms of loss rate, B = 1.22/(RTT·√p) packets/sec, requires p = 2·10^-10, or equivalently at most one drop every couple of hours
New versions of TCP for high-speed networks are needed
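The numbers in this example follow from the simple throughput model. The sketch below redoes the arithmetic, using 1.22² ≈ 1.5 so that p ≈ 1.5/W²; the function names are illustrative.

```python
def required_window(throughput_bps: float, rtt_s: float, segment_bytes: int) -> float:
    """In-flight segments needed to sustain the target throughput."""
    return throughput_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(window: float) -> float:
    """Invert W/RTT = 1.22 / (RTT * sqrt(p)), using 1.22^2 ~= 1.5."""
    return 1.5 / window ** 2


rtt = 0.100
W = required_window(10e9, rtt, 1500)
p = required_loss_rate(W)
packets_per_sec = W / rtt
hours_between_drops = (1 / p) / packets_per_sec / 3600

print(round(W), p, hours_between_drops)
```

One loss per 1/p packets at W/RTT packets per second works out to roughly an hour and a half between drops, which is the "couple of hours" on the slide.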
TCP's long recovery delay
More than an hour to recover from a loss or timeout
[Figure labels: ~41000 packets; ~60000 RTTs, ~100 minutes]
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use a superlinear window increase
Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
  additive increase, multiplicative decrease
  increments and decrements depend on window size
Scalable TCP (STCP) [T. Kelly]
  multiplicative increase, multiplicative decrease
  W ← W + a per ACK; W ← W - b·W per window with loss
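Because a window of W packets produces about W ACKs per RTT, adding a per ACK grows the window by a factor of (1 + a) per RTT, so recovery time after a loss does not depend on W. The sketch below compares this with Reno-style recovery; the values a = 0.01 and b = 0.125 are assumed here as typical STCP settings and do not appear on the slide.

```python
import math


def stcp_rtts_to_recover(a: float = 0.01, b: float = 0.125) -> float:
    """RTTs for W(1-b) to grow back to W when W grows by (1+a) per RTT."""
    return math.log(1.0 / (1.0 - b)) / math.log(1.0 + a)


def reno_rtts_to_recover(W: float) -> float:
    """RTTs for W/2 to grow back to W at one segment per RTT."""
    return W / 2


print(stcp_rtts_to_recover())          # a handful of RTTs, for any W
print(reno_rtts_to_recover(83333))     # tens of thousands of RTTs
```

At a 100 ms RTT the Reno figure corresponds to more than an hour, while the STCP figure is on the order of a second.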
STCP dynamics
[Figure from the 1st PFLDnet Workshop, Tom Kelly]
Active Queue Management
Router Queue Management
Normally, packets are dropped only when the queue overflows: "drop-tail" queueing
[Figure: router connected to the Internet; packets P1 to P6 queued FCFS at the router's scheduler]
The case against drop-tail queue management
Large queues in routers are "a bad thing"
  Delay: end-to-end latency is dominated by the length of queues at switches in the network
Allowing queues to overflow is "a bad thing"
  Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  Utilization: connections can synchronize their response to congestion
Idea early random packet drop
When queue length exceeds a threshold, drop packets with a queue-length-dependent probability
  probabilistic packet drop: flows see the same loss rate
  problem: bursty traffic (a burst arrives when the queue is near the threshold) can be over-penalized
Random early detection (RED) packet drop
Use an exponential average of queue length to determine when to drop
  avoid overly penalizing short-term bursts
  react to longer-term trends
Tie drop probability to the weighted average queue length
  avoids over-reaction to mild overload conditions
[Figure: average queue length vs. time, with three regions: no drop (below min threshold), probabilistic early drop (between min and max threshold), forced drop (above max threshold, up to max queue length)]
[Figure: drop probability vs. weighted average queue length; zero below min, rising to max_p at max, then 100%]
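The two figures describe a piecewise drop-probability curve driven by a smoothed queue length. A minimal sketch, with an assumed averaging weight of 0.002 and illustrative function names (the full RED algorithm also adjusts the probability by the number of packets since the last drop, which is omitted here):

```python
def red_average(avg_q: float, instantaneous_q: float, weight: float = 0.002) -> float:
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg_q + weight * instantaneous_q


def red_drop_probability(avg_q: float, min_th: float, max_th: float, max_p: float) -> float:
    """Drop probability as a function of the weighted average queue length."""
    if avg_q < min_th:
        return 0.0                                       # no drop
    if avg_q >= max_th:
        return 1.0                                       # forced drop
    return max_p * (avg_q - min_th) / (max_th - min_th)  # probabilistic early drop


for q in (4, 10, 15):
    print(q, red_drop_probability(q, min_th=5, max_th=15, max_p=0.1))
```

Because the decision uses the average, a short burst that briefly fills the queue barely moves `avg_q` and is not penalized.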
RED summary: why random drop?
Provide a gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide the same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-to-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
[State diagram]
  prepare own routing update (time T_C)
  receive update from neighbor, process (time T_C2)
  wait; on receiving an update from a neighbor, process it
  <ready>: send update (time T_d to arrive at dest), then start_timer (uniform: T_p +/- T_r)
  on timeout or link failure: update
Time spent in a state depends on messages received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update is sent, relative to start of round
By t = 100000, all router rounds are of length 120
Synchronization, or lack thereof, depends on system parameters
Avoiding synchronization
Choose the random timer component T_r large (e.g. several multiples of T_C)
Add enough randomization to avoid synchronization
[State diagram as in "Router update operation", with start_timer (uniform: T_p +/- T_r)]
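The timer rule itself is a one-liner. The sketch below draws the next update delay uniformly from [T_p - T_r, T_p + T_r]; the helper that sets T_r to ten times T_C is a hypothetical reading of "several multiples", not a value given in the lecture.

```python
import random


def next_update_delay(t_p: float, t_r: float, rng=random) -> float:
    """Delay until the next routing update: uniform in [t_p - t_r, t_p + t_r]."""
    return rng.uniform(t_p - t_r, t_p + t_r)


def choose_jitter(t_c: float, multiple: float = 10.0) -> float:
    """Hypothetical rule of thumb: make T_r several multiples of T_C."""
    return multiple * t_c


rng = random.Random(0)
t_r = choose_jitter(t_c=0.1)
print([round(next_update_delay(120.0, t_r, rng), 3) for _ in range(5)])
```

With T_r much smaller than the per-update processing time, routers that happen to fire together stay together; a larger T_r breaks the clusters apart.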
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
Examples
  Prefetched traffic on the Web
  File system backup
  Large-scale data distribution services
  Background software updates
  Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
  Self-interference: applications hurt their own performance
  Cross-interference: applications hurt other applications' performance
TCP Nice
Goal: abstraction of free infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP?
TCP does network resource management
  Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: Can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
Uses increasing RTT as the congestion signal
  Congestion ⇒ increased queue lengths ⇒ increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm
  1. Early detection: threshold queue length ⇒ increase in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)
per-ack operation:
  if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
per-round operation:
  if (numCong > f * W) W ← W/2
  else ... AIMD congestion control
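The per-ack and per-round operations can be sketched as follows. The default values for `threshold` and `fraction` (the slide's f), the `+1` stand-in for the regular AIMD step, and the class name are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class NiceState:
    cwnd: float               # congestion window W, may fall below 1.0
    min_rtt: float
    max_rtt: float
    threshold: float = 0.2    # assumed fraction of the RTT range that signals congestion
    fraction: float = 0.5     # assumed f: share of the window that must see congestion
    num_cong: int = 0

    def on_ack(self, cur_rtt: float) -> None:
        """Per-ack operation: count ACKs whose RTT indicates a building queue."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self) -> None:
        """Per-round operation: back off early, otherwise behave like AIMD."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd = self.cwnd / 2     # multiplicative decrease without any loss
        else:
            self.cwnd += 1.0              # stand-in for the regular AIMD step
        self.num_cong = 0
```

Nothing clamps `cwnd` at 1.0, so repeated early-congestion rounds keep halving it, which is the "allow cwnd < 1.0" step of the basic algorithm.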
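Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
[Figure: queue length (pkts) vs. time t for Reno and Nice, with buffer size B and service rate µ; Reno alternates additive increase (Add) and multiplicative decrease (Mul), Nice applies its multiplicative decrease early; minRTT = τ, maxRTT = τ + B/µ]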
Network Conditions
[Figure: foreground document latency (sec, log scale 0.1 to 1e3) vs. spare capacity (1 to 100) for Reno, Vegas, V0, Nice, and Router Prio]
Nice causes low interference to foreground Web traffic, even when there isn't much spare capacity
Scalability
[Figure: document latency (sec, log scale 0.1 to 1e3) vs. number of background flows (1 to 100) for Vegas, V0, Nice, Router Prio, and Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: background throughput (KB, 0 to 8e4) vs. number of background flows (1 to 100) for Router Prio, Vegas, V0, Reno, and Nice]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from foreground traffic
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
  Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as an optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function?
Maybe not possible to obtain exact optimality, but...
  optimization framework as a means to explicitly steer the network towards a desirable operating point
  practical congestion control as distributed asynchronous implementations of an optimization algorithm
  systematic approach towards protocol design
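Model
Network: links l, each of capacity c_l
Sources s: (L(s), U_s(x_s))
  L(s): links used by source s
  U_s(x_s): utility if source rate = x_s
[Figure: two links of capacity c_1 and c_2; source 1 (rate x_1) uses both links, source 2 (rate x_2) uses the first, source 3 (rate x_3) uses the second]
$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$
[Figure: example utility function U_s(x_s) vs. x_s for an elastic application]
Q: What are possible allocations with, say, unit capacity links?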
Optimization Problem
"system" problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \;\; \forall l \in L$$
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
The user view
User can choose the amount to pay per unit time, w_s
Would like allocated bandwidth x_s in proportion to w_s: $x_s = \frac{w_s}{p_s}$
p_s could be viewed as the charge per unit flow for user s
user problem:
$$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
(first term: user's utility; second term: cost)
The network view
Suppose the network knows the vector {w_s} chosen by the users
Network wants to maximize a logarithmic utility function
network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
Solution existence
There exist prices p_s, source rates x_s, and amounts to pay per unit time w_s = p_s x_s such that
  {w_s} solves the user problem: $\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$ subject to $w_s \ge 0$
  {x_s} solves the network problem: $\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
  {x_s} is the unique solution to the system problem: $\max_{x_s \ge 0} \sum_{s} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \; \forall l \in L$
Proportional Fairness
A vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s^*},
$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$
Result: if w_r = 1, then {x_s} solves the network problem if and only if it is proportionally fair
A similar result exists for the case that w_r is not equal to 1
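As a worked check, take the two-link, three-source network from the model slide with unit capacities (x1 + x2 ≤ 1, x1 + x3 ≤ 1) and w_r = 1. Maximizing log x1 + log x2 + log x3 gives x = (1/3, 2/3, 2/3). The sketch below, with illustrative helper names, tests the proportional-fairness inequality for that allocation against randomly drawn feasible alternatives.

```python
import random

X_PF = (1 / 3, 2 / 3, 2 / 3)   # candidate proportionally fair allocation


def feasible(x, c1=1.0, c2=1.0) -> bool:
    """Source 1 uses both links; sources 2 and 3 use one link each."""
    return (all(v > 0 for v in x)
            and x[0] + x[1] <= c1 + 1e-12
            and x[0] + x[2] <= c2 + 1e-12)


def aggregate_proportional_change(x_star, x) -> float:
    """Sum over sources of (x*_s - x_s) / x_s."""
    return sum((a - b) / b for a, b in zip(x_star, x))


rng = random.Random(0)
worst = float("-inf")
for _ in range(1000):
    y1 = rng.uniform(0.01, 0.99)
    y = (y1, rng.uniform(0.001, 1 - y1), rng.uniform(0.001, 1 - y1))
    assert feasible(y)
    worst = max(worst, aggregate_proportional_change(y, X_PF))
print(worst)   # never positive
```

The long flow gets less than the short flows because it consumes capacity on two links; a max-min fair allocation would instead give all three sources 1/2.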
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
Minimum potential delay fairness
Rates {x_r} are minimum potential delay fair if $U_r(x_r) = -w_r/x_r$
Interpretation: if w_r is the file size, then w_r/x_r is the transfer time; the optimization problem is to minimize the sum of transfer delays
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
What is the corresponding utility function?
$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
Solving the network problem
Results so far: existence; a solution exists with the given properties
How to compute the solution?
  Ideally: a distributed solution easily embodied in a protocol
  Should reveal insight into existing protocols
Solving the network problem
$$\frac{d}{dt} x_s(t) = k\left(w_s - x_s(t) \sum_{l \in L(s)} p_l(t)\right)$$
where
$$p_l(t) = g_l\!\left(\sum_{s:\, l \in L(s)} x_s(t)\right)$$
left-hand side: change in bandwidth allocation at s
k w_s: linear increase
k x_s(t) Σ p_l(t): multiplicative decrease
p_l(t): congestion "signal", a function of the aggregate rate at link l, fed back to s
Solving the network problem
$$\frac{d}{dt} x_s(t) = k\left(w_s - x_s(t) \sum_{l \in L(s)} p_l(t)\right)$$
Results:
  converges to the solution of a relaxation of the network problem
  $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
Interpretation: a TCP-like algorithm to iteratively solve for the optimal rate allocation
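The rate update can be simulated with a simple Euler integration. The sketch below uses the two-link, three-source network from the model slide with unit capacities and w_s = 1; the price function g_l(y) = (y/c_l)^20, the gain k, the step size and the starting rates are all assumptions chosen for illustration.

```python
def link_prices(x, routes, capacities, steepness=20):
    """p_l = g_l(aggregate rate on link l); g_l is a hypothetical steep penalty."""
    loads = [0.0] * len(capacities)
    for rate, route in zip(x, routes):
        for l in route:
            loads[l] += rate
    return [(load / cap) ** steepness for load, cap in zip(loads, capacities)]


def primal_algorithm(routes, capacities, weights, k=0.5, dt=0.01, steps=20000):
    """Euler integration of dx_s/dt = k (w_s - x_s * sum_{l in L(s)} p_l)."""
    x = [0.1] * len(routes)
    for _ in range(steps):
        p = link_prices(x, routes, capacities)
        x = [
            max(xs + dt * k * (w - xs * sum(p[l] for l in route)), 1e-9)
            for xs, w, route in zip(x, weights, routes)
        ]
    return x


routes = [(0, 1), (0,), (1,)]     # source 1 crosses both links
capacities = [1.0, 1.0]
weights = [1.0, 1.0, 1.0]
rates = primal_algorithm(routes, capacities, weights)
print([round(r, 3) for r in rates])
```

Each source only needs the sum of prices along its own path, and the fixed point satisfies x_s Σ p_l = w_s. Because the prices come from a penalty function rather than hard constraints, the rates land near, not exactly on, the proportionally fair allocation (1/3, 2/3, 2/3); this is the "relaxation" in the results above.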
Source Algorithm
Source needs only its path price:
$$\dot{x}_r = k_r(x_r)\left(U_r'(x_r) - q_r\right)$$
k_r(·): nonnegative, nondecreasing function
Above algorithm converges to a unique solution for any initial condition
q_r interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
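One reading consistent with the network problem and the source algorithm, assuming the logarithmic utility of the network problem and a gain of the form k_r(x_r) = κ x_r (the constant κ is introduced here for illustration):

```latex
\begin{align*}
U_r(x_r) &= w_r \log x_r \quad\Rightarrow\quad U_r'(x_r) = \frac{w_r}{x_r} \\
\dot{x}_r &= \kappa\, x_r \left(\frac{w_r}{x_r} - q_r\right) = \kappa\,(w_r - x_r q_r)
\end{align*}
```

Under these assumptions the controller has the same form as the rate update used for solving the network problem, with q_r playing the role of the summed link prices.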
Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price q_r ($/bit), where q_r is the sum of the link prices p_l along r's path
User's strategy: spend w_r ($/sec) to maximize
Optimal User Strategy
equivalently
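A sketch of the missing expressions, assuming the same form as the user problem stated earlier with the path price q_r in place of p_s and a differentiable, concave utility:

```latex
\begin{align*}
&\max_{w_r \ge 0} \; U_r\!\left(\frac{w_r}{q_r}\right) - w_r \\
&\frac{1}{q_r}\,U_r'\!\left(\frac{w_r}{q_r}\right) - 1 = 0
\quad\Longleftrightarrow\quad
U_r'(x_r) = q_r, \qquad x_r = \frac{w_r}{q_r}
\end{align*}
```

In words: the user keeps spending until the marginal utility of an extra unit of rate equals its price, which is also the equilibrium condition of the source algorithm.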
Simplified TCP-Reno
suppose
$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$
then
$$U(x) = -\frac{1}{T x}$$
interpretation: minimize (weighted) delay
Is AIMD special
Consider a window control as follows:
  cwnd += a·cwnd^n when no loss
  cwnd -= b·cwnd^m when loss
  where n < m
Expected change in congestion window
Expected change in rate per unit time
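A small sketch of the equilibrium this family of controllers implies, assuming the updates are applied per ACK and each ACK signals loss independently with probability p. Setting the expected window change (1-p)·a·w^n - p·b·w^m to zero gives w^(m-n) = a(1-p)/(b·p); the function name is illustrative.

```python
import math


def equilibrium_window(p: float, a: float, b: float, n: float, m: float) -> float:
    """Window at which expected increase and expected decrease balance (n < m)."""
    return (a * (1 - p) / (b * p)) ** (1.0 / (m - n))


# AIMD per ACK: cwnd += 1/cwnd (n = -1), cwnd -= cwnd/2 (m = 1)
print(equilibrium_window(0.01, a=1.0, b=0.5, n=-1, m=1), math.sqrt(2 * 0.99 / 0.01))

# MIMD per ACK: cwnd += a (n = 0), cwnd -= b*cwnd (m = 1)
print(equilibrium_window(1e-4, a=0.01, b=0.125, n=0, m=1))
```

With n = -1 and m = 1 this reduces to w = sqrt(2(1-p)/p), the same 1/sqrt(p) dependence as the simplified Reno model, while n = 0 and m = 1 gives a window proportional to 1/p.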
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n
For stability
Motivation
Congestion Control: maximize user utility
  Given routing R_li, how to adapt end rate x_i?
Traffic Engineering: minimize network congestion
  Given traffic x_i, how to perform routing R_li?
Congestion Control Model
$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$
objective: aggregate utility; constraints: capacity constraints; variable: x
Source rate: x_i
Utility: U_i(x_i)
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
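$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \frac{\sum_i R_{li} x_i}{c_l}$$
objective: aggregate cost; variable: R
Link utilization: u_l
Cost: f(u_l)
[Figure: cost f(u_l) vs. link utilization u_l, with u_l = 1 marked]
Links are indexed by l
Traffic engineering avoids bottlenecks in the network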
Model of Internet Reality
Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$ (adapts x_i given R_li)
Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$ (adapts R_li given x_i)
System Properties
Convergence
Does it achieve some objective?
Benchmark: utility gap between the joint system and the benchmark
$$\max_{x,\,R} \sum_i U_i(x_i) \quad \text{s.t.} \quad Rx \le c$$
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Retransmission timeout estimation
Calculate EstimatedRTT using moving average
Calculate deviation wrt moving average
Timeout = EstimatedRTT + 4DevRTT
EstimatedRTTi = (1- α)EstimatedRTTi-1 + αSampleRTTi
DevRTTi = (1-β)DevRTTi-1 + β|SampleRTTi-EstimatedRTTi-1|
TCP Throughput
TCP throughput A very very simple model
Whatrsquos the average throughout of TCP as a function of window size and RTT T Ignore slow start Let W be the window size when loss occurs
When window is W throughput is WT Just after loss window drops to W2
throughput to W2T Average throughput 3W4T
TCP throughput A very simple model
But what is W when loss occurs
When window is w and queue has q packets TCP is
sending at rate w(T+qC) For maintaining utilization and steady state
Just before loss rate = W(T+QC) = C Just after loss rate = W2T = C For Q = CT (a common thumbrule to set router buffer
sizes) a loss occurs every frac14 (34W)Q = 3W28 packets
Q = queue capacity in number of packets
C = link capacity in packetssec
Deriving TCP throughputloss relationship
TCP window
size
time (rtt)
W2
W
period
sum=
+=++⎟⎠
⎞⎜⎝
⎛ ++2
0)
2(1
22
W
nnWWWW
sum=
+⎟⎠
⎞⎜⎝
⎛ +=2
021
2
W
nnWW
2)12(2
21
2+
+⎟⎠
⎞⎜⎝
⎛ +=WWWW
WW43
83 2 +=
packets sent per ldquoperiodrdquo =
2
83Wasymp
Deriving TCP throughputloss relationship
TCP window
size
time (rtt)
W2
W
period
packets sent per ldquoperiodrdquo 2
83Wasymp
1 packet lost per ldquoperiodrdquo implies ploss 23
8W
asymp or lossp
W38
=
rttpackets
43utavg_thrup WB ==
rttpackets221utavg_thrup
losspB ==
Alternate fluid model
Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p
In steady state
TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]
With many flows loss rate and delay are not affected much by a single TCP flow TCP behavior completely specified by loss
and delay pattern along path (bounded by bottleneck capacity)
Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account
What is PFTK modeling
Independent loss probability p across rounds Loss acute triple duplicate acks Bursty loss in a round if some packet lost
all following packets in that round also lost Timeout if lt three duplicate acks received
PFTK empirical validation Low loss
PFTK empirical validation High loss
Loss-based TCP
Evolution of loss-based TCP Tahoe (without fast retransmit) Reno (triple duplicate acks + fast
retransmit) NewReno (Reno + handling multiple losses
better) SACK (selective acknowledgment) common
today Q what if loss not due to congestion
Delay-based TCP Vegas
Uses delay as a signal of congestion Idea try to keep a small constant number of
packets at bottleneck queue Expected = WBaseRTT Actual = WCurRTT Diff = Expected - Actual Try to keep Diff between fixed 1 and 3
More recent FAST TCP based on Vegas Delay-based TCP not widely used today
TCP-Friendliness
Can we try MyFavNew TCP Well is it TCP-friendly
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP
To co-exist with TCP it must impose the same long-term load on the network No greater long-term throughput as a function of
packet loss and delay so TCP doesnt suffer Not significantly less long-term throughput or its
not too useful
TCP friendly rate control (TFRC)
Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm
If transmission rate is higher than that given by the model reduce the transmission rate to the models rate
Otherwise increase the transmission rate Eg DCCP (Datagram Congestion Control
Protocol) for unreliable congestion control Q how to measureuse loss rate and RTT
High speed TCP
TCP in high speed networks
Example 1500 byte segments 100ms RTT want 10 Gbps throughput
Requires window size W = 83333 in-flight segments Throughput in terms of loss rate
13 p = 210-10 or equivalently at most one drop every couple hours
New versions of TCP for high-speed networks needed
TCPrsquos long recovery delay
More than an hour to recover from a loss or timeout
~41000 packets
~60000 RTTs ~100 minutes
High-speed TCP
Proposals Scalable TCP HSTCP FAST CUBIC General idea is to use superlinear window
increase Particularly useful in high bandwidth-delay
product regimes
Alternate choices of response functions
Scalable TCP - S = 015p
Q Whatever happened to TCP-friendly
High speed TCP [Floyd]
additive increase multiplicative decrease
increments decrements depend on window size
Scalable TCP (STCP) [T Kelly]
multiplicative increase multiplicative decrease
W larr W + a per ACK W larr W ndash b W per window with loss
STCP dynamics
From 1st PFLDnet Workshop Tom Kelly13
Active Queue Management
Router Queue Management
normally packets dropped only when queue overflows ldquodrop-tailrdquo queueing
router Internet
P113P213P313P413P513P613FCFS13
Scheduler13
router
The case against drop-tail queue management
Large queues in routers are ldquoa bad thingrdquo Delay end-to-end latency dominated by length
of queues at switches in network Allowing queues to overflow is ldquoa bad thingrdquo
Fairness connections transmitting at high rates can starve connections transmitting at low rates
Utilization connections can synchronize their response to congestion
P113P213P313P413FCFS
Scheduler P513P613
Idea early random packet drop
When queue length exceeds threshold drop packets with queue length dependent probability probabilistic packet drop flows see same loss
rate problem bursty traffic (burst arrives when
queue is near threshold) can be over penalized
P113P213P313P413P513P613FCFS
Scheduler
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts react to longer term trends
Tie drop prob to weighted avg queue length avoids over-reaction to mild overload conditions
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
Random early detection (RED) packet drop
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
10013
Drop probability
maxp13
Weighted AverageQueue Length
min13 max13
RED summary why random drop
Provide gentle transition from no-drop to all-drop Provide ldquogentlerdquo early warning Avoid synchronized loss bursts among
sources Provide same loss rate to all sessions
With tail-drop low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed hellip
Why randomization important
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source Floyd Jacobson 1994
Router update operation
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail
update
time spent in state depends on msgs
received from others (weak coupling
between routers processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization Choose random
timer component Tr large (eg several multiples of TC)
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough
randomization to avoid
synchronization
Randomization
Takeaway message randomization makes a system simple and
robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP
TCP does network resource management Need flow prioritization
Alternative router prioritization + More responsive simple one bit priority Hard to deploy
Question Can end-to-end congestion control achieve non-
interference and utilization
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal Congestion incr queue lengths incr RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control Receiver and network unchanged TCP friendly
TCP Nice
Basic algorithm 1 Early Detection thresh queue length incr in RTT 2 Multiplicative decrease on early congestion 3 Allow cwnd lt 10 (despite no loss)
per-ack operation if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++
per-round operation if(numCong gt fW) W W2 else hellip AIMD congestion control
Nice the works
Non-interference getting out of the way in time Utilization maintaining a small queue
pkts
minRTT = τ13 maxRTT = τ+Βmicro13
B
tB Add Mul +
micro
Reno
Nice Add Add Add
Mul +
Mul +
Network Conditions
01
1
10
100
1e3
1 10 100 Fore
grou
nd D
ocum
ent L
aten
cy (s
ec)
Spare Capacity
Reno
Vegas
V0
Nice
Router Prio
Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity
Scalability
01
1
10
100
1e3
1 10 100
Doc
umen
t Lat
ency
(sec
)
Num BG flows
Vegas
V0
Nice
Router Prio
Reno
W lt 1 allows Nice to scale to any number of background flows
Utilization
0
2e4
4e4
6e4
8e4
1 10 100
BG
Thr
ough
put (
KB
)
Num BG flows
Router Prio
Vegas
V0
Reno
Nice
Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation
How to model the interaction between TCP and the network Recall PFTK like models assumed network
conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem How to allocate resources (eg bandwidth) to
optimize some objective function Maybe not possible to obtain exact optimality but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
c1 c2
Model Network Links l each of capacity cl Sources s (L(s) Us(xs))
L(s) - links used by source s Us(xs) - utility if source rate = xs
x1
x2 x3
121 cxx le+ 231 cxx le+
Us(xs)
xs
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose network knows vector ws chosen by users Network wants to maximize logarithmic utility function
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
network problem
Solution existence
There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that Ws solves user
problem Xs solves the
network problem Xs is the unique
solution to the system problem
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
0 wsubject to
w Umax
s
ss
ge
minus⎟⎟⎠
⎞⎜⎜⎝
⎛s
s
wp
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0
Proportional Fairness
Vector of rates xs proportionally fair if feasible and for any other feasible vector xs
0
leminus
sumisinSs s
ss
xxx
Result if wr=1 then Xs solves the network problem IFF it is proportionally fair
Similar result exists for the case that wr not equal 1
Max-min Fairness
Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
Minimum potential delay fairness
Rates xr are minimum potential delay fair if Ur (xr) = -wrxr
Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays
Max-min Fairness
rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
What is corresponding utility function
α
α
α minus=
minus
infinrarr 1lim)(
1r
rrxxU
Solving the network problem Results so far existence - solution exists
with given properties How to compute solution
Ideally distributed solution easily embodied in protocol
Should reveal insight into existing protocol
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
congestion ldquosignalrdquo function of aggregate rate at link l fed back to s
change in bandwidth
allocation at s
linear increase
multiplicative decrease
⎟⎟⎠
⎞⎜⎜⎝
⎛= sum
isin
)()()(txgtp
sLlsllwhere
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
Results converges to solution of relaxation of network
problem xs(t)Σpl(t) converges to ws
Interpretation TCP-like algorithm to iteratively solves optimal rate allocation
Source Algorithm
Source needs only its path price
kr() nonnegative nondecreasing function Above algorithm converges to unique
solution for any initial condition qr interpreted as lossmarking probability euro
˙ x r = kr (xr )(Ur (xr ) minus qr)
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
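Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price $q_r$ (in \$/bit), where $q_r = \sum_{l \in L(r)} p_l$
User's strategy: spend $w_r$ (in \$/sec) to maximize $U_r(w_r / q_r) - w_r$
Optimal User Strategy
$U_r'(w_r / q_r) = q_r$
equivalently $U_r'(x_r) = q_r$, with $x_r = w_r / q_r$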
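Simplified TCP-Reno
suppose
$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$
then, matching $U'(x) = p \approx \frac{2}{T^2 x^2}$,
$$U(x) = -\frac{2}{T^2 x}$$
interpretation: minimize (weighted) delay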
Is AIMD special?
Consider a window control as follows: cwnd += a * cwnd^n when no loss; cwnd -= b * cwnd^m when loss, where n < m
Expected change in congestion window
Expected change in rate per unit time
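A minimal sketch of the expected window change for this family of controllers, assuming the update is applied per ACK and that each ACK independently signals a loss with probability p (both assumptions, as are the function names). Setting the expected change to zero gives an equilibrium window that depends on m - n; with a = 1, n = -1, b = 1/2, m = 1 it reduces to the Reno-style $\sqrt{2(1-p)/p}$.

```python
import math


def expected_window_change(w, p, a, b, n, m):
    """Expected per-ACK change: +a*w**n with prob. 1-p, -b*w**m with prob. p."""
    return (1.0 - p) * a * w ** n - p * b * w ** m


def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero; requires n < m."""
    if not n < m:
        raise ValueError("need n < m for a finite equilibrium")
    alpha = m - n
    return (a * (1.0 - p) / (b * p)) ** (1.0 / alpha)


# AIMD as in Reno: +1/w per ACK (n = -1, a = 1), halve on loss (m = 1, b = 1/2).
p = 0.01
w_aimd = equilibrium_window(p, a=1.0, b=0.5, n=-1, m=1)
```

Dividing the equilibrium window by the round-trip time T gives the corresponding equilibrium rate.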
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n
For stability
Motivation
Congestion Control: maximize user utility
  Given routing $R_{li}$, how to adapt the end rate $x_i$?
Traffic Engineering: minimize network congestion
  Given traffic $x_i$, how to perform routing $R_{li}$?
Congestion Control Model
$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li}\, x_i \le c_l$$
objective: aggregate utility; constraints: capacity constraints; variable: x
(figure: utility $U_i(x_i)$ versus source rate $x_i$)
Users are indexed by i
Congestion control provides fair rate allocation amongst users
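Traffic Engineering Model
$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li}\, x_i / c_l$$
objective: aggregate cost; variable: R
(figure: cost $f(u_l)$ versus link utilization $u_l$, with $u_l = 1$ marked)
Links are indexed by l
Traffic engineering avoids bottlenecks in the network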
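Model of Internet Reality
Congestion Control: $\max_x \sum_i U_i(x_i)$ s.t. $\sum_i R_{li}\, x_i \le c_l$
Traffic Engineering: $\min_R \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li}\, x_i / c_l$
(the two interact through the rates $x_i$ and the routing $R_{li}$)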
System Properties
Convergence?
Does it achieve some objective?
Benchmark: $\max_{x, R} \sum_i U_i(x_i)$ s.t. $Rx \le c$
Utility gap between the joint system and the benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
TCP Throughput
TCP throughput: a very, very simple model
What's the average throughput of TCP as a function of window size and RTT T?
Ignore slow start
Let W be the window size when loss occurs
When the window is W, throughput is W/T
Just after loss, the window drops to W/2 and throughput to W/(2T)
Average throughput: 3W/(4T)
TCP throughput: a very simple model
But what is W when loss occurs?
When the window is w and the queue has q packets, TCP is sending at rate w/(T + q/C)
To maintain utilization in steady state:
  Just before loss: rate = W/(T + Q/C) = C
  Just after loss: rate = (W/2)/T = C
For Q = CT (a common rule of thumb for setting router buffer sizes), a loss occurs every (3W/4) * Q = 3W^2/8 packets
Q = queue capacity in number of packets
C = link capacity in packets/sec
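A small worked check of the relations above, as a sketch: given C and T, it computes Q = CT, the loss window W from (W/2)/T = C, and the number of packets between losses. The function name and the numbers used (1000 packets/sec, 100 ms) are illustrative assumptions.

```python
def sawtooth_parameters(capacity_pps, rtt_s):
    """Return (Q, W, packets_between_losses) for the simple model with Q = C*T."""
    queue = capacity_pps * rtt_s            # Q = C*T, the buffer-sizing rule of thumb
    window = 2.0 * capacity_pps * rtt_s     # from (W/2)/T = C
    packets_between_losses = 3.0 * window ** 2 / 8.0
    return queue, window, packets_between_losses


C, T = 1000.0, 0.1
Q, W, N = sawtooth_parameters(C, T)
rate_before_loss = W / (T + Q / C)   # should equal C when Q = C*T
```

With these numbers the link stays exactly full both just before and just after a loss, which is the point of choosing Q = CT.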
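Deriving TCP throughput/loss relationship
(figure: TCP window size versus time (RTT); sawtooth between W/2 and W, one "period" per loss)
packets sent per "period" =
$$\frac{W}{2} + \left(\frac{W}{2}+1\right) + \dots + W = \sum_{n=0}^{W/2}\left(\frac{W}{2}+n\right)$$
$$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$$
$$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \frac{\frac{W}{2}\left(\frac{W}{2}+1\right)}{2}$$
$$= \frac{3}{8}W^2 + \frac{3}{4}W \approx \frac{3}{8}W^2$$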
Deriving TCP throughput/loss relationship
packets sent per "period" $\approx \frac{3}{8}W^2$
1 packet lost per "period" implies $p_{loss} \approx \frac{8}{3W^2}$, or $W = \sqrt{\frac{8}{3\,p_{loss}}}$
$B = \text{avg\_thruput} = \frac{3}{4}W$ packets/rtt
$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}}$ packets/rtt
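The square-root relationship is easy to evaluate directly. A minimal sketch follows; the function names and the 1500-byte default segment size are assumptions for illustration, and the constant is written as $\frac{3}{4}\sqrt{8/3} = \sqrt{3/2} \approx 1.22$.

```python
import math


def window_at_loss(p_loss):
    """W from p_loss ~= 8 / (3 W^2)."""
    return math.sqrt(8.0 / (3.0 * p_loss))


def avg_throughput_pkts_per_rtt(p_loss):
    """B = (3/4) W = sqrt(3/2) / sqrt(p_loss), about 1.22 / sqrt(p_loss)."""
    return 0.75 * window_at_loss(p_loss)


def avg_throughput_bps(p_loss, rtt_s, segment_bytes=1500):
    """Same quantity in bits per second for an assumed segment size."""
    return avg_throughput_pkts_per_rtt(p_loss) * segment_bytes * 8 / rtt_s


b_one_percent = avg_throughput_pkts_per_rtt(0.01)
```

Halving the throughput of a flow under this model requires a fourfold increase in loss rate, since B scales as $1/\sqrt{p_{loss}}$.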
Alternate fluid model
Rate of change of sending rate = (term inversely proportional to current rate, with probability 1 - p) - (term proportional to current rate, with probability p)
In steady state
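One way to make the steady state explicit, as a hedged sketch: assume the per-ACK window update is $+1/w$ with probability $1-p$ and $-w/2$ with probability $p$, and that ACKs arrive at rate $w/T$. These update sizes are the usual Reno values and are assumptions here, since the slide only names the two terms.

```latex
\begin{align*}
\frac{dw}{dt} &= \frac{w}{T}\left[(1-p)\,\frac{1}{w} - p\,\frac{w}{2}\right] \\
0 &= (1-p)\,\frac{1}{w} - p\,\frac{w}{2}
  \;\Longrightarrow\;
  w = \sqrt{\frac{2(1-p)}{p}} \approx \sqrt{\frac{2}{p}} \\
x &= \frac{w}{T} \approx \frac{\sqrt{2}}{T\sqrt{p}} \approx \frac{1.41}{T\sqrt{p}}
\end{align*}
```

The constant ($\sqrt{2} \approx 1.41$) is close to, but not the same as, the 1.22 of the periodic-loss derivation, because the two models make different assumptions about when losses occur.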
TCP throughput: a better, loss-rate-based "simple" model [PFTK]
With many flows, loss rate and delay are not affected much by a single TCP flow
TCP behavior is completely specified by the loss and delay pattern along the path (bounded by bottleneck capacity)
Given loss rate p and delay T, what is TCP's throughput B (packets/sec), taking timeouts into account?
What is PFTK modeling?
Independent loss probability p across rounds
Loss indication: triple duplicate ACKs
Bursty loss in a round: if some packet is lost, all following packets in that round are also lost
Timeout if fewer than three duplicate ACKs are received
PFTK empirical validation: low loss
PFTK empirical validation: high loss
Loss-based TCP
Evolution of loss-based TCP
  Tahoe (without fast retransmit)
  Reno (triple duplicate ACKs + fast retransmit)
  NewReno (Reno + better handling of multiple losses)
  SACK (selective acknowledgment): common today
Q: what if loss is not due to congestion?
Delay-based TCP: Vegas
Uses delay as a signal of congestion
Idea: try to keep a small, constant number of packets at the bottleneck queue
  Expected = W/BaseRTT
  Actual = W/CurRTT
  Diff = Expected - Actual
  Try to keep Diff between fixed thresholds, 1 and 3
More recent: FAST TCP, based on Vegas
Delay-based TCP is not widely used today
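The Expected/Actual/Diff rule can be sketched as a once-per-RTT window update. This is an illustration under stated assumptions, not a full Vegas implementation: Diff is multiplied by BaseRTT so that it is measured in packets (which is what makes thresholds of 1 and 3 meaningful), and the window moves by one packet per RTT. The function name `vegas_update` is hypothetical.

```python
def vegas_update(window, base_rtt, cur_rtt, low=1.0, high=3.0):
    """Return the next window given the current one and RTT measurements."""
    expected = window / base_rtt            # rate if nothing were queued
    actual = window / cur_rtt               # rate actually achieved
    # Assumed scaling: estimated number of this flow's packets sitting in the queue.
    diff_pkts = (expected - actual) * base_rtt
    if diff_pkts < low:
        return window + 1                   # too little queued: speed up
    if diff_pkts > high:
        return window - 1                   # too much queued: back off
    return window                           # within the target band


idle = vegas_update(20, base_rtt=0.100, cur_rtt=0.100)
congested = vegas_update(20, base_rtt=0.100, cur_rtt=0.200)
steady = vegas_update(20, base_rtt=0.100, cur_rtt=0.111)
```

The three calls show the three regimes: no queueing (increase), heavy queueing (decrease), and roughly two packets queued (hold).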
TCP-Friendliness
Can we try MyFavNew TCP? Well, is it TCP-friendly?
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
To coexist with TCP, it must impose the same long-term load on the network
  No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
  Not significantly less long-term throughput, or it's not too useful
TCP friendly rate control (TFRC)
Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
  If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
  Otherwise, increase the transmission rate
E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
Q: how to measure/use loss rate and RTT?
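A minimal sketch of the rate rule just described, using the simple square-root model from this lecture as the TCP throughput model. Deployed TFRC uses a more detailed throughput equation that accounts for timeouts, so this is an illustration only. The function names, the default increase of one packet per RTT, and capping the increase at the model rate are assumptions.

```python
import math


def model_rate_pps(p_loss, rtt_s):
    """Simple TCP model: sqrt(3/2) / (RTT * sqrt(p)) packets per second."""
    return math.sqrt(1.5) / (rtt_s * math.sqrt(p_loss))


def tfrc_step(rate_pps, p_loss, rtt_s, increase_pps=None):
    """One control step: clamp down to the model rate, otherwise increase."""
    target = model_rate_pps(p_loss, rtt_s)
    if rate_pps > target:
        return target
    # Assumed increase policy: one extra packet per RTT, never beyond the model rate.
    step = increase_pps if increase_pps is not None else 1.0 / rtt_s
    return min(rate_pps + step, target)


target = model_rate_pps(0.01, 0.1)
```

For example, with 1% loss and a 100 ms RTT, a sender above the model rate is pulled down to it in one step, while a sender below it climbs by 10 packets/sec per step.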
High speed TCP
TCP in high speed networks
Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate (B = 1.22/sqrt(p) packets/rtt) gives p = 2 * 10^-10, or equivalently at most one drop every couple of hours
New versions of TCP for high-speed networks are needed
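The arithmetic behind this example can be reproduced in a few lines. This is a sketch using the lecture's square-root model; the function names are illustrative assumptions.

```python
def required_window(rate_bps, rtt_s, segment_bytes):
    """Segments that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(window_pkts):
    """Invert B = sqrt(3/2) / sqrt(p) packets per RTT, with B = window_pkts."""
    return 1.5 / window_pkts ** 2


rtt = 0.1
window = required_window(10e9, rtt, 1500)
loss = required_loss_rate(window)
packets_per_second = window / rtt
hours_between_drops = (1.0 / loss) / packets_per_second / 3600.0
print(round(window), loss, round(hours_between_drops, 2))
```

The loss rate comes out slightly above 2 * 10^-10, which at this packet rate corresponds to one drop every hour or two.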
TCP's long recovery delay
More than an hour to recover from a loss or timeout
(figure annotations: ~41,000 packets; ~60,000 RTTs, ~100 minutes)
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use superlinear window increase
Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
  Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
  additive increase, multiplicative decrease
  increments and decrements depend on window size
Scalable TCP (STCP) [T. Kelly]
  multiplicative increase, multiplicative decrease
  W ← W + a per ACK
  W ← W - bW per window with loss
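To see why the STCP rule helps at large windows, the sketch below compares recovery times after a single loss. Adding a per ACK grows the window by roughly a factor (1 + a) per RTT, so recovering from (1 - b)W back to W takes a number of RTTs that does not depend on W, whereas a Reno-style sender that halves and then adds one packet per RTT needs W/2 RTTs. The defaults a = 0.01 and b = 0.125 are values commonly quoted for Scalable TCP and are treated as assumptions here, since the slide does not give them.

```python
import math


def stcp_recovery_rtts(a=0.01, b=0.125):
    """RTTs for W*(1-b) to grow back to W at a factor of (1+a) per RTT."""
    return math.log(1.0 / (1.0 - b)) / math.log(1.0 + a)


def reno_recovery_rtts(window):
    """RTTs for W/2 to grow back to W at one packet per RTT."""
    return window / 2.0


stcp_rtts = stcp_recovery_rtts()
reno_rtts = reno_recovery_rtts(83333)
```

Under these assumptions STCP recovers in roughly a dozen RTTs regardless of window size, while the Reno-style recovery time grows linearly with W.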
STCP dynamics
From 1st PFLDnet Workshop, Tom Kelly
Active Queue Management
Router Queue Management
normally, packets are dropped only when the queue overflows: "drop-tail" queueing
(figure: router connected to the Internet; packets P1 to P6 queued at an FCFS scheduler)
The case against drop-tail queue management
Large queues in routers are "a bad thing"
  Delay: end-to-end latency dominated by length of queues at switches in the network
Allowing queues to overflow is "a bad thing"
  Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  Utilization: connections can synchronize their response to congestion
(figure: packets P1 to P6 arriving at a full FCFS scheduler queue)
Idea: early random packet drop
When queue length exceeds a threshold, drop packets with queue-length-dependent probability
  probabilistic packet drop: flows see the same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
(figure: packets P1 to P6 at an FCFS scheduler queue)
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop
  avoids overly penalizing short-term bursts
  reacts to longer-term trends
Tie drop probability to weighted average queue length
  avoids over-reaction to mild overload conditions
(figure: average queue length versus time, with three regions: no drop below the min threshold, probabilistic early drop between the min and max thresholds, forced drop above the max threshold, up to the max queue length; drop probability shown alongside)
Random early detection (RED) packet drop
(figure: the same queue-length regions, plus drop probability versus weighted average queue length, with min and max thresholds marked on the x-axis and max_p and 100% marked on the y-axis)
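The drop decision described on the RED slides can be sketched as two small functions: an exponentially weighted average of the queue length, and a drop probability that depends on where that average sits relative to the two thresholds. The linear ramp between the thresholds is the classic RED choice and is assumed here; the function names and the default averaging weight are illustrative assumptions, and refinements such as spacing drops by packet count are omitted.

```python
def ewma_queue(avg, instantaneous, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg + weight * instantaneous


def red_drop_probability(avg, min_th, max_th, max_p):
    """0 below min_th, linear ramp up to max_p at max_th, forced drop above."""
    if avg < min_th:
        return 0.0                      # no drop
    if avg >= max_th:
        return 1.0                      # forced drop
    return max_p * (avg - min_th) / (max_th - min_th)   # probabilistic early drop


p_mid = red_drop_probability(10.0, min_th=5.0, max_th=15.0, max_p=0.1)
```

A small averaging weight means a short burst barely moves the average, which is how RED avoids penalizing bursts while still reacting to sustained queue growth.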
RED summary: why random drop?
Provide gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
(state diagram; labels follow)
prepare own routing update (time TC)
receive update from neighbor, process (time TC2)
wait
receive update from neighbor, process
<ready> send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail: update
time spent in a state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update sent, relative to start of round
By t = 100,000, all router rounds are of length 120
Synchronization, or lack thereof, depends on system parameters
Avoiding synchronization
Choose random timer component Tr large (e.g., several multiples of TC)
Add enough randomization to avoid synchronization
(figure: router update state diagram as before, with start_timer (uniform Tp +- Tr))
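The randomized timer itself is a one-liner. The sketch below draws each router's next update delay uniformly from [Tp - Tr, Tp + Tr]; the function name and the numeric values are hypothetical and chosen only for illustration.

```python
import random


def next_update_delay(tp, tr, rng=random):
    """Draw the next timer value uniformly from [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)


# Assumed values: mean period of 30 s with 5 s of jitter on either side.
rng = random.Random(0)
delays = [next_update_delay(30.0, 5.0, rng) for _ in range(1000)]
```

Because every router draws its own delay independently, routers that happen to fire together in one round are unlikely to stay aligned in the next, provided Tr is large enough relative to the processing time.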
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport: TCP Nice
What are background transfers?
Data that humans are not waiting for
Non-deadline-critical
Unlimited demand
Examples
  Prefetched traffic on the Web
  File system backup
  Large-scale data distribution services
  Background software updates
  Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
  Self-interference: applications hurt their own performance
  Cross-interference: applications hurt other applications' performance
TCP Nice
Goal: abstraction of free infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP?
TCP does network resource management
Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: Can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal
  Congestion => increased queue lengths => increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm
  1. Early detection: threshold queue length => increase in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)
per-ack operation:
  if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++;
per-round operation:
  if (numCong > f * W) W = W / 2;
  else ... (AIMD congestion control)
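The per-ack and per-round steps can be sketched as a small class. This is an illustration of the outline above rather than the actual TCP Nice implementation: the class name, the defaults `threshold=0.2` and `fraction=0.5`, tracking minRTT and maxRTT inside `on_ack`, and using "add one packet per round" as a stand-in for the AIMD branch are all assumptions.

```python
class NiceWindow:
    """Sender-side window logic following the per-ack / per-round outline."""

    def __init__(self, window=10.0, threshold=0.2, fraction=0.5):
        self.window = window
        self.threshold = threshold    # position between minRTT and maxRTT
        self.fraction = fraction      # f: share of a window's ACKs that must look congested
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack operation: count ACKs whose RTT exceeds the early-detection limit."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round operation: halve on early congestion, else grow."""
        if self.num_cong > self.fraction * self.window:
            self.window /= 2.0        # may fall below 1.0, as the algorithm allows
        else:
            self.window += 1.0        # stand-in for the regular AIMD branch
        self.num_cong = 0


nice = NiceWindow(window=10.0)
nice.on_ack(0.1)                      # establishes minRTT
for _ in range(7):
    nice.on_ack(0.2)                  # queueing delay: counted as congested
nice.on_round_end()
after_congested_round = nice.window
for _ in range(5):
    nice.on_ack(0.1)                  # back to base RTT
nice.on_round_end()
after_quiet_round = nice.window
```

Repeated congested rounds keep halving the window with no floor at one packet, which is what lets many background flows share a bottleneck without building a queue.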
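Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = τ, maxRTT = τ + B/µ
(figure: queue occupancy in pkts versus time for Reno and for Nice, with buffer size B, threshold t*B, and service rate µ; additive-increase (Add) and multiplicative-decrease (Mul) phases marked)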
Network Conditions
(figure: foreground document latency (sec) versus spare capacity, log-log axes, for Reno, Vegas, V0, Nice, and Router Prio)
Nice causes low interference to foreground Web traffic, even when there isn't much spare capacity
Scalability
(figure: document latency (sec) versus number of BG flows, log-log axes, for Vegas, V0, Nice, Router Prio, and Reno)
W < 1 allows Nice to scale to any number of background flows
Utilization
(figure: BG throughput (KB) versus number of BG flows, for Router Prio, Vegas, V0, Reno, and Nice)
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
  Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as an optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function?
  Maybe not possible to obtain exact optimality, but...
optimization framework as a means to explicitly steer the network towards a desirable operating point
practical congestion control as distributed asynchronous implementations of an optimization algorithm
systematic approach towards protocol design
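Model
Network: links l, each of capacity $c_l$
Sources s: $(L(s), U_s(x_s))$
  L(s): links used by source s
  $U_s(x_s)$: utility if source rate = $x_s$
(figure: two links with capacities c1 and c2; rate x1 crosses both links, x2 uses the first link, x3 uses the second)
$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$
(figure: example utility function $U_s(x_s)$ versus $x_s$ for an elastic application)
Q: What are possible allocations with, say, unit capacity links?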
Optimization Problem
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \quad \forall l \in L$$
("system" problem)
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
The user view
User can choose amount to pay per unit time, $w_s$
Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$
$p_s$ could be viewed as charge per unit flow for user s
$$\max_{w_s \ge 0}\ U_s\!\left(\frac{w_s}{p_s}\right) - w_s$$
(first term: user's utility; second term: cost)
(user problem)
The network view
Suppose the network knows the vector $\{w_s\}$ chosen by users
Network wants to maximize a logarithmic utility function
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
(network problem)
Solution existence
There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
  $\{w_s\}$ solves the user problem
  $\{x_s\}$ solves the network problem
  $\{x_s\}$ is the unique solution to the system problem
user problem: $\max_{w_s \ge 0}\ U_s(w_s / p_s) - w_s$
network problem: $\max_{x_s \ge 0} \sum_s w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
system problem: $\max_{x_s \ge 0} \sum_s U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l,\ \forall l \in L$
Proportional Fairness
Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,
$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$
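The inequality can be checked numerically for a small case. The sketch below assumes the three-source, two-link network with unit capacities (x1 + x2 <= 1, x1 + x3 <= 1); there, maximizing log x1 + log x2 + log x3 gives (1/3, 2/3, 2/3), while the equal split (1/2, 1/2, 1/2) is the max-min fair point. The function name `proportional_fairness_gap` is hypothetical.

```python
def proportional_fairness_gap(x, other):
    """Sum over sources of (other_s - x_s) / x_s.

    If x is proportionally fair, this is <= 0 for every feasible `other`.
    """
    return sum((o - xs) / xs for xs, o in zip(x, other))


prop_fair = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
max_min = [0.5, 0.5, 0.5]

gap_vs_max_min = proportional_fairness_gap(prop_fair, max_min)
gap_vs_interior = proportional_fairness_gap(prop_fair, [0.2, 0.5, 0.5])
gap_from_max_min = proportional_fairness_gap(max_min, prop_fair)
```

Moving from the max-min point to (1/3, 2/3, 2/3) yields a positive aggregate proportional change, so the max-min allocation is not proportionally fair on this network, whereas no feasible move away from (1/3, 2/3, 2/3) does.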
TCP throughput A very very simple model
Whatrsquos the average throughout of TCP as a function of window size and RTT T Ignore slow start Let W be the window size when loss occurs
When window is W throughput is WT Just after loss window drops to W2
throughput to W2T Average throughput 3W4T
TCP throughput A very simple model
But what is W when loss occurs
When window is w and queue has q packets TCP is
sending at rate w(T+qC) For maintaining utilization and steady state
Just before loss rate = W(T+QC) = C Just after loss rate = W2T = C For Q = CT (a common thumbrule to set router buffer
sizes) a loss occurs every frac14 (34W)Q = 3W28 packets
Q = queue capacity in number of packets
C = link capacity in packetssec
Deriving TCP throughputloss relationship
TCP window
size
time (rtt)
W2
W
period
sum=
+=++⎟⎠
⎞⎜⎝
⎛ ++2
0)
2(1
22
W
nnWWWW
sum=
+⎟⎠
⎞⎜⎝
⎛ +=2
021
2
W
nnWW
2)12(2
21
2+
+⎟⎠
⎞⎜⎝
⎛ +=WWWW
WW43
83 2 +=
packets sent per ldquoperiodrdquo =
2
83Wasymp
Deriving TCP throughputloss relationship
TCP window
size
time (rtt)
W2
W
period
packets sent per ldquoperiodrdquo 2
83Wasymp
1 packet lost per ldquoperiodrdquo implies ploss 23
8W
asymp or lossp
W38
=
rttpackets
43utavg_thrup WB ==
rttpackets221utavg_thrup
losspB ==
Alternate fluid model
Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p
In steady state
TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]
With many flows loss rate and delay are not affected much by a single TCP flow TCP behavior completely specified by loss
and delay pattern along path (bounded by bottleneck capacity)
Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account
What is PFTK modeling
Independent loss probability p across rounds Loss acute triple duplicate acks Bursty loss in a round if some packet lost
all following packets in that round also lost Timeout if lt three duplicate acks received
PFTK empirical validation Low loss
PFTK empirical validation High loss
Loss-based TCP
Evolution of loss-based TCP Tahoe (without fast retransmit) Reno (triple duplicate acks + fast
retransmit) NewReno (Reno + handling multiple losses
better) SACK (selective acknowledgment) common
today Q what if loss not due to congestion
Delay-based TCP Vegas
Uses delay as a signal of congestion Idea try to keep a small constant number of
packets at bottleneck queue Expected = WBaseRTT Actual = WCurRTT Diff = Expected - Actual Try to keep Diff between fixed 1 and 3
More recent FAST TCP based on Vegas Delay-based TCP not widely used today
TCP-Friendliness
Can we try MyFavNew TCP Well is it TCP-friendly
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP
To co-exist with TCP it must impose the same long-term load on the network No greater long-term throughput as a function of
packet loss and delay so TCP doesnt suffer Not significantly less long-term throughput or its
not too useful
TCP friendly rate control (TFRC)
Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm
If transmission rate is higher than that given by the model reduce the transmission rate to the models rate
Otherwise increase the transmission rate Eg DCCP (Datagram Congestion Control
Protocol) for unreliable congestion control Q how to measureuse loss rate and RTT
High speed TCP
TCP in high speed networks
Example 1500 byte segments 100ms RTT want 10 Gbps throughput
Requires window size W = 83333 in-flight segments Throughput in terms of loss rate
13 p = 210-10 or equivalently at most one drop every couple hours
New versions of TCP for high-speed networks needed
TCPrsquos long recovery delay
More than an hour to recover from a loss or timeout
~41000 packets
~60000 RTTs ~100 minutes
High-speed TCP
Proposals Scalable TCP HSTCP FAST CUBIC General idea is to use superlinear window
increase Particularly useful in high bandwidth-delay
product regimes
Alternate choices of response functions
Scalable TCP - S = 015p
Q Whatever happened to TCP-friendly
High speed TCP [Floyd]
additive increase multiplicative decrease
increments decrements depend on window size
Scalable TCP (STCP) [T Kelly]
multiplicative increase multiplicative decrease
W larr W + a per ACK W larr W ndash b W per window with loss
STCP dynamics
From 1st PFLDnet Workshop Tom Kelly13
Active Queue Management
Router Queue Management
normally packets dropped only when queue overflows ldquodrop-tailrdquo queueing
router Internet
P113P213P313P413P513P613FCFS13
Scheduler13
router
The case against drop-tail queue management
Large queues in routers are ldquoa bad thingrdquo Delay end-to-end latency dominated by length
of queues at switches in network Allowing queues to overflow is ldquoa bad thingrdquo
Fairness connections transmitting at high rates can starve connections transmitting at low rates
Utilization connections can synchronize their response to congestion
P113P213P313P413FCFS
Scheduler P513P613
Idea early random packet drop
When queue length exceeds threshold drop packets with queue length dependent probability probabilistic packet drop flows see same loss
rate problem bursty traffic (burst arrives when
queue is near threshold) can be over penalized
P113P213P313P413P513P613FCFS
Scheduler
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts react to longer term trends
Tie drop prob to weighted avg queue length avoids over-reaction to mild overload conditions
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
Random early detection (RED) packet drop
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
10013
Drop probability
maxp13
Weighted AverageQueue Length
min13 max13
RED summary why random drop
Provide gentle transition from no-drop to all-drop Provide ldquogentlerdquo early warning Avoid synchronized loss bursts among
sources Provide same loss rate to all sessions
With tail-drop low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed hellip
Why randomization important
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source Floyd Jacobson 1994
Router update operation
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail
update
time spent in state depends on msgs
received from others (weak coupling
between routers processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization Choose random
timer component Tr large (eg several multiples of TC)
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough
randomization to avoid
synchronization
Randomization
Takeaway message randomization makes a system simple and
robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP
TCP does network resource management Need flow prioritization
Alternative router prioritization + More responsive simple one bit priority Hard to deploy
Question Can end-to-end congestion control achieve non-
interference and utilization
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal Congestion incr queue lengths incr RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control Receiver and network unchanged TCP friendly
TCP Nice
Basic algorithm 1 Early Detection thresh queue length incr in RTT 2 Multiplicative decrease on early congestion 3 Allow cwnd lt 10 (despite no loss)
per-ack operation if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++
per-round operation if(numCong gt fW) W W2 else hellip AIMD congestion control
Nice the works
Non-interference getting out of the way in time Utilization maintaining a small queue
pkts
minRTT = τ13 maxRTT = τ+Βmicro13
B
tB Add Mul +
micro
Reno
Nice Add Add Add
Mul +
Mul +
Network Conditions
01
1
10
100
1e3
1 10 100 Fore
grou
nd D
ocum
ent L
aten
cy (s
ec)
Spare Capacity
Reno
Vegas
V0
Nice
Router Prio
Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity
Scalability
01
1
10
100
1e3
1 10 100
Doc
umen
t Lat
ency
(sec
)
Num BG flows
Vegas
V0
Nice
Router Prio
Reno
W lt 1 allows Nice to scale to any number of background flows
Utilization
0
2e4
4e4
6e4
8e4
1 10 100
BG
Thr
ough
put (
KB
)
Num BG flows
Router Prio
Vegas
V0
Reno
Nice
Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation
How to model the interaction between TCP and the network Recall PFTK like models assumed network
conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem How to allocate resources (eg bandwidth) to
optimize some objective function Maybe not possible to obtain exact optimality but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
c1 c2
Model Network Links l each of capacity cl Sources s (L(s) Us(xs))
L(s) - links used by source s Us(xs) - utility if source rate = xs
x1
x2 x3
121 cxx le+ 231 cxx le+
Us(xs)
xs
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose network knows vector ws chosen by users Network wants to maximize logarithmic utility function
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
network problem
Solution existence
There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that Ws solves user
problem Xs solves the
network problem Xs is the unique
solution to the system problem
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
0 wsubject to
w Umax
s
ss
ge
minus⎟⎟⎠
⎞⎜⎜⎝
⎛s
s
wp
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0
Proportional Fairness
Vector of rates xs proportionally fair if feasible and for any other feasible vector xs
0
leminus
sumisinSs s
ss
xxx
Result if wr=1 then Xs solves the network problem IFF it is proportionally fair
Similar result exists for the case that wr not equal 1
Max-min Fairness
Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
Minimum potential delay fairness
Rates xr are minimum potential delay fair if Ur (xr) = -wrxr
Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays
Max-min Fairness
rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
What is corresponding utility function
α
α
α minus=
minus
infinrarr 1lim)(
1r
rrxxU
Solving the network problem Results so far existence - solution exists
with given properties How to compute solution
Ideally distributed solution easily embodied in protocol
Should reveal insight into existing protocol
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
congestion ldquosignalrdquo function of aggregate rate at link l fed back to s
change in bandwidth
allocation at s
linear increase
multiplicative decrease
⎟⎟⎠
⎞⎜⎜⎝
⎛= sum
isin
)()()(txgtp
sLlsllwhere
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
Results converges to solution of relaxation of network
problem xs(t)Σpl(t) converges to ws
Interpretation TCP-like algorithm to iteratively solves optimal rate allocation
Source Algorithm
Source needs only its path price
kr() nonnegative nondecreasing function Above algorithm converges to unique
solution for any initial condition qr interpreted as lossmarking probability euro
˙ x r = kr (xr )(Ur (xr ) minus qr)
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
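Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price $q_r$ (\$/bit), where $q_r = \sum_{l} p_l$ over the links on the route
User's strategy: spend $w_r$ (\$/sec) to maximize
$$U_r\left(\frac{w_r}{q_r}\right) - w_r$$
Optimal User Strategy
$$U_r'\left(\frac{w_r}{q_r}\right) = q_r$$
equivalently
$$U_r'(x_r) = q_r$$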
Simplified TCP-Reno
suppose
$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$
then
$$U(x) = -\frac{1}{T x}$$
interpretation: minimize (weighted) delay
Is AIMD special?
Consider a window control as follows:
  cwnd += a * cwnd^n when no loss
  cwnd -= b * cwnd^m when loss
  where n < m
Expected change in congestion window
Expected change in rate per unit time
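A minimal sketch of the expected window change for this family, assuming each update sees a loss independently with probability p (that independence, and the function names, are assumptions). Setting the expected change to zero gives the equilibrium window.

```python
def expected_window_change(cwnd, p, a, b, n, m):
    """E[change] = (1 - p) * a * cwnd**n - p * b * cwnd**m for loss probability p."""
    return (1 - p) * a * cwnd ** n - p * b * cwnd ** m


def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero (requires n < m and 0 < p < 1)."""
    return (a * (1 - p) / (b * p)) ** (1.0 / (m - n))
```

For example, per-ACK AIMD corresponds to a = 1, n = -1, b = 1/2, m = 1, and the equilibrium window is then sqrt(2(1-p)/p), which matches the window form of the simplified TCP-Reno rate (rate times T).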
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n
For stability
Motivation
Congestion control: maximize user utility
  Given routing $R_{li}$, how to adapt end rate $x_i$?
Traffic engineering: minimize network congestion
  Given traffic $x_i$, how to perform routing $R_{li}$?
Congestion Control Model
$$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l \qquad \text{var. } x$$
Objective: aggregate utility, with source rate $x_i$ and utility $U_i(x_i)$
Constraints: capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
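Traffic Engineering Model
$$\min \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l \qquad \text{var. } R$$
Objective: aggregate cost, with link utilization $u_l$ and cost $f(u_l)$
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]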
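Model of Internet Reality
Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$
The two interact through the rates $x_i$ and the routing $R_{li}$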
System Properties
Convergence
Does it achieve some objective?
Benchmark: $\max \sum_i U_i(x_i)$ s.t. $Rx \le c$, var. $x$, $R$
Utility gap between the joint system and benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
TCP throughput: A very simple model
But what is W when loss occurs?
When the window is $w$ and the queue has $q$ packets, TCP is sending at rate $w / (T + q/C)$
For maintaining utilization and steady state:
  Just before loss: rate $= W / (T + Q/C) = C$
  Just after loss: rate $= (W/2) / T = C$
For $Q = CT$ (a common rule of thumb to set router buffer sizes), $W = 2CT = 2Q$, so a loss occurs every $\frac{3}{4} W \cdot Q = \frac{3}{8} W^2$ packets
Q = queue capacity in number of packets
C = link capacity in packets/sec
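The arithmetic of this simple model can be checked with a few lines. The function name and the example numbers (1000 packets/sec, T = 0.1 sec) are illustrative assumptions.

```python
def simple_model(capacity_pps, rtt_s):
    """Sawtooth quantities when the buffer is sized to Q = C * T."""
    q = capacity_pps * rtt_s           # buffer size Q = CT, in packets
    w_max = capacity_pps * rtt_s + q   # from W / (T + Q/C) = C, so W = CT + Q
    pkts_between_losses = 0.75 * w_max * q  # (3/4) W * Q, equal to 3 W^2 / 8 here
    return q, w_max, pkts_between_losses
```

With C = 1000 packets/sec and T = 0.1 sec, this gives Q = 100, W = 200, and 15000 packets between losses.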
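Deriving TCP throughput/loss relationship
[Figure: TCP window size vs. time (rtt); sawtooth between W/2 and W, one "period" per sawtooth]
$$\frac{W}{2} + \left(\frac{W}{2}+1\right) + \cdots + W = \sum_{n=0}^{W/2} \left(\frac{W}{2}+n\right)$$
$$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$$
$$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \frac{(W/2)(W/2+1)}{2}$$
$$= \frac{3}{8}W^2 + \frac{3}{4}W$$
packets sent per "period" $\approx \frac{3}{8}W^2$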
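Deriving TCP throughput/loss relationship
[Figure: TCP window size vs. time (rtt); sawtooth between W/2 and W, one "period" per sawtooth]
packets sent per "period" $\approx \frac{3}{8}W^2$
1 packet lost per "period" implies $p_{loss} \approx \frac{8}{3W^2}$, or $W = \sqrt{\frac{8}{3\,p_{loss}}}$
$$B = \text{avg\_thruput} = \frac{3}{4} W \ \text{packets/rtt}$$
$$B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}} \ \text{packets/rtt}$$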
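The two relations above translate directly into code; the function names are illustrative.

```python
import math


def window_from_loss(p_loss):
    """Peak window W = sqrt(8 / (3 p)), from one loss per 3 W^2 / 8 packets."""
    return math.sqrt(8.0 / (3.0 * p_loss))


def throughput_pkts_per_rtt(p_loss):
    """Average throughput B = (3/4) W, about 1.22 / sqrt(p), in packets per RTT."""
    return 0.75 * window_from_loss(p_loss)
```

For a 1% loss rate this gives a peak window of about 16.3 packets and an average of about 12.2 packets per RTT.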
Alternate fluid model
Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p
In steady state
TCP throughput: A better loss-rate-based "simple" model [PFTK]
With many flows, loss rate and delay are not affected much by a single TCP flow
TCP behavior is completely specified by the loss and delay pattern along the path (bounded by bottleneck capacity)
Given loss rate p and delay T, what is TCP's throughput B (packets/sec), taking timeouts into account?
What is PFTK modeling?
Independent loss probability p across rounds
Loss indicated by triple duplicate ACKs
Bursty loss in a round: if some packet is lost, all following packets in that round are also lost
Timeout if fewer than three duplicate ACKs are received
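The slides do not reproduce the resulting formula. The sketch below uses the approximate closed form commonly cited from the PFTK paper (Padhye, Firoiu, Towsley, Kurose); the retransmission timeout `t0` and the number of packets acknowledged per ACK `b` are parameters of that published model, not of this transcript, and the receiver-window cap is omitted.

```python
import math


def pftk_throughput(p, rtt, t0, b=1):
    """Approximate TCP throughput in packets/sec, including timeouts.

    p:   loss probability (0 < p < 1)
    rtt: round-trip time in seconds
    t0:  retransmission timeout in seconds (parameter of the published model)
    b:   packets acknowledged per ACK (parameter of the published model)
    """
    fast_retransmit_term = rtt * math.sqrt(2 * b * p / 3)
    timeout_term = t0 * min(1.0, 3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2)
    return 1.0 / (fast_retransmit_term + timeout_term)
```

At low loss the timeout term is negligible and the result approaches 1.22 / (rtt * sqrt(p)); at high loss the timeout term dominates and throughput falls well below the simple model.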
PFTK empirical validation: Low loss
PFTK empirical validation: High loss
Loss-based TCP
Evolution of loss-based TCP
  Tahoe (without fast retransmit)
  Reno (triple duplicate acks + fast retransmit)
  NewReno (Reno + handling multiple losses better)
  SACK (selective acknowledgment): common today
Q: what if loss is not due to congestion?
Delay-based TCP: Vegas
Uses delay as a signal of congestion
Idea: try to keep a small constant number of packets at the bottleneck queue
  Expected = W/BaseRTT
  Actual = W/CurRTT
  Diff = Expected - Actual
  Try to keep Diff between fixed 1 and 3
More recent: FAST TCP, based on Vegas
Delay-based TCP is not widely used today
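A minimal sketch of the Vegas window adjustment described above. It assumes Diff is converted to packets by multiplying by BaseRTT, that the window moves by one packet per RTT, and that the thresholds are the 1 and 3 from the slide; the function and parameter names are illustrative.

```python
def vegas_update(cwnd, base_rtt, cur_rtt, alpha=1, beta=3):
    """Return the next window given the current RTT measurement."""
    expected = cwnd / base_rtt             # rate if nothing were queued
    actual = cwnd / cur_rtt                # rate actually achieved
    diff = (expected - actual) * base_rtt  # estimated packets sitting in the queue
    if diff < alpha:
        return cwnd + 1   # too little queued: probe for more bandwidth
    if diff > beta:
        return cwnd - 1   # too much queued: back off before loss occurs
    return cwnd           # within the target band: hold
```

For example, with a 20-packet window, BaseRTT of 100 ms, and CurRTT of 200 ms, about 10 packets are queued, so the window shrinks to 19.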
TCP-Friendliness
Can we try MyFavNew TCP? Well, is it TCP-friendly?
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
To coexist with TCP, it must impose the same long-term load on the network:
  No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
  Not significantly less long-term throughput, or it's not too useful
TCP friendly rate control (TFRC)
Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
  If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
  Otherwise, increase the transmission rate
E.g., DCCP (Datagram Congestion Control Protocol) for unreliable, congestion-controlled transport
Q: how to measure/use loss rate and RTT?
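The rate rule above can be sketched as follows. This is not the standardized TFRC algorithm: it assumes the simple 1.22 / (RTT * sqrt(p)) model as the TCP throughput model, and the increase step of one packet per RTT, capped at the model rate, is a hypothetical choice since the slide does not specify it.

```python
import math


def model_rate(p, rtt):
    """Simple TCP model: about 1.22 / (rtt * sqrt(p)) packets per second."""
    return 1.22 / (rtt * math.sqrt(p))


def tfrc_step(rate, p, rtt):
    """One adjustment of the sending rate (packets/sec) given measured p and RTT."""
    target = model_rate(p, rtt)
    if rate > target:
        return target  # never exceed what a TCP flow would get
    # Hypothetical increase policy: one more packet per RTT, capped at the model rate.
    return min(target, rate + 1.0 / rtt)
```

With p = 1% and RTT = 100 ms the model rate is about 122 packets/sec, so a sender at 2000 packets/sec drops to 122 and a sender at 50 packets/sec steps up to 60.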
High speed TCP
TCP in high speed networks
Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83333 in-flight segments
Throughput in terms of loss rate: $p = 2 \cdot 10^{-10}$, or equivalently at most one drop every couple of hours
New versions of TCP for high-speed networks needed
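The numbers in this example can be reproduced from the 1.22 / sqrt(p) relation. The sketch assumes that W here is the average window needed to sustain the target rate; the constant names are illustrative.

```python
SEGMENT_BITS = 1500 * 8   # 1500-byte segments
RTT_S = 0.1               # 100 ms round-trip time
TARGET_BPS = 10e9         # 10 Gbps

# Segments that must be in flight to fill the pipe.
window = TARGET_BPS * RTT_S / SEGMENT_BITS

# Loss rate at which B = 1.22 / sqrt(p) packets per RTT equals that window.
loss_rate = (1.22 / window) ** 2

# One loss per 1/p packets, sent at `window` packets per RTT.
rtts_between_losses = 1.0 / (loss_rate * window)
hours_between_losses = rtts_between_losses * RTT_S / 3600
```

This gives a window of about 83333 segments, a loss rate of about 2.1e-10, and roughly 1.6 hours between losses.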
TCP's long recovery delay
More than an hour to recover from a loss or timeout
[Figure: window recovery after a loss; annotations: ~41000 packets, ~60000 RTTs, ~100 minutes]
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use superlinear window increase
Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
  Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
  additive increase, multiplicative decrease
  increments, decrements depend on window size
Scalable TCP (STCP) [T. Kelly]
  multiplicative increase, multiplicative decrease
  W ← W + a per ACK
  W ← W - bW per window with loss
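A rough sketch of why the STCP rule gives a recovery time that does not depend on the window size. It approximates the per-ACK increase as W growing by a*W per RTT (about W ACKs arrive per RTT), and it assumes a = 0.01 and b = 0.125, values commonly cited for Scalable TCP but not given on the slide.

```python
def stcp_recovery_rtts(w, a=0.01, b=0.125):
    """RTTs needed to climb back to window w after one multiplicative decrease."""
    target = w
    w = w * (1 - b)      # W <- W - bW on a window with loss
    rtts = 0
    while w < target:
        w += a * w       # about w ACKs per RTT, each adding a
        rtts += 1
    return rtts


def reno_recovery_rtts(w):
    """Reno halves the window, then regains one segment per RTT."""
    return int(w / 2)
```

Under these assumptions STCP recovers in the same number of RTTs for a window of 100 or of 100000, while the Reno recovery time grows linearly with the window.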
STCP dynamics
From 1st PFLDnet Workshop, Tom Kelly
Active Queue Management
Router Queue Management
normally, packets dropped only when queue overflows: "drop-tail" queueing
[Figure: two routers connected across the Internet; packets P1-P6 wait in an FCFS queue served by the scheduler]
The case against drop-tail queue management
Large queues in routers are "a bad thing"
  Delay: end-to-end latency dominated by length of queues at switches in network
Allowing queues to overflow is "a bad thing"
  Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  Utilization: connections can synchronize their response to congestion
[Figure: packets P1-P4 in the FCFS queue at the scheduler; P5 and P6 arriving]
Idea: early random packet drop
When queue length exceeds threshold, drop packets with queue-length-dependent probability
  probabilistic packet drop: flows see same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
[Figure: packets P1-P6 in the FCFS queue at the scheduler]
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop
  avoid overly penalizing short-term bursts
  react to longer-term trends
Tie drop probability to weighted average queue length
  avoids over-reaction to mild overload conditions
[Figure: average queue length vs. time, up to the max queue length; forced drop above the max threshold, probabilistic early drop between the min and max thresholds, no drop below the min threshold]
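Random early detection (RED) packet drop
[Figure: average queue length vs. time, up to the max queue length; forced drop above the max threshold, probabilistic early drop between the min and max thresholds, no drop below the min threshold]
[Figure: drop probability vs. weighted average queue length; min and max marked on the queue-length axis, max_p and 100% on the drop-probability axis]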
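The drop rule in the figure can be sketched as below. The linear ramp between the thresholds and the averaging weight follow the usual textbook description of RED and are assumptions here, as are the class and parameter names; the sketch also omits the inter-drop spacing that full RED implementations add.

```python
import random


class RedQueue:
    """Minimal RED drop decision based on an exponentially weighted average."""

    def __init__(self, min_th, max_th, max_p, weight=0.002):
        self.min_th = min_th    # below this average: never drop
        self.max_th = max_th    # at or above this average: always drop
        self.max_p = max_p      # drop probability just below max_th
        self.weight = weight    # EWMA weight given to the newest sample
        self.avg = 0.0

    def update_average(self, queue_len):
        """Fold the instantaneous queue length into the running average."""
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def drop_probability(self):
        """0 below min_th, linear ramp up to max_p, forced drop from max_th."""
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def should_drop(self, queue_len, rng=random):
        """Update the average for an arriving packet and decide whether to drop it."""
        self.update_average(queue_len)
        return rng.random() < self.drop_probability()
```

Because the decision uses the average rather than the instantaneous queue length, a short burst that briefly exceeds the min threshold barely moves the drop probability.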
RED summary: why random drop?
Provide gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
[Figure: router state diagram with the following states and transitions]
  prepare own routing update (time T_C)
  receive update from neighbor, process (time T_C2)
  wait: receive update from neighbor, process
  <ready>: send update (time T_d to arrive at dest), start_timer (uniform T_p +- T_r)
  timeout, or link fail update
time spent in state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization
Choose random timer component T_r large (e.g., several multiples of T_C)
Add enough randomization to avoid synchronization
[Figure: router state diagram: prepare own routing update (time T_C); receive update from neighbor, process (time T_C2); wait; <ready>: send update (time T_d to arrive), start_timer (uniform T_p +- T_r)]
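The timer step in the diagram, start_timer (uniform T_p +- T_r), amounts to drawing each period uniformly around the nominal value. A minimal sketch, with illustrative parameter names:

```python
import random


def next_update_delay(tp, tr, rng=random):
    """Delay until the next routing update: uniform on [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)
```

A larger `tr` spreads the routers' timers further apart each round, which is what breaks up the weak coupling that otherwise pulls them into lockstep.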
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport: TCP Nice
What are background transfers?
Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
Examples
  Prefetched traffic on the Web
  File system backup
  Large-scale data distribution services
  Background software updates
  Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
  Self-interference: applications hurt their own performance
  Cross-interference: applications hurt other applications' performance
TCP Nice
Goal: abstraction of free infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP?
TCP does network resource management
  Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: Can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal
  Congestion -> increased queue lengths -> increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm
  1. Early detection: threshold queue length -> increase in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)
per-ack operation:
  if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++;
per-round operation:
  if (numCong > f * W) W = W / 2;
  else ... AIMD congestion control
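The per-ack and per-round operations can be sketched as a small sender-side class. The default `threshold` and `fraction` values, the way minRTT and maxRTT are tracked, and the reduction of the AIMD branch to a plain additive increase are all assumptions made for illustration; loss handling and sub-packet windows are left out.

```python
class NiceSender:
    """Sketch of TCP Nice's delay-based early congestion detection."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold   # position between minRTT and maxRTT that counts as congested
        self.fraction = fraction     # f: share of the window that must look congested
        self.cwnd = cwnd
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack operation: count ACKs whose RTT suggests a building queue."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round operation: halve on early congestion, else additive increase."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2
        else:
            self.cwnd += 1   # stand-in for the usual AIMD congestion control
        self.num_cong = 0
```

The sender backs off as soon as more than a fraction f of a round's ACKs show elevated delay, which is well before a drop-tail queue would overflow and signal loss to foreground flows.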
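Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = τ, maxRTT = τ + B/μ
[Figure: queue length (pkts, up to buffer size B) vs. time t for Reno and Nice, with additive (Add) and multiplicative (Mul) phases marked; μ is the bottleneck service rate]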
Network Conditions
[Figure: foreground document latency (sec, 0.1 to 1e3) vs. spare capacity (1 to 100), for Reno, Vegas, V0, Nice, and Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec, 0.1 to 1e3) vs. number of BG flows (1 to 100), for Vegas, V0, Nice, Router Prio, and Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: BG throughput (KB, 0 to 8e4) vs. number of BG flows (1 to 100), for Router Prio, Vegas, V0, Reno, and Nice]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
  Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function?
  Maybe not possible to obtain exact optimality, but...
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
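Model
Network: links $l$, each of capacity $c_l$
Sources $s$: $(L(s), U_s(x_s))$
  $L(s)$ - links used by source $s$
  $U_s(x_s)$ - utility if source rate = $x_s$
[Figure: two links with capacities $c_1$ and $c_2$ carrying sources with rates $x_1$, $x_2$, $x_3$]
$x_1 + x_2 \le c_1$, $\quad x_1 + x_3 \le c_2$
[Figure: example utility function $U_s(x_s)$ vs. $x_s$ for an elastic application]
Q: What are possible allocations with, say, unit capacity links?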
Optimization Problem
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L \qquad \text{("system" problem)}$$
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
The user view
User can choose amount to pay per unit time, $w_s$
Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$
$p_s$ could be viewed as charge per unit flow for user $s$
$$\max \ U_s\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0 \qquad \text{(user problem)}$$
The first term is the user's utility; the second term is the cost
The network view
Suppose network knows vector $\{w_s\}$, chosen by users
Network wants to maximize logarithmic utility function
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \qquad \text{(network problem)}$$
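Solution existence
There exist prices $p_s$, source rates $x_s$, and amount-to-pay-per-unit-time $w_s = p_s x_s$ such that
  $\{w_s\}$ solves the user problem
  $\{x_s\}$ solves the network problem
  $\{x_s\}$ is the unique solution to the system problem
Network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
User problem:
$$\max \ U_s\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
System problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$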
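Deriving TCP throughput/loss relationship
[Figure: TCP window size versus time (rtt), a sawtooth oscillating between W/2 and W; one sawtooth is a "period"]
packets sent per "period" \( \approx \frac{3}{8} W^2 \)
1 packet lost per "period" implies \( p_{loss} \approx \frac{8}{3 W^2} \), or \( W = \sqrt{\frac{8}{3 p_{loss}}} \)
\[ B = \text{avg\_thruput} = \frac{3}{4} W \ \text{packets/rtt} \]
\[ B = \text{avg\_thruput} = \frac{1.22}{\sqrt{p_{loss}}} \ \text{packets/rtt} \]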
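The arithmetic in this derivation can be checked with a short sketch. It assumes the idealized sawtooth in which the window grows by one packet per RTT from W/2 to W; the function names are illustrative.

```python
import math


def packets_per_period(w):
    # One window of packets is sent per RTT while the window climbs from W/2 to W.
    return sum(range(w // 2, w + 1))


def avg_throughput(p_loss):
    # One loss per period: W = sqrt(8 / (3 p)); average window is 3W/4 packets per RTT.
    w = math.sqrt(8.0 / (3.0 * p_loss))
    return 0.75 * w


ratio = packets_per_period(1000) / (3.0 / 8.0 * 1000 ** 2)
```

The constant 1.22 is sqrt(3/2) rounded, which is why `avg_throughput(p)` tracks 1.22 / sqrt(p) closely.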
Alternate fluid model
Rate of change of sending rate =
  term inversely proportional to current rate, with probability (1-p)
  minus term proportional to current rate, with probability p
In steady state
TCP throughput: a better loss-rate-based "simple" model [PFTK]
With many flows, loss rate and delay are not affected much by a single TCP flow
TCP behavior is completely specified by the loss and delay pattern along the path (bounded by bottleneck capacity)
Given loss rate p and delay T, what is TCP's throughput B in packets/sec, taking timeouts into account?
What is PFTK modeling
Independent loss probability p across rounds
Loss is indicated by triple duplicate acks
Bursty loss in a round: if some packet is lost, all following packets in that round are also lost
Timeout if fewer than three duplicate acks are received
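The slides pose the question without printing the resulting expression. The sketch below uses the commonly cited approximate form of the PFTK formula. The retransmission timeout `t0` and the number of packets acknowledged per ACK `b` are parameters of that formula rather than of these slides, and the receiver-window cap is omitted.

```python
import math


def pftk_throughput(p, rtt, t0, b=2):
    """Approximate PFTK throughput in packets/sec for loss rate p > 0."""
    if p <= 0:
        raise ValueError("loss rate must be positive")
    # Time attributed to triple-duplicate-ack loss recovery.
    td_term = rtt * math.sqrt(2.0 * b * p / 3.0)
    # Extra time attributed to timeouts.
    to_term = t0 * min(1.0, 3.0 * math.sqrt(3.0 * b * p / 8.0)) * p * (1.0 + 32.0 * p * p)
    return 1.0 / (td_term + to_term)
```

With `t0 = 0` and `b = 1` the expression reduces to sqrt(3/(2p)) / rtt, the simple loss-based model derived earlier. A nonzero `t0` lowers the predicted throughput, most visibly at high loss rates.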
PFTK empirical validation: low loss
PFTK empirical validation: high loss
Loss-based TCP
Evolution of loss-based TCP
  Tahoe (without fast retransmit)
  Reno (triple duplicate acks + fast retransmit)
  NewReno (Reno + handling multiple losses better)
  SACK (selective acknowledgment), common today
Q: what if loss is not due to congestion?
Delay-based TCP: Vegas
Uses delay as a signal of congestion
Idea: try to keep a small constant number of packets at the bottleneck queue
  Expected = W/BaseRTT
  Actual = W/CurRTT
  Diff = Expected - Actual
  Try to keep Diff between fixed thresholds 1 and 3
More recent: FAST TCP, based on Vegas
Delay-based TCP not widely used today
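The Vegas rule above can be sketched as a once-per-RTT window update. Scaling Diff by BaseRTT so that it is measured in packets, and the step of one packet per RTT, are assumptions made here so that the thresholds 1 and 3 read as packet counts.

```python
def vegas_update(w, base_rtt, cur_rtt, alpha=1.0, beta=3.0):
    expected = w / base_rtt           # rate if nothing were queued
    actual = w / cur_rtt              # rate actually achieved
    diff = (expected - actual) * base_rtt  # estimated packets sitting in the queue
    if diff < alpha:
        return w + 1                  # too little queued: probe for more bandwidth
    if diff > beta:
        return w - 1                  # too much queued: back off
    return w                          # within the target band: hold
```

For example, with a 100 ms base RTT and a window of 10, a measured RTT of 200 ms corresponds to about 5 queued packets and the window shrinks.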
TCP-Friendliness
Can we try MyFavNew TCP? Well, is it TCP-friendly?
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
To coexist with TCP, it must impose the same long-term load on the network
  No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
  Not significantly less long-term throughput, or it's not too useful
TCP friendly rate control (TFRC)
Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
  If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
  Otherwise increase the transmission rate
E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
Q: how to measure/use loss rate and RTT?
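A rough sketch of the rule described above, using the simple 1.22 / (RTT sqrt(p)) throughput model from earlier in the lecture as the reference. The additive `increase` step is a hypothetical choice; the slide only says the rate is increased.

```python
import math


def model_rate(p, rtt):
    """Simple TCP throughput model in packets/sec."""
    return 1.22 / (rtt * math.sqrt(p))


def tfrc_step(rate, p, rtt, increase=1.0):
    target = model_rate(p, rtt)
    if rate > target:
        return target          # sending faster than TCP would: drop to the model's rate
    return rate + increase     # otherwise probe upward (hypothetical additive step)
```

How p and RTT are measured and smoothed is left open here, as in the slide's closing question.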
High speed TCP
TCP in high speed networks
Example: 1500 byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate gives \( p = 2 \cdot 10^{-10} \), or equivalently at most one drop every couple of hours
New versions of TCP for high-speed networks needed
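The numbers in this example can be reproduced with a few lines. The sketch assumes W is the average window and inverts the simple relation W = 1.22 / sqrt(p) from the earlier derivation; the function names are illustrative.

```python
def required_window(rate_bps, rtt_s, segment_bytes):
    # Bandwidth-delay product expressed in segments.
    return rate_bps * rtt_s / (segment_bytes * 8)


def required_loss_rate(window):
    # Invert W = 1.22 / sqrt(p).
    return (1.22 / window) ** 2


w = required_window(10e9, 0.1, 1500)
p = required_loss_rate(w)
# Average time between drops: one drop per 1/p packets at w / rtt packets per second.
seconds_between_drops = (1.0 / p) / (w / 0.1)
```

This gives roughly 83,333 segments, a loss rate near 2e-10, and about an hour and a half between drops.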
TCP's long recovery delay
More than an hour to recover from a loss or timeout
[Figure: window versus time after a loss; annotations: ~41,000 packets, ~60,000 RTTs, ~100 minutes]
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use superlinear window increase
Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
  additive increase, multiplicative decrease
  increments, decrements depend on window size
Scalable TCP (STCP) [T. Kelly]
  multiplicative increase, multiplicative decrease
  W ← W + a per ACK
  W ← W - b·W per window with loss
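A brief sketch of the STCP update and its recovery time. The constants a = 0.01 and b = 0.125 are the values usually quoted for Scalable TCP; the slides do not state them, so treat them as assumptions.

```python
import math


def stcp_on_ack(w, a=0.01):
    return w + a


def stcp_on_loss(w, b=0.125):
    return w - b * w


def rtts_to_recover(a=0.01, b=0.125):
    # About w ACKs arrive per RTT, so the window grows by a factor (1 + a) per RTT.
    # Recovery means growing from (1 - b) * w back to w.
    return math.log(1.0 / (1.0 - b)) / math.log(1.0 + a)
```

The recovery time comes out to about 13 RTTs regardless of window size, in contrast to the W/2 RTTs that additive increase needs after halving.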
STCP dynamics
[Figure: STCP dynamics. From 1st PFLDnet Workshop, Tom Kelly]
Active Queue Management
Router Queue Management
normally, packets dropped only when queue overflows: "drop-tail" queueing
[Figure: router with a FCFS scheduler serving queued packets P1 to P6, connected to the Internet]
The case against drop-tail queue management
Large queues in routers are "a bad thing"
  Delay: end-to-end latency dominated by length of queues at switches in network
Allowing queues to overflow is "a bad thing"
  Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  Utilization: connections can synchronize their response to congestion
[Figure: FCFS scheduler queue with packets P1 to P6]
Idea: early random packet drop
When queue length exceeds threshold, drop packets with queue-length-dependent probability
  probabilistic packet drop: flows see same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
[Figure: FCFS scheduler queue with packets P1 to P6]
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop
  avoid overly penalizing short-term bursts
  react to longer-term trends
Tie drop probability to weighted average queue length
  avoids over-reaction to mild overload conditions
[Figure: average queue length versus time, up to the max queue length. Below the min threshold: no drop. Between the min and max thresholds: probabilistic early drop. Above the max threshold: forced drop.]
[Figure: drop probability versus weighted average queue length; axis marks: min, max, max_p, 100%]
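A minimal sketch of the two RED ingredients described above: the exponentially weighted average and the drop-probability curve. The linear ramp between the thresholds and the default averaging weight are assumptions here; the slides show the curve only as a figure.

```python
def ewma(avg, sample, weight=0.002):
    # Exponentially weighted moving average of the instantaneous queue length.
    return (1 - weight) * avg + weight * sample


def red_drop_probability(avg_q, min_th, max_th, max_p):
    if avg_q < min_th:
        return 0.0    # no drop
    if avg_q >= max_th:
        return 1.0    # forced drop
    # Probabilistic early drop: rises linearly from 0 at min_th toward max_p at max_th.
    return max_p * (avg_q - min_th) / (max_th - min_th)
```

A small weight makes the average slow to react, so a short burst barely moves it, while sustained overload pushes it past the thresholds.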
RED summary: why random drop
Provide gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why randomization is important
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
[Figure: state diagram of a router's periodic update process]
  on timeout or link fail: prepare own routing update (time TC)
  receive update from neighbor: process (time TC2)
  when ready: send update (time Td to arrive at dest), then start_timer (uniform Tp +- Tr)
  wait; while waiting, receive update from neighbor: process
time spent in state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update sent, relative to start of round
By t=100000, all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization
Choose random timer component Tr large (e.g., several multiples of TC)
Add enough randomization to avoid synchronization
[Figure: router update state diagram, repeated, with start_timer (uniform Tp +- Tr)]
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport: TCP Nice
What are background transfers
Data that humans are not waiting for
Non-deadline-critical
Unlimited demand
Examples
  Prefetched traffic on the Web
  File system backup
  Large-scale data distribution services
  Background software updates
  Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
  Self-interference: applications hurt their own performance
  Cross-interference: applications hurt other applications' performance
TCP Nice
Goal: abstraction of free infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP
TCP does network resource management
  Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
  Uses increasing RTT as congestion signal
  Congestion -> increasing queue lengths -> increasing RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm
  1. Early detection: threshold queue length, increase in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)
per-ack operation:
  if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
per-round operation:
  if (numCong > f * W) W = W / 2
  else ... AIMD congestion control
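The per-ack and per-round operations above can be sketched as a small class. The default values for `threshold` and `fraction` (f), and the plain additive increase used for the elided AIMD branch, are illustrative assumptions.

```python
class NiceSender:
    def __init__(self, threshold=0.2, fraction=0.5, w=10.0):
        self.threshold = threshold
        self.fraction = fraction
        self.w = w
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        # Track the RTT extremes, then flag ACKs whose RTT suggests a building queue.
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        if self.num_cong > self.fraction * self.w:
            self.w /= 2.0      # early multiplicative decrease; may fall below 1
        else:
            self.w += 1.0      # stand-in for the usual AIMD branch
        self.num_cong = 0
```

Because the halving is not floored at one packet, repeated early-congestion rounds drive the window below 1, which is what step 3 of the basic algorithm allows.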
Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = τ, maxRTT = τ + B/μ
[Figure: queue size (pkts) versus time, with buffer size B and service rate μ, comparing the additive (Add) and multiplicative (Mul) phases of Reno and Nice]
Network Conditions
[Figure: foreground document latency (sec) versus spare capacity, log-log axes; curves: Reno, Vegas, V0, Nice, Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec) versus number of BG flows, log-log axes; curves: Vegas, V0, Nice, Router Prio, Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: BG throughput (KB) versus number of BG flows; curves: Router Prio, Vegas, V0, Reno, Nice]
Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
  Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function?
  Maybe not possible to obtain exact optimality, but...
  optimization framework as a means to explicitly steer network towards desirable operating point
  practical congestion control as distributed asynchronous implementations of optimization algorithm
  systematic approach towards protocol design
Model
Network: links l, each of capacity c_l
Sources s: (L(s), U_s(x_s))
  L(s): links used by source s
  U_s(x_s): utility if source rate = x_s
[Figure: two links with capacities c1 and c2; source rates x1, x2, x3]
\[ x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2 \]
[Figure: example utility function U_s(x_s) versus x_s for an elastic application]
Q: What are possible allocations with, say, unit capacity links?
Optimization Problem
\[ \max_{x_s \ge 0} \sum_s U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L \]
"system" problem
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
The user view
User can choose amount to pay per unit time, w_s
Would like allocated bandwidth x_s in proportion to w_s:
\[ x_s = \frac{w_s}{p_s} \]
p_s could be viewed as charge per unit flow for user s
\[ \max \ U_s\!\left( \frac{w_s}{p_s} \right) - w_s \quad \text{subject to} \quad w_s \ge 0 \]
  first term: user's utility; second term: cost
user problem
The network view
Suppose network knows vector {w_s}, chosen by users
Network wants to maximize logarithmic utility function
\[ \max_{x \ge 0} \sum_s w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \]
network problem
Solution existence
There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that Ws solves user
problem Xs solves the
network problem Xs is the unique
solution to the system problem
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
0 wsubject to
w Umax
s
ss
ge
minus⎟⎟⎠
⎞⎜⎜⎝
⎛s
s
wp
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0
Proportional Fairness
Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s^*},
\[ \sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0 \]
Result: if w_s = 1, then {x_s} solves the network problem IFF it is proportionally fair
Similar result exists for the case that w_s is not equal to 1
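As an illustration, take the two-link model from earlier with unit capacities, so x1 + x2 <= 1 and x1 + x3 <= 1. Maximizing log x1 + log x2 + log x3 gives x = (1/3, 2/3, 2/3). The sketch below checks the proportional-fairness inequality against randomly sampled feasible vectors; the sampling scheme and function names are illustrative assumptions.

```python
import random


def pf_gap(x, y):
    # Sum of relative rate changes when moving from allocation x to allocation y.
    return sum((ys - xs) / xs for xs, ys in zip(x, y))


def sample_feasible(rng):
    # Random rates satisfying y1 + y2 <= 1 and y1 + y3 <= 1.
    y1 = rng.uniform(0.0, 1.0)
    return (y1, rng.uniform(0.0, 1.0 - y1), rng.uniform(0.0, 1.0 - y1))


x_pf = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)
rng = random.Random(0)
worst_gap = max(pf_gap(x_pf, sample_feasible(rng)) for _ in range(1000))
```

No sampled alternative yields a positive aggregate relative change, which is exactly the defining inequality.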
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
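One standard way to compute a max-min fair allocation is progressive filling: raise all unfrozen rates together and freeze the flows on any link that saturates. The sketch below is not taken from the slides; it assumes every route uses at least one link, and the names are illustrative.

```python
def max_min_fair(routes, capacity, eps=1e-12):
    rate = [0.0] * len(routes)
    residual = list(capacity)
    active = set(range(len(routes)))
    while active:
        # Equal share each link could still give to its unfrozen flows.
        users = {
            l: [s for s in active if l in routes[s]] for l in range(len(residual))
        }
        share = {l: residual[l] / len(u) for l, u in users.items() if u}
        inc = min(share.values())
        for s in active:
            rate[s] += inc
        for l in share:
            residual[l] -= inc * len(users[l])
        # Freeze every flow that crosses a link with no capacity left.
        saturated = {l for l in share if residual[l] <= eps}
        active = {s for s in active if not saturated.intersection(routes[s])}
    return rate


# Source 0 crosses both unit-capacity links; sources 1 and 2 use one link each.
linear = max_min_fair([[0, 1], [0], [1]], [1.0, 1.0])
```

On the two-link example all three sources receive 1/2, whereas the proportionally fair allocation gives the two-link source a smaller share.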
Minimum potential delay fairness
Rates {x_r} are minimum potential delay fair if U_r(x_r) = -w_r / x_r
Interpretation: if w_r is file size, then w_r / x_r is transfer time; the optimization problem is to minimize the sum of transfer delays
Alternate fluid model
Rate of change of sending rate = term inversely proportional to current rate with probability (1-p) - term proportional to current rate with probability p
In steady state
TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]
With many flows loss rate and delay are not affected much by a single TCP flow TCP behavior completely specified by loss
and delay pattern along path (bounded by bottleneck capacity)
Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account
What is PFTK modeling
Independent loss probability p across rounds Loss acute triple duplicate acks Bursty loss in a round if some packet lost
all following packets in that round also lost Timeout if lt three duplicate acks received
PFTK empirical validation Low loss
PFTK empirical validation High loss
Loss-based TCP
Evolution of loss-based TCP Tahoe (without fast retransmit) Reno (triple duplicate acks + fast
retransmit) NewReno (Reno + handling multiple losses
better) SACK (selective acknowledgment) common
today Q what if loss not due to congestion
Delay-based TCP Vegas
Uses delay as a signal of congestion Idea try to keep a small constant number of
packets at bottleneck queue Expected = WBaseRTT Actual = WCurRTT Diff = Expected - Actual Try to keep Diff between fixed 1 and 3
More recent FAST TCP based on Vegas Delay-based TCP not widely used today
TCP-Friendliness
Can we try MyFavNew TCP Well is it TCP-friendly
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP
To co-exist with TCP it must impose the same long-term load on the network No greater long-term throughput as a function of
packet loss and delay so TCP doesnt suffer Not significantly less long-term throughput or its
not too useful
TCP friendly rate control (TFRC)
Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm
If transmission rate is higher than that given by the model reduce the transmission rate to the models rate
Otherwise increase the transmission rate Eg DCCP (Datagram Congestion Control
Protocol) for unreliable congestion control Q how to measureuse loss rate and RTT
High speed TCP
TCP in high speed networks
Example 1500 byte segments 100ms RTT want 10 Gbps throughput
Requires window size W = 83333 in-flight segments Throughput in terms of loss rate
13 p = 210-10 or equivalently at most one drop every couple hours
New versions of TCP for high-speed networks needed
TCPrsquos long recovery delay
More than an hour to recover from a loss or timeout
~41000 packets
~60000 RTTs ~100 minutes
High-speed TCP
Proposals Scalable TCP HSTCP FAST CUBIC General idea is to use superlinear window
increase Particularly useful in high bandwidth-delay
product regimes
Alternate choices of response functions
Scalable TCP - S = 015p
Q Whatever happened to TCP-friendly
High speed TCP [Floyd]
additive increase multiplicative decrease
increments decrements depend on window size
Scalable TCP (STCP) [T Kelly]
multiplicative increase multiplicative decrease
W larr W + a per ACK W larr W ndash b W per window with loss
STCP dynamics
From 1st PFLDnet Workshop Tom Kelly13
Active Queue Management
Router Queue Management
normally packets dropped only when queue overflows ldquodrop-tailrdquo queueing
router Internet
P113P213P313P413P513P613FCFS13
Scheduler13
router
The case against drop-tail queue management
Large queues in routers are ldquoa bad thingrdquo Delay end-to-end latency dominated by length
of queues at switches in network Allowing queues to overflow is ldquoa bad thingrdquo
Fairness connections transmitting at high rates can starve connections transmitting at low rates
Utilization connections can synchronize their response to congestion
P113P213P313P413FCFS
Scheduler P513P613
Idea early random packet drop
When queue length exceeds threshold drop packets with queue length dependent probability probabilistic packet drop flows see same loss
rate problem bursty traffic (burst arrives when
queue is near threshold) can be over penalized
P113P213P313P413P513P613FCFS
Scheduler
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts react to longer term trends
Tie drop prob to weighted avg queue length avoids over-reaction to mild overload conditions
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
Random early detection (RED) packet drop
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
10013
Drop probability
maxp13
Weighted AverageQueue Length
min13 max13
RED summary why random drop
Provide gentle transition from no-drop to all-drop Provide ldquogentlerdquo early warning Avoid synchronized loss bursts among
sources Provide same loss rate to all sessions
With tail-drop low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed hellip
Why randomization important
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source Floyd Jacobson 1994
Router update operation
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail
update
time spent in state depends on msgs
received from others (weak coupling
between routers processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization Choose random
timer component Tr large (eg several multiples of TC)
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough
randomization to avoid
synchronization
Randomization
Takeaway message randomization makes a system simple and
robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP
TCP does network resource management Need flow prioritization
Alternative router prioritization + More responsive simple one bit priority Hard to deploy
Question Can end-to-end congestion control achieve non-
interference and utilization
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal Congestion incr queue lengths incr RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control Receiver and network unchanged TCP friendly
TCP Nice
Basic algorithm 1 Early Detection thresh queue length incr in RTT 2 Multiplicative decrease on early congestion 3 Allow cwnd lt 10 (despite no loss)
per-ack operation if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++
per-round operation if(numCong gt fW) W W2 else hellip AIMD congestion control
Nice the works
Non-interference getting out of the way in time Utilization maintaining a small queue
pkts
minRTT = τ13 maxRTT = τ+Βmicro13
B
tB Add Mul +
micro
Reno
Nice Add Add Add
Mul +
Mul +
Network Conditions
01
1
10
100
1e3
1 10 100 Fore
grou
nd D
ocum
ent L
aten
cy (s
ec)
Spare Capacity
Reno
Vegas
V0
Nice
Router Prio
Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity
Scalability
01
1
10
100
1e3
1 10 100
Doc
umen
t Lat
ency
(sec
)
Num BG flows
Vegas
V0
Nice
Router Prio
Reno
W lt 1 allows Nice to scale to any number of background flows
Utilization
0
2e4
4e4
6e4
8e4
1 10 100
BG
Thr
ough
put (
KB
)
Num BG flows
Router Prio
Vegas
V0
Reno
Nice
Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation
How to model the interaction between TCP and the network Recall PFTK like models assumed network
conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem How to allocate resources (eg bandwidth) to
optimize some objective function Maybe not possible to obtain exact optimality but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
c1 c2
Model Network Links l each of capacity cl Sources s (L(s) Us(xs))
L(s) - links used by source s Us(xs) - utility if source rate = xs
x1
x2 x3
121 cxx le+ 231 cxx le+
Us(xs)
xs
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose network knows vector ws chosen by users Network wants to maximize logarithmic utility function
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
network problem
Solution existence
There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that Ws solves user
problem Xs solves the
network problem Xs is the unique
solution to the system problem
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
0 wsubject to
w Umax
s
ss
ge
minus⎟⎟⎠
⎞⎜⎜⎝
⎛s
s
wp
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0
Proportional Fairness
Vector of rates xs proportionally fair if feasible and for any other feasible vector xs
0
leminus
sumisinSs s
ss
xxx
Result if wr=1 then Xs solves the network problem IFF it is proportionally fair
Similar result exists for the case that wr not equal 1
Max-min Fairness
Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
Minimum potential delay fairness
Rates xr are minimum potential delay fair if Ur (xr) = -wrxr
Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays
Max-min Fairness
rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
What is corresponding utility function
α
α
α minus=
minus
infinrarr 1lim)(
1r
rrxxU
Solving the network problem Results so far existence - solution exists
with given properties How to compute solution
Ideally distributed solution easily embodied in protocol
Should reveal insight into existing protocol
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
congestion ldquosignalrdquo function of aggregate rate at link l fed back to s
change in bandwidth
allocation at s
linear increase
multiplicative decrease
⎟⎟⎠
⎞⎜⎜⎝
⎛= sum
isin
)()()(txgtp
sLlsllwhere
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
Results converges to solution of relaxation of network
problem xs(t)Σpl(t) converges to ws
Interpretation TCP-like algorithm to iteratively solves optimal rate allocation
Source Algorithm
Source needs only its path price
kr() nonnegative nondecreasing function Above algorithm converges to unique
solution for any initial condition qr interpreted as lossmarking probability euro
˙ x r = kr (xr )(Ur (xr ) minus qr)
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can network choose pricing scheme to achieve fair resource allocation
Suppose network charges price qr ($bit) where qr=sum pl
Userrsquos strategy spend wr ($sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
then
interpretation minimize (weighted) delay
pTpTp
x 2)1(2asymp
minus=
TxxU 1)( minus=
Is AIMD special
Consider a window control as follows cwnd += acwnd^n when no loss cwnd -= bcwnd^m when loss where nltm
Expected change in congestion window
Expected change in rate per unit time
MIMD (nm)
Consider the controller
where
Then at equilibrium
Where α = m-n For stability
Motivation
Congestion Control maximize user
utility
Traffic Engineering minimize network
congestion Given routing Rli how to adapt end rate xi
Given traffic xi how to perform routing Rli
Congestion Control Model
max sum i Ui(xi) st sumi Rlixi le cl var x
aggregate utility
Source rate xi
Utility Ui(xi)
capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
min suml f(ul) st ul =sumi Rlixicl var R Link Utilization ul
Cost f(ul)
aggregate cost
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
ul = 1
Model of Internet Reality
xi Rli
Congestion Control max sumi Ui(xi) st sumi Rlixi le cl
Traffic Engineering min suml f(ul)
st ul =sumi Rlixicl
System Properties
Convergence Does it achieve some objective Benchmark
Utility gap between the joint system and benchmark
max sumi Ui(xi) st Rx le c Var x R
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
TCP throughput A better loss rate based ldquosimplerdquo model [PFTK]
With many flows loss rate and delay are not affected much by a single TCP flow TCP behavior completely specified by loss
and delay pattern along path (bounded by bottleneck capacity)
Given loss rate p and delay T what is TCPrsquos throughput B packetssec taking timeouts into account
What is PFTK modeling
Independent loss probability p across rounds Loss acute triple duplicate acks Bursty loss in a round if some packet lost
all following packets in that round also lost Timeout if lt three duplicate acks received
PFTK empirical validation Low loss
PFTK empirical validation High loss
Loss-based TCP
Evolution of loss-based TCP Tahoe (without fast retransmit) Reno (triple duplicate acks + fast
retransmit) NewReno (Reno + handling multiple losses
better) SACK (selective acknowledgment) common
today Q what if loss not due to congestion
Delay-based TCP Vegas
Uses delay as a signal of congestion Idea try to keep a small constant number of
packets at bottleneck queue Expected = WBaseRTT Actual = WCurRTT Diff = Expected - Actual Try to keep Diff between fixed 1 and 3
More recent FAST TCP based on Vegas Delay-based TCP not widely used today
TCP-Friendliness
Can we try MyFavNew TCP? Well, is it TCP-friendly?
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
To coexist with TCP, it must impose the same long-term load on the network
No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
Not significantly less long-term throughput, or it's not too useful
TCP friendly rate control (TFRC)
Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
Otherwise, increase the transmission rate
E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
Q: how to measure/use loss rate and RTT?
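The rate rule above can be sketched in a few lines. This is an illustration only: the simple square-root throughput model, the function names, and the fixed additive `increase` step are assumptions, and the real TFRC specification differs in how it measures loss and smooths the rate.

```python
import math


def model_rate(p, rtt):
    """Assumed TCP throughput model in packets/sec: sqrt(3/2) / (rtt * sqrt(p))."""
    return math.sqrt(1.5 / p) / rtt


def tfrc_adjust(rate, p, rtt, increase=1.0):
    """One adjustment step of a model-driven sender.

    If the current rate exceeds what the model says TCP would get,
    drop to the model's rate; otherwise probe upward by `increase`.
    """
    target = model_rate(p, rtt)
    if rate > target:
        return target
    return rate + increase
```

At p = 0.01 and a 100 ms RTT the model gives roughly 122 packets/sec, so a sender at 200 packets/sec is pulled down to that value, while a sender at 50 packets/sec steps up to 51.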
High speed TCP
TCP in high speed networks
Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83,333 in-flight segments
Throughput in terms of loss rate: $B = \frac{1.22 \cdot MSS}{RTT \sqrt{p}}$
$\Rightarrow p = 2 \cdot 10^{-10}$, or equivalently at most one drop every couple of hours
New versions of TCP for high-speed networks needed
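The numbers in this example can be checked with a few lines of arithmetic. The sketch assumes the usual approximation B = 1.22 MSS / (RTT sqrt(p)) for loss-based TCP. The variable names are illustrative.

```python
MSS_BITS = 1500 * 8      # segment size in bits
RTT = 0.1                # round-trip time in seconds
TARGET_BPS = 10e9        # desired throughput in bits/sec

# Window needed to keep the pipe full: bandwidth-delay product in segments.
window = TARGET_BPS * RTT / MSS_BITS

# Invert B = 1.22 * MSS / (RTT * sqrt(p)), i.e. W = 1.22 / sqrt(p).
loss_rate = (1.22 / window) ** 2

# Average time between drops at the target sending rate.
packets_per_sec = TARGET_BPS / MSS_BITS
seconds_between_drops = 1 / (loss_rate * packets_per_sec)

print(f"W = {window:.0f} segments")
print(f"p = {loss_rate:.2e}")
print(f"one drop every {seconds_between_drops / 3600:.2f} hours")
```

The tolerable loss rate comes out a little above 2e-10, which at roughly 833,000 packets per second is one drop every hour and a half or so.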
TCP's long recovery delay
More than an hour to recover from a loss or timeout
~41000 packets
~60000 RTTs ~100 minutes
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use superlinear window increase
Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
additive increase, multiplicative decrease
increments and decrements depend on window size
Scalable TCP (STCP) [T. Kelly]
multiplicative increase, multiplicative decrease
$W \leftarrow W + a$ per ACK; $W \leftarrow W - bW$ per window with loss
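A short sketch of the STCP rule shows why it recovers quickly at large windows. The constants a = 0.01 and b = 0.125 are the values usually quoted for Scalable TCP and are an assumption here, since the slides do not give them. The function names are illustrative.

```python
import math

A = 0.01    # assumed per-ACK increment
B = 0.125   # assumed multiplicative decrease factor


def on_ack(window, a=A):
    """Per-ACK increase: W <- W + a."""
    return window + a


def on_loss(window, b=B):
    """Per-loss-window decrease: W <- W - b*W."""
    return window - b * window


def stcp_rtts_to_recover(a=A, b=B):
    """RTTs to regain the pre-loss window.

    One RTT delivers about W ACKs, so the window grows by a factor
    (1 + a) per RTT and the recovery time does not depend on W.
    """
    return math.log(1 / (1 - b)) / math.log(1 + a)


def reno_rtts_to_recover(window):
    """Standard TCP halves W and regains one packet per RTT."""
    return window / 2
```

Under these assumed constants STCP needs about 13 RTTs to recover regardless of window size, whereas standard TCP at W = 83,333 needs over 41,000 RTTs.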
STCP dynamics
From 1st PFLDnet Workshop, Tom Kelly
Active Queue Management
Router Queue Management
Normally, packets are dropped only when the queue overflows: "drop-tail" queueing
[Figure: router queue between two routers and the Internet, holding packets P1-P6 and served by an FCFS scheduler]
The case against drop-tail queue management
Large queues in routers are "a bad thing"
Delay: end-to-end latency dominated by length of queues at switches in network
Allowing queues to overflow is "a bad thing"
Fairness: connections transmitting at high rates can starve connections transmitting at low rates
Utilization: connections can synchronize their response to congestion
[Figure: router queue holding packets P1-P4 with P5 and P6 arriving, served by an FCFS scheduler]
Idea early random packet drop
When queue length exceeds threshold, drop packets with queue-length-dependent probability
probabilistic packet drop: flows see same loss rate
problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
[Figure: router queue holding packets P1-P6, served by an FCFS scheduler]
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop: avoids overly penalizing short-term bursts; reacts to longer-term trends
Tie drop probability to weighted average queue length: avoids over-reaction to mild overload conditions
[Figure: average queue length over time against the min and max thresholds. Below the min threshold: no drop. Between the thresholds: probabilistic early drop. Above the max threshold: forced drop.]
[Figure: drop probability vs. weighted average queue length. Zero below min, rising to max_p at max, then 100%.]
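The two RED ingredients described above, an exponentially weighted average of the queue length and a drop probability that ramps between two thresholds, can be sketched as follows. This is a simplified illustration. The function names and the default averaging weight are assumptions, and refinements of the full algorithm (such as spacing drops by the count of packets since the last drop) are omitted.

```python
def update_avg(avg, queue_len, weight=0.002):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1 - weight) * avg + weight * queue_len


def drop_probability(avg, min_th, max_th, max_p):
    """RED drop probability as a function of the average queue length.

    Below min_th: never drop. At or above max_th: always drop.
    In between: ramp linearly from 0 up to max_p.
    """
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)
```

A small weight is what protects short bursts: starting from an empty average, a single sample of 100 queued packets moves the average to only 0.2, which stays well below any reasonable min threshold.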
RED summary why random drop
Provide gentle transition from no-drop to all-drop
Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide same loss rate to all sessions
With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-to-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
[State diagram]
prepare own routing update (time TC)
receive update from neighbor: process (time TC2)
<ready>: send update (time Td to arrive at dest), then start_timer (uniform Tp +- Tr)
wait: receive update from neighbor, process
timeout or link fail: update
time spent in state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update sent, relative to start of round
By t=100000, all router rounds are of length 120
synchronization, or lack thereof, depends on system parameters
Avoiding synchronization
Choose random timer component Tr large (e.g., several multiples of TC)
Add enough randomization to avoid synchronization
[Figure: router update state diagram (prepare update, receive and process neighbor updates, wait, send update, start_timer with uniform Tp +- Tr)]
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for
Non-deadline-critical
Unlimited demand
Examples: prefetched traffic on the Web, file system backup, large-scale data distribution services, background software updates, media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
Self-interference: applications hurt their own performance
Cross-interference: applications hurt other applications' performance
TCP Nice
Goal: abstraction of free, infinite bandwidth
Applications say what they want
OS manages resources and scheduling
Self-tuning transport layer
Reduces risk of interference with foreground traffic
Significant utilization of spare capacity by background traffic
Simplifies application design
Why change TCP
TCP does network resource management
Need flow prioritization
Alternative: router prioritization
+ More responsive, simple one-bit priority
- Hard to deploy
Question: can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal
Congestion => increased queue lengths => increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
Receiver and network unchanged
TCP friendly
TCP Nice
Basic algorithm
1. Early detection: threshold on queue length, detected via increase in RTT
2. Multiplicative decrease on early congestion
3. Allow cwnd < 1.0 (despite no loss)
per-ack operation: if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
per-round operation: if (numCong > f * W) W = W/2; else ... AIMD congestion control
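The per-ack and per-round operations can be sketched as a small state machine. This is a minimal illustration, not the TCP Nice implementation. The class and function names, the default `threshold` and `fraction` values, and the plain +1 increase used in place of the elided AIMD branch are assumptions, and loss handling is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class NiceState:
    cwnd: float        # congestion window W, may fall below 1
    min_rtt: float     # smallest RTT observed (seconds)
    max_rtt: float     # largest RTT observed (seconds)
    num_cong: int = 0  # ACKs in this round that signalled early congestion


def on_ack(state, cur_rtt, threshold=0.2):
    """Per-ack operation: count RTT samples above the early-congestion cutoff."""
    state.min_rtt = min(state.min_rtt, cur_rtt)
    state.max_rtt = max(state.max_rtt, cur_rtt)
    cutoff = state.min_rtt + threshold * (state.max_rtt - state.min_rtt)
    if cur_rtt > cutoff:
        state.num_cong += 1


def on_round_end(state, fraction=0.5):
    """Per-round operation: halve W on early congestion, else grow additively."""
    if state.num_cong > fraction * state.cwnd:
        state.cwnd /= 2          # multiplicative decrease, even without loss
    else:
        state.cwnd += 1          # stand-in for the normal AIMD increase
    state.num_cong = 0
```

Because the decrease is applied to a floating-point window with no lower clamp, repeated congested rounds push `cwnd` below 1, which is the behavior item 3 of the algorithm asks for.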
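Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = $\tau$, maxRTT = $\tau + B/\mu$
[Figure: queue length (pkts) over time for Reno and Nice, with buffer size B, threshold t·B, and service rate $\mu$; additive-increase (Add) and multiplicative-decrease (Mul) phases are marked for each protocol]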
Network Conditions
[Figure: foreground document latency (sec, log scale) vs. spare capacity (log scale) for Reno, Vegas, V0, Nice, and Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec, log scale) vs. number of background flows (log scale) for Vegas, V0, Nice, Router Prio, and Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: background throughput (KB) vs. number of background flows (log scale) for Router Prio, Vegas, V0, Reno, and Nice]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem
How to allocate resources (e.g., bandwidth) to optimize some objective function
Maybe not possible to obtain exact optimality, but the optimization framework serves as a means to explicitly steer the network towards a desirable operating point
practical congestion control as distributed, asynchronous implementations of an optimization algorithm
systematic approach towards protocol design
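Model
Network: links $l$, each of capacity $c_l$
Sources $s$: $(L(s), U_s(x_s))$
$L(s)$: links used by source $s$
$U_s(x_s)$: utility if source rate = $x_s$
[Figure: two links with capacities $c_1$ and $c_2$; source 1 (rate $x_1$) uses both links, source 2 (rate $x_2$) uses link 1, source 3 (rate $x_3$) uses link 2]
$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$
[Figure: $U_s(x_s)$ vs. $x_s$, example utility function for elastic application]
Q: What are possible allocations with, say, unit-capacity links?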
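One way to explore the question is a brute-force search over the two-link example, where source 1 crosses both unit-capacity links (x1 + x2 <= 1 and x1 + x3 <= 1). The sketch below assumes increasing utilities, so both links are saturated and x2 = x3 = 1 - x1. The function name and grid resolution are illustrative.

```python
import math


def best_x1(utility, steps=3000):
    """Grid-search the long flow's rate x1 that maximizes total utility.

    With unit capacities and increasing utilities, the short flows
    take whatever is left: x2 = x3 = 1 - x1.
    """
    candidates = (k / steps for k in range(1, steps))
    return max(candidates, key=lambda x1: utility(x1) + 2 * utility(1 - x1))


# Logarithmic utility: sum of log(x_s).
log_x1 = best_x1(math.log)

# Utility -1/x: minimize the sum of per-unit transfer times.
delay_x1 = best_x1(lambda x: -1 / x)

print(f"log utility:  x1 = {log_x1:.3f}, x2 = x3 = {1 - log_x1:.3f}")
print(f"-1/x utility: x1 = {delay_x1:.3f}, x2 = x3 = {1 - delay_x1:.3f}")
```

Other answers to the same question need no search: giving every source 1/2 equalizes the rates, while x1 = 0 and x2 = x3 = 1 maximizes total throughput at the cost of starving the long flow.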
Optimization Problem
maximize system utility (note: all sources "equal")
constraint: bandwidth used is less than capacity
centralized solution to optimization is impractical: must know all utility functions; impractical for large number of sources
can we view congestion control as distributed, asynchronous algorithms to solve this problem?
$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$ ("system" problem)
The user view
User can choose amount to pay per unit time, $w_s$
Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$
$p_s$ could be viewed as charge per unit flow for user $s$
$\max_{w_s} \ U_s\left(\frac{w_s}{p_s}\right) - w_s$ subject to $w_s \ge 0$ (user problem)
first term: user's utility; second term: cost
The network view
Suppose network knows vector $\{w_s\}$ chosen by users
Network wants to maximize logarithmic utility function
$\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$ (network problem)
Solution existence
There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
$\{w_s\}$ solves the user problem
$\{x_s\}$ solves the network problem
$\{x_s\}$ is the unique solution to the system problem
User problem: $\max_{w_s} \ U_s\left(\frac{w_s}{p_s}\right) - w_s$ subject to $w_s \ge 0$
Network problem: $\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
System problem: $\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$
Proportional Fairness
A vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,
$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$
Result: if $w_r = 1$, then $\{x_s\}$ solves the network problem iff it is proportionally fair
A similar result exists for the case that $w_r$ is not equal to 1
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$
Minimum potential delay fairness
Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; the optimization problem is to minimize the sum of transfer delays
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$
What is the corresponding utility function?
$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
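The limit above belongs to a one-parameter family of utilities that also covers the earlier fairness notions. The sketch below is an illustration of that family. The function name and the optional `weight` argument are assumptions, and the special case at alpha = 1 uses the logarithm, which is the limit of the same expression up to an additive constant.

```python
import math


def alpha_fair_utility(x, alpha, weight=1.0):
    """Utility weight * x**(1 - alpha) / (1 - alpha), with log(x) at alpha = 1.

    alpha = 1   : logarithmic utility (proportional fairness)
    alpha = 2   : -weight / x (minimum potential delay fairness)
    alpha large : approaches the max-min fair allocation
    """
    if x <= 0:
        raise ValueError("rate x must be positive")
    if alpha == 1:
        return weight * math.log(x)
    return weight * x ** (1 - alpha) / (1 - alpha)
```

For large alpha the utility falls off so steeply as x shrinks that the smallest rates dominate the sum, which is the intuition for why the limit recovers max-min fairness.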
Solving the network problem
Results so far: existence, i.e., a solution exists with the given properties
How to compute the solution?
Ideally: distributed solution, easily embodied in a protocol
Should reveal insight into existing protocols
Solving the network problem
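$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$
where $p_l(t) = g_l\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$
left-hand side: change in bandwidth allocation at $s$
$k w_s$ term: linear increase
$k x_s(t) \sum_{l \in L(s)} p_l(t)$ term: multiplicative decrease
$p_l(t)$: congestion "signal", a function of the aggregate rate at link $l$, fed back to $s$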
Solving the network problem
$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$
Results: converges to the solution of a relaxation of the network problem
$x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
Interpretation: TCP-like algorithm that iteratively solves for the optimal rate allocation
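The convergence claim can be illustrated numerically on a single shared link. The sketch below integrates the rate equation with forward Euler steps. The price function g(y) = (y / capacity)^exponent, the step sizes, and all names are assumptions chosen for illustration, since the slides leave g unspecified.

```python
def simulate_primal(weights, capacity, k=0.5, dt=0.1, steps=5000, exponent=4):
    """Euler-integrate dx_s/dt = k * (w_s - x_s * p) for sources sharing one link.

    The link price p = (total_rate / capacity) ** exponent is an assumed
    penalty function standing in for g_l.
    """
    rates = [1.0 for _ in weights]
    for _ in range(steps):
        price = (sum(rates) / capacity) ** exponent
        # Linear increase k*w_s, multiplicative decrease k*x_s*p.
        rates = [x + dt * k * (w - x * price) for x, w in zip(rates, weights)]
    final_price = (sum(rates) / capacity) ** exponent
    return rates, final_price


rates, price = simulate_primal(weights=[1.0, 2.0], capacity=10.0)
print("rates:", [round(x, 3) for x in rates])
print("x_s * p:", [round(x * price, 3) for x in rates])
```

At the fixed point each source satisfies x_s * p = w_s, so sources sharing the same link end up with rates in proportion to what they are willing to pay.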
Source Algorithm
Source needs only its path price
$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$
$k_r(\cdot)$: nonnegative, nondecreasing function
Above algorithm converges to the unique solution for any initial condition
$q_r$ interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price $q_r$ (\$/bit), where $q_r = \sum_l p_l$
User's strategy: spend $w_r$ (\$/sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose $x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$
then $U(x) = -\frac{1}{Tx}$
interpretation: minimize (weighted) delay
Is AIMD special
Consider a window control as follows:
cwnd += a * cwnd^n when no loss
cwnd -= b * cwnd^m when loss
where n < m
Expected change in congestion window
Expected change in rate per unit time
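The expressions for the expected change did not survive in this transcript, so the following is a sketch under stated assumptions rather than the slide's own derivation. It assumes the update is applied per ACK and that each packet is lost independently with probability p, so the expected change is a W^n (1 - p) - b W^m p, which is zero at W^(m-n) = a(1 - p) / (b p). The function name is illustrative.

```python
def equilibrium_window(a, b, n, m, p):
    """Window at which expected increase and decrease balance.

    Solves a * W**n * (1 - p) == b * W**m * p for W, assuming n < m
    and an independent per-packet loss probability p.
    """
    if not n < m:
        raise ValueError("requires n < m")
    if not 0 < p < 1:
        raise ValueError("loss probability p must be in (0, 1)")
    return (a * (1 - p) / (b * p)) ** (1 / (m - n))


# AIMD as a special case: +1/cwnd per ACK (n = -1), halve on loss (m = 1).
aimd_window = equilibrium_window(a=1, b=0.5, n=-1, m=1, p=0.02)
print(f"AIMD equilibrium window at p = 0.02: {aimd_window:.2f} packets")
```

With the AIMD parameters this reduces to W = sqrt(2(1 - p) / p), which matches the simplified TCP-Reno rate once divided by the RTT.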
MIMD (n, m)
Consider the controller
where
Then at equilibrium
Where α = m-n For stability
Motivation
Congestion control: maximize user utility
Given routing $R_{li}$, how to adapt end rate $x_i$?
Traffic engineering: minimize network congestion
Given traffic $x_i$, how to perform routing $R_{li}$?
PFTK empirical validation Low loss
PFTK empirical validation High loss
Loss-based TCP
Evolution of loss-based TCP Tahoe (without fast retransmit) Reno (triple duplicate acks + fast
retransmit) NewReno (Reno + handling multiple losses
better) SACK (selective acknowledgment) common
today Q what if loss not due to congestion
Delay-based TCP Vegas
Uses delay as a signal of congestion Idea try to keep a small constant number of
packets at bottleneck queue Expected = WBaseRTT Actual = WCurRTT Diff = Expected - Actual Try to keep Diff between fixed 1 and 3
More recent FAST TCP based on Vegas Delay-based TCP not widely used today
TCP-Friendliness
Can we try MyFavNew TCP Well is it TCP-friendly
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP
To co-exist with TCP it must impose the same long-term load on the network No greater long-term throughput as a function of
packet loss and delay so TCP doesnt suffer Not significantly less long-term throughput or its
not too useful
TCP friendly rate control (TFRC)
Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm
If transmission rate is higher than that given by the model reduce the transmission rate to the models rate
Otherwise increase the transmission rate Eg DCCP (Datagram Congestion Control
Protocol) for unreliable congestion control Q how to measureuse loss rate and RTT
High speed TCP
TCP in high speed networks
Example 1500 byte segments 100ms RTT want 10 Gbps throughput
Requires window size W = 83333 in-flight segments Throughput in terms of loss rate
13 p = 210-10 or equivalently at most one drop every couple hours
New versions of TCP for high-speed networks needed
TCPrsquos long recovery delay
More than an hour to recover from a loss or timeout
~41000 packets
~60000 RTTs ~100 minutes
High-speed TCP
Proposals Scalable TCP HSTCP FAST CUBIC General idea is to use superlinear window
increase Particularly useful in high bandwidth-delay
product regimes
Alternate choices of response functions
Scalable TCP - S = 015p
Q Whatever happened to TCP-friendly
High speed TCP [Floyd]
additive increase multiplicative decrease
increments decrements depend on window size
Scalable TCP (STCP) [T Kelly]
multiplicative increase multiplicative decrease
W larr W + a per ACK W larr W ndash b W per window with loss
STCP dynamics
From 1st PFLDnet Workshop Tom Kelly13
Active Queue Management
Router Queue Management
normally packets dropped only when queue overflows ldquodrop-tailrdquo queueing
router Internet
P113P213P313P413P513P613FCFS13
Scheduler13
router
The case against drop-tail queue management
Large queues in routers are ldquoa bad thingrdquo Delay end-to-end latency dominated by length
of queues at switches in network Allowing queues to overflow is ldquoa bad thingrdquo
Fairness connections transmitting at high rates can starve connections transmitting at low rates
Utilization connections can synchronize their response to congestion
P113P213P313P413FCFS
Scheduler P513P613
Idea early random packet drop
When queue length exceeds threshold drop packets with queue length dependent probability probabilistic packet drop flows see same loss
rate problem bursty traffic (burst arrives when
queue is near threshold) can be over penalized
P113P213P313P413P513P613FCFS
Scheduler
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts react to longer term trends
Tie drop prob to weighted avg queue length avoids over-reaction to mild overload conditions
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
Random early detection (RED) packet drop
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
10013
Drop probability
maxp13
Weighted AverageQueue Length
min13 max13
RED summary: why random drop?
Provide a gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide the same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source: Floyd, Jacobson 1994
Router update operation
[State diagram: router update cycle]
prepare own routing update (time TC)
receive update from neighbor: process (time TC2)
wait
receive update from neighbor: process
<ready> send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail
update
time spent in state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update sent, relative to start of round
By t=100000 all router rounds are of length 120
synchronization, or lack thereof, depends on system parameters
Avoiding synchronization
Choose random timer component Tr large (e.g., several multiples of TC)
Add enough randomization to avoid synchronization
[State diagram, as in "Router update operation": prepare own routing update (time TC); receive update from neighbor, process (time TC2); wait; <ready> send update (time Td to arrive); start_timer (uniform Tp +- Tr)]
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport: TCP Nice
What are background transfers?
Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
Examples
  Prefetched traffic on the Web
  File system backup
  Large-scale data distribution services
  Background software updates
  Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
  Self-interference: applications hurt their own performance
  Cross-interference: applications hurt other applications' performance
TCP Nice
Goal: abstraction of free infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP?
TCP does network resource management
  Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: Can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal
  Congestion → increased queue lengths → increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm
  1. Early detection: threshold on queue length → increase in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)
per-ack operation:
  if (curRTT > minRTT + threshold * (maxRTT − minRTT)) numCong++
per-round operation:
  if (numCong > f * W) W ← W/2
  else ... AIMD congestion control
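The per-ack and per-round operations can be sketched as a small Python class. This is a minimal illustration under stated assumptions: the class name, the defaults threshold = 0.2 and f = 0.5, and the plain "add one segment" branch standing in for the elided AIMD logic are all hypothetical choices, not taken from the slides.

```python
class NiceWindow:
    """Sketch of the early-congestion logic: count delayed ACKs, halve per round."""

    def __init__(self, cwnd=2.0, threshold=0.2, fraction=0.5):
        self.cwnd = cwnd
        self.threshold = threshold    # fraction of the RTT range treated as "queue building"
        self.fraction = fraction      # f: share of the window that must see delay
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack operation: track min/max RTT and count ACKs above the delay threshold."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round operation: multiplicative decrease on early congestion."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0          # may fall below 1.0, as the algorithm allows
        else:
            self.cwnd += 1.0          # stand-in for the regular AIMD branch
        self.num_cong = 0


sender = NiceWindow(cwnd=4.0)
for rtt in (0.10, 0.20, 0.19, 0.18, 0.20):
    sender.on_ack(rtt)
sender.on_round_end()                 # four delayed ACKs > 0.5 * 4, so the window halves
```

Letting the window drop below one segment is what allows many background flows to share a bottleneck without crowding out foreground traffic.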
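Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
[Figure: bottleneck queue (pkts) with buffer size B, threshold t·B, and service rate μ, comparing the additive (Add) and multiplicative (Mul) phases of Reno and Nice; minRTT = τ, maxRTT = τ + B/μ]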
Network Conditions
[Figure: foreground document latency (sec, log scale 0.1 to 1e3) versus spare capacity (1 to 100); curves for Reno, Vegas, V0, Nice, and Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec, log scale 0.1 to 1e3) versus number of background flows (1 to 100); curves for Vegas, V0, Nice, Router Prio, and Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: background throughput (KB, 0 to 8e4) versus number of background flows (1 to 100); curves for Router Prio, Vegas, V0, Reno, and Nice]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
  Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function
  Maybe not possible to obtain exact optimality, but...
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
    systematic approach towards protocol design
Model
Network: links l, each of capacity c_l
Sources s: (L(s), U_s(x_s))
  L(s) - links used by source s
  U_s(x_s) - utility if source rate = x_s
[Figure: two links with capacities c1 and c2 carrying source rates x1, x2, x3]
$$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$$
[Figure: U_s(x_s) versus x_s, an example utility function for an elastic application]
Q: What are possible allocations with, say, unit-capacity links?
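As one way to explore the question, the sketch below lists three candidate allocations for unit-capacity links under the constraints x1 + x2 ≤ 1 and x1 + x3 ≤ 1. It assumes both links are saturated (x2 = x3 = 1 − x1) and finds the allocation maximizing the sum of log rates by a simple ternary search. The helper names are illustrative.

```python
import math


def total_log_utility(x1):
    """Sum of log rates when both unit-capacity links are full: x2 = x3 = 1 - x1."""
    return math.log(x1) + 2.0 * math.log(1.0 - x1)


def argmax_concave(f, lo, hi, iters=200):
    """Ternary search for the maximizer of a strictly concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


x1 = argmax_concave(total_log_utility, 1e-9, 1.0 - 1e-9)

allocations = {
    "maximum total throughput": (0.0, 1.0, 1.0),   # starves the two-link source
    "equal rates": (0.5, 0.5, 0.5),
    "maximum sum of log rates": (x1, 1.0 - x1, 1.0 - x1),
}
```

The log-utility allocation comes out at roughly (1/3, 2/3, 2/3): the source that crosses both links gets less than the single-link sources but is not starved.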
Optimization Problem
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
"system" problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \quad \forall l \in L$$
The user view
User can choose amount to pay per unit time, w_s
Would like allocated bandwidth x_s in proportion to w_s:
$$x_s = \frac{w_s}{p_s}$$
p_s could be viewed as charge per unit flow for user s
user problem (user's utility minus cost):
$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
The network view
Suppose network knows vector {w_s}, chosen by users
Network wants to maximize logarithmic utility function
network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
Solution existence
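There exist prices p_s, source rates x_s, and amounts to pay per unit time w_s = p_s x_s such that
  {w_s} solves the user problem
  {x_s} solves the network problem
  {x_s} is the unique solution to the system problem
user problem:
$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
system problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \quad \forall l \in L$$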
Proportional Fairness
Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s^*},
$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$
Result: if w_r = 1, then {x_s} solves the network problem iff it is proportionally fair
A similar result exists for the case that w_r is not equal to 1
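The definition can be checked numerically on a small case. The sketch below assumes a network with two unit-capacity links and three sources (constraints y1 + y2 ≤ 1 and y1 + y3 ≤ 1) and tests the candidate allocation (1/3, 2/3, 2/3), which maximizes the sum of log rates there, against random feasible alternatives. The helper names and the sampling scheme are illustrative.

```python
import random


def proportional_fairness_sum(x, y):
    """Aggregate of proportional changes when moving from allocation x to y."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))


def random_feasible(rng):
    """Random rates satisfying y1 + y2 <= 1 and y1 + y3 <= 1."""
    y1 = rng.uniform(0.0, 1.0)
    y2 = rng.uniform(0.0, 1.0 - y1)
    y3 = rng.uniform(0.0, 1.0 - y1)
    return (y1, y2, y3)


x_pf = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)
rng = random.Random(0)
worst = max(
    proportional_fairness_sum(x_pf, random_feasible(rng)) for _ in range(10_000)
)

# The equal-rate allocation fails the test: moving to x_pf has a positive sum.
gain_from_equal = proportional_fairness_sum((0.5, 0.5, 0.5), x_pf)
```

No sampled alternative produces a positive sum against (1/3, 2/3, 2/3), while the equal-rate allocation is beaten by it, so equal rates are not proportionally fair on this network.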
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
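A max-min fair allocation can be computed by progressive filling, a standard technique that is not described in the slides: raise all rates equally and freeze a source once one of its links saturates. The sketch below assumes every source uses at least one link; the function and variable names are illustrative.

```python
def max_min_fair(capacity, routes, eps=1e-12):
    """Progressive filling.

    capacity: {link: capacity}; routes: {source: set of links it uses}.
    Assumes every source uses at least one link.
    """
    rate = {s: 0.0 for s in routes}
    remaining = dict(capacity)
    active = set(routes)
    while active:
        # number of still-growing sources on each link
        users = {l: sum(1 for s in active if l in routes[s]) for l in remaining}
        # largest equal increment before some link fills up
        inc = min(remaining[l] / n for l, n in users.items() if n > 0)
        for s in active:
            rate[s] += inc
        for l, n in users.items():
            remaining[l] -= inc * n
        saturated = {l for l, n in users.items() if n > 0 and remaining[l] <= eps}
        # freeze every source that crosses a saturated link
        active = {s for s in active if not (routes[s] & saturated)}
    return rate


routes = {"s1": {"a", "b"}, "s2": {"a"}, "s3": {"b"}}
equal_links = max_min_fair({"a": 1.0, "b": 1.0}, routes)
unequal_links = max_min_fair({"a": 1.0, "b": 2.0}, routes)
```

With equal link capacities every source gets 0.5; when link b has capacity 2, source s3 keeps growing to 1.5 after link a has frozen s1 and s2.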
Minimum potential delay fairness
Rates {x_r} are minimum potential delay fair if U_r(x_r) = -w_r/x_r
Interpretation: if w_r is the file size, then w_r/x_r is the transfer time; the optimization problem is to minimize the sum of transfer delays
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s, then there exists p such that x_p ≤ x_s and y_p < x_p
What is the corresponding utility function?
$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
Solving the network problem
Results so far: existence - solution exists with given properties
How to compute solution?
  Ideally: distributed solution easily embodied in protocol
  Should reveal insight into existing protocol
Solving the network problem
$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$
  left-hand side: change in bandwidth allocation at s
  w_s term: linear increase
  x_s(t) Σ p_l(t) term: multiplicative decrease
where
$$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$
is the congestion "signal": a function of the aggregate rate at link l, fed back to s
Solving the network problem
$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$
Results
  converges to solution of a relaxation of the network problem
  x_s(t) Σ p_l(t) converges to w_s
Interpretation: TCP-like algorithm to iteratively solve the optimal rate allocation
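The rate dynamics can be tried out with a forward-Euler simulation. This is a sketch under stated assumptions: the price function g_l(y) = (y / c_l)^β with β = 20, the gain k = 1, the step size, and the two-link, three-source topology with unit capacities are all illustrative choices, not taken from the slides.

```python
def link_prices(x, routes, capacity, beta):
    """Congestion signal per link: an assumed g_l(y) = (y / c_l) ** beta of the aggregate rate."""
    prices = {}
    for link, c in capacity.items():
        load = sum(x[s] for s, links in routes.items() if link in links)
        prices[link] = (load / c) ** beta
    return prices


def simulate_primal(routes, capacity, w, k=1.0, beta=20, dt=1e-3, steps=20_000):
    """Euler steps of dx_s/dt = k * (w_s - x_s * sum of prices on the path of s)."""
    x = {s: 0.1 for s in routes}
    for _ in range(steps):
        p = link_prices(x, routes, capacity, beta)
        x = {
            s: x[s] + dt * k * (w[s] - x[s] * sum(p[l] for l in routes[s]))
            for s in routes
        }
    return x


routes = {1: {"l1", "l2"}, 2: {"l1"}, 3: {"l2"}}
capacity = {"l1": 1.0, "l2": 1.0}
w = {1: 1.0, 2: 1.0, 3: 1.0}

x = simulate_primal(routes, capacity, w)
p = link_prices(x, routes, capacity, beta=20)
path_cost = {s: x[s] * sum(p[l] for l in routes[s]) for s in routes}
```

At the end of the run each `path_cost[s]` is close to w_s, and the rates sit near (1/3, 2/3, 2/3); the small excess over the link capacities reflects the fact that the price function only relaxes the capacity constraint.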
Source Algorithm
$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$
Source needs only its path price q_r
k_r(·): nonnegative, nondecreasing function
Above algorithm converges to the unique solution for any initial condition
q_r interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price q_r ($/bit), where q_r = Σ p_l
User's strategy: spend w_r ($/sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$
then
$$U(x) = -\frac{1}{Tx}$$
interpretation: minimize (weighted) delay
Is AIMD special?
Consider a window control as follows
  cwnd += a·cwnd^n when no loss
  cwnd -= b·cwnd^m when loss, where n < m
Expected change in congestion window
Expected change in rate per unit time
MIMD (n, m)
Consider the controller
where
Then at equilibrium
Where α = m − n. For stability
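The slide's equations did not survive extraction, so the following is only a sketch of how an equilibrium can be derived for the general (n, m) window rule. It assumes the update is applied per packet and that each packet is lost independently with probability p; the function names are illustrative. Setting the expected change (1 − p)·a·W^n − p·b·W^m to zero gives W^(m−n) = a(1 − p)/(b p).

```python
import math


def expected_change(w, a, n, b, m, p):
    """Expected per-packet window change: increase if delivered, decrease if lost."""
    return (1.0 - p) * a * w ** n - p * b * w ** m


def equilibrium_window(a, n, b, m, p):
    """Window at which the expected change is zero (requires n < m)."""
    if not n < m:
        raise ValueError("need n < m for a finite equilibrium")
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))


# AIMD written per packet: +1/cwnd when no loss (n = -1), -cwnd/2 on loss (m = 1).
w_aimd = equilibrium_window(a=1.0, n=-1, b=0.5, m=1, p=0.01)
```

For the AIMD parameters this reduces to W = sqrt(2(1 − p)/p), the same square-root dependence on the loss rate that appears in the simplified Reno model.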
Motivation
Congestion Control: maximize user utility
  Given routing R_li, how to adapt end rate x_i?
Traffic Engineering: minimize network congestion
  Given traffic x_i, how to perform routing R_li?
Congestion Control Model
$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$
  objective: aggregate utility, with source rate x_i and utility U_i(x_i)
  constraints: capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
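$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l$$
  objective: aggregate cost, with link utilization u_l and cost f(u_l)
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
[Figure: cost f(u_l) versus link utilization u_l, with u_l = 1 marked]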
Model of Internet Reality
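[Figure: congestion control and traffic engineering coupled through the rates x_i and the routing R_li]
Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$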
System Properties
Convergence
Does it achieve some objective?
Benchmark: $\max_{x, R} \sum_i U_i(x_i)$ s.t. $Rx \le c$
Utility gap between the joint system and benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
PFTK empirical validation High loss
Loss-based TCP
Evolution of loss-based TCP
  Tahoe (without fast retransmit)
  Reno (triple duplicate acks + fast retransmit)
  NewReno (Reno + handling multiple losses better)
  SACK (selective acknowledgment): common today
Q: what if loss not due to congestion?
Delay-based TCP: Vegas
Uses delay as a signal of congestion
  Idea: try to keep a small constant number of packets at bottleneck queue
  Expected = W/BaseRTT
  Actual = W/CurRTT
  Diff = Expected - Actual
  Try to keep Diff between fixed 1 and 3
More recent FAST TCP based on Vegas
Delay-based TCP not widely used today
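The Vegas rule can be sketched as a per-RTT update. One assumption is labeled here: Diff is multiplied by BaseRTT so that it is expressed in packets queued at the bottleneck before being compared with the bounds 1 and 3. The function name and the one-segment step are illustrative.

```python
def vegas_update(w, base_rtt, cur_rtt, low=1.0, high=3.0):
    """Return the next window, given the current window and RTT measurements."""
    expected = w / base_rtt                 # rate with empty queues
    actual = w / cur_rtt                    # rate currently achieved
    diff = (expected - actual) * base_rtt   # estimated packets sitting in the queue (assumed scaling)
    if diff < low:
        return w + 1                        # too little queued: probe for more bandwidth
    if diff > high:
        return w - 1                        # too much queued: back off
    return w                                # within the target band: hold


grow = vegas_update(20, base_rtt=0.100, cur_rtt=0.100)
hold = vegas_update(20, base_rtt=0.100, cur_rtt=0.110)
shrink = vegas_update(20, base_rtt=0.100, cur_rtt=0.125)
```

With a 20-segment window, an RTT inflated from 100 ms to 125 ms corresponds to about 4 queued packets, which is above the upper bound and so triggers a decrease.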
TCP-Friendliness
Can we try MyFavNew TCP? Well, is it TCP-friendly?
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
To co-exist with TCP, it must impose the same long-term load on the network
  No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
  Not significantly less long-term throughput, or it's not too useful
TCP friendly rate control (TFRC)
Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
  If transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
  Otherwise increase the transmission rate
E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
Q: how to measure/use loss rate and RTT?
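The control rule can be sketched with a simple throughput model. This is an illustration only: it assumes the square-root model rate = 1.22·MSS/(RTT·sqrt(p)) as a stand-in for whatever TCP model is used, and the fixed `increase_step` and the cap at the model rate are hypothetical choices.

```python
import math


def tcp_model_rate(mss_bytes, rtt_s, loss_rate):
    """Throughput (bytes/s) predicted by the assumed square-root TCP model."""
    return 1.22 * mss_bytes / (rtt_s * math.sqrt(loss_rate))


def tfrc_adjust(current_rate, mss_bytes, rtt_s, loss_rate, increase_step):
    """One control step: clamp down to the model rate, otherwise creep upward."""
    allowed = tcp_model_rate(mss_bytes, rtt_s, loss_rate)
    if current_rate > allowed:
        return allowed                                  # reduce to the model's rate
    return min(current_rate + increase_step, allowed)   # increase, never past the model


allowed = tcp_model_rate(1500, 0.100, 0.01)
too_fast = tfrc_adjust(250_000.0, 1500, 0.100, 0.01, increase_step=10_000.0)
too_slow = tfrc_adjust(100_000.0, 1500, 0.100, 0.01, increase_step=10_000.0)
```

A sender above the model rate is pulled down to it in one step, while a sender below it ramps up gradually, which gives a smoother rate than a window that halves on every loss.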
High speed TCP
TCP in high speed networks
Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
Requires window size W = 83333 in-flight segments
Throughput in terms of loss rate
p = 2·10^-10, or equivalently at most one drop every couple hours
New versions of TCP for high-speed networks needed
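The numbers in this example can be reproduced with a few lines of arithmetic. The sketch assumes the square-root relation W = 1.22/sqrt(p) between the average window (in segments) and the loss rate; the function names are illustrative.

```python
def required_window(throughput_bps, rtt_s, segment_bytes):
    """Segments that must be in flight to sustain the target throughput."""
    return throughput_bps * rtt_s / (segment_bytes * 8)


def loss_rate_for_window(window_segments, c=1.22):
    """Invert the assumed relation W = c / sqrt(p)."""
    return (c / window_segments) ** 2


rtt = 0.100
w = required_window(10e9, rtt, 1500)      # about 83,333 segments
p = loss_rate_for_window(w)               # about 2e-10
packets_per_second = w / rtt
hours_between_drops = (1.0 / p) / packets_per_second / 3600.0
```

Under this model the sender can afford roughly one loss every 1.5 hours, which is the motivation for the high-speed variants.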
TCP's long recovery delay
More than an hour to recover from a loss or timeout
[Figure: window recovery after a loss; ~41000 packets, ~60000 RTTs, ~100 minutes]
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use superlinear window increase
Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
Scalable TCP - S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
additive increase, multiplicative decrease
increments and decrements depend on window size
Loss-based TCP
Evolution of loss-based TCP Tahoe (without fast retransmit) Reno (triple duplicate acks + fast
retransmit) NewReno (Reno + handling multiple losses
better) SACK (selective acknowledgment) common
today Q what if loss not due to congestion
Delay-based TCP Vegas
Uses delay as a signal of congestion Idea try to keep a small constant number of
packets at bottleneck queue Expected = WBaseRTT Actual = WCurRTT Diff = Expected - Actual Try to keep Diff between fixed 1 and 3
More recent FAST TCP based on Vegas Delay-based TCP not widely used today
TCP-Friendliness
Can we try MyFavNew TCP Well is it TCP-friendly
Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet or be isolated from TCP
To co-exist with TCP it must impose the same long-term load on the network No greater long-term throughput as a function of
packet loss and delay so TCP doesnt suffer Not significantly less long-term throughput or its
not too useful
TCP friendly rate control (TFRC)
Use a model of TCPs throughout as a function of the loss rate and RTT directly in a congestion control algorithm
If transmission rate is higher than that given by the model reduce the transmission rate to the models rate
Otherwise increase the transmission rate Eg DCCP (Datagram Congestion Control
Protocol) for unreliable congestion control Q how to measureuse loss rate and RTT
High speed TCP
TCP in high speed networks
Example 1500 byte segments 100ms RTT want 10 Gbps throughput
Requires window size W = 83333 in-flight segments Throughput in terms of loss rate
13 p = 210-10 or equivalently at most one drop every couple hours
New versions of TCP for high-speed networks needed
TCPrsquos long recovery delay
More than an hour to recover from a loss or timeout
~41000 packets
~60000 RTTs ~100 minutes
High-speed TCP
Proposals Scalable TCP HSTCP FAST CUBIC General idea is to use superlinear window
increase Particularly useful in high bandwidth-delay
product regimes
Alternate choices of response functions
Scalable TCP - S = 015p
Q Whatever happened to TCP-friendly
High speed TCP [Floyd]
additive increase multiplicative decrease
increments decrements depend on window size
Scalable TCP (STCP) [T Kelly]
multiplicative increase multiplicative decrease
W larr W + a per ACK W larr W ndash b W per window with loss
STCP dynamics
From 1st PFLDnet Workshop Tom Kelly13
Active Queue Management
Router Queue Management
normally packets dropped only when queue overflows ldquodrop-tailrdquo queueing
router Internet
P113P213P313P413P513P613FCFS13
Scheduler13
router
The case against drop-tail queue management
Large queues in routers are ldquoa bad thingrdquo Delay end-to-end latency dominated by length
of queues at switches in network Allowing queues to overflow is ldquoa bad thingrdquo
Fairness connections transmitting at high rates can starve connections transmitting at low rates
Utilization connections can synchronize their response to congestion
P113P213P313P413FCFS
Scheduler P513P613
Idea early random packet drop
When queue length exceeds threshold drop packets with queue length dependent probability probabilistic packet drop flows see same loss
rate problem bursty traffic (burst arrives when
queue is near threshold) can be over penalized
P113P213P313P413P513P613FCFS
Scheduler
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts react to longer term trends
Tie drop prob to weighted avg queue length avoids over-reaction to mild overload conditions
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
Random early detection (RED) packet drop
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
10013
Drop probability
maxp13
Weighted AverageQueue Length
min13 max13
RED summary why random drop
Provide gentle transition from no-drop to all-drop Provide ldquogentlerdquo early warning Avoid synchronized loss bursts among
sources Provide same loss rate to all sessions
With tail-drop low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed hellip
Why randomization important
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source Floyd Jacobson 1994
Router update operation
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail
update
time spent in state depends on msgs
received from others (weak coupling
between routers processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization Choose random
timer component Tr large (eg several multiples of TC)
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough
randomization to avoid
synchronization
Randomization
Takeaway message randomization makes a system simple and
robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP
TCP does network resource management Need flow prioritization
Alternative router prioritization + More responsive simple one bit priority Hard to deploy
Question Can end-to-end congestion control achieve non-
interference and utilization
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal Congestion incr queue lengths incr RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control Receiver and network unchanged TCP friendly
TCP Nice
Basic algorithm 1 Early Detection thresh queue length incr in RTT 2 Multiplicative decrease on early congestion 3 Allow cwnd lt 10 (despite no loss)
per-ack operation if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++
per-round operation if(numCong gt fW) W W2 else hellip AIMD congestion control
Nice the works
Non-interference getting out of the way in time Utilization maintaining a small queue
pkts
minRTT = τ13 maxRTT = τ+Βmicro13
B
tB Add Mul +
micro
Reno
Nice Add Add Add
Mul +
Mul +
Network Conditions
01
1
10
100
1e3
1 10 100 Fore
grou
nd D
ocum
ent L
aten
cy (s
ec)
Spare Capacity
Reno
Vegas
V0
Nice
Router Prio
Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity
Scalability
01
1
10
100
1e3
1 10 100
Doc
umen
t Lat
ency
(sec
)
Num BG flows
Vegas
V0
Nice
Router Prio
Reno
W lt 1 allows Nice to scale to any number of background flows
Utilization
0
2e4
4e4
6e4
8e4
1 10 100
BG
Thr
ough
put (
KB
)
Num BG flows
Router Prio
Vegas
V0
Reno
Nice
Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
• Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
• How to model the interaction between TCP and the network?
  • Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem
• How to allocate resources (e.g., bandwidth) to optimize some objective function
• Maybe not possible to obtain exact optimality, but...
  • optimization framework as means to explicitly steer network towards desirable operating point
  • practical congestion control as distributed asynchronous implementations of optimization algorithm
  • systematic approach towards protocol design
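Model
• Network: links l, each of capacity $c_l$
• Sources s: $(L(s), U_s(x_s))$
  • L(s) - links used by source s
  • $U_s(x_s)$ - utility if source rate = $x_s$
[Figure: two links with capacities c1 and c2; source 1 crosses both links, source 2 only the first, source 3 only the second]
$x_1 + x_2 \le c_1, \quad x_1 + x_3 \le c_2$
[Figure: example utility function $U_s(x_s)$ versus $x_s$ for an elastic application]
Q: What are possible allocations with, say, unit-capacity links?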
Optimization Problem
$\max_{x_s \ge 0} \sum_{s} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$ ("system" problem)
• maximize system utility (note: all sources "equal")
• constraint: bandwidth used less than capacity
• centralized solution to optimization impractical
  • must know all utility functions
  • impractical for large number of sources
• can we view congestion control as distributed asynchronous algorithms to solve this problem?
The user view
• User can choose amount to pay per unit time, $w_s$
• Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$
• $p_s$ could be viewed as charge per unit flow for user s
$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$ subject to $w_s \ge 0$ (user problem)
The first term is the user's utility, the second term is the cost.
The network view
• Suppose network knows vector $\{w_s\}$, chosen by users
• Network wants to maximize logarithmic utility function
$\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$ (network problem)
Solution existence
There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
• $\{w_s\}$ solves the user problem
• $\{x_s\}$ solves the network problem
• $\{x_s\}$ is the unique solution to the system problem
User problem: $\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s$ subject to $w_s \ge 0$
Network problem: $\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
System problem: $\max_{x_s \ge 0} \sum_{s} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$
Proportional Fairness
Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,
$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$
• Result: if $w_s = 1$ for all s, then $\{x_s\}$ solves the network problem if and only if it is proportionally fair
• Similar result exists for the case that $w_s$ is not equal to 1
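As a numerical check of the definition, the sketch below uses the three-source, two-link example from the model slide (constraints x1 + x2 ≤ c1 and x1 + x3 ≤ c2) with unit capacities, an assumption taken from the question posed there. With all weights equal to 1, the candidate allocation (1/3, 2/3, 2/3) passes the proportional-fairness inequality against randomly sampled feasible vectors, while the equal split (1/2, 1/2, 1/2) does not. The function names are hypothetical, and the check is Monte Carlo evidence rather than a proof.

```python
import random

def pf_sum(candidate, other):
    """Sum over sources of (other_s - candidate_s) / candidate_s."""
    return sum((o - c) / c for c, o in zip(candidate, other))

def random_feasible(rng):
    """Sample rates with x1 + x2 <= 1 and x1 + x3 <= 1 (unit-capacity links)."""
    x1 = rng.uniform(0.0, 1.0)
    return (x1, rng.uniform(0.0, 1.0 - x1), rng.uniform(0.0, 1.0 - x1))

def is_proportionally_fair(candidate, trials=10000, seed=0):
    # The inequality must hold for every feasible alternative; here it is
    # only tested on a random sample, with a small tolerance for rounding.
    rng = random.Random(seed)
    return all(pf_sum(candidate, random_feasible(rng)) <= 1e-9
               for _ in range(trials))
```

For the candidate (1/3, 2/3, 2/3) the sum simplifies to 1.5(x1 + x2) + 1.5(x1 + x3) - 3, which is at most 0 whenever both link constraints hold, so the sampled check agrees with the algebra.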
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
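One standard way to compute a max-min fair allocation is progressive filling (also called water-filling). The slides do not name it; it is swapped in here purely as an illustration. All rates grow together until some link saturates, the sources crossing that link are frozen, and the process repeats. The function name and the dictionary-based network description are assumptions of this sketch, and every source is assumed to cross at least one link.

```python
def max_min_fair(routes, capacity, eps=1e-12):
    """routes: source -> list of links used; capacity: link -> capacity."""
    rates = {s: 0.0 for s in routes}
    residual = dict(capacity)
    active = set(routes)
    while active:
        # Number of still-growing sources on each link.
        load = {l: sum(1 for s in active if l in routes[s]) for l in residual}
        # Largest equal increment before some loaded link saturates.
        step = min(residual[l] / n for l, n in load.items() if n > 0)
        for s in active:
            rates[s] += step
        for l, n in load.items():
            residual[l] -= step * n
        saturated = {l for l, n in load.items()
                     if n > 0 and residual[l] <= eps * capacity[l]}
        # Freeze every source that crosses a saturated link.
        active = {s for s in active if not saturated.intersection(routes[s])}
    return rates
```

On the two-link example with unit capacities (source 1 on both links, sources 2 and 3 on one link each), every source ends up with rate 1/2.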
Minimum potential delay fairness
Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
• Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; optimization problem is to minimize sum of transfer delays
Max-min Fairness
What is the corresponding utility function?
$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
Solving the network problem
• Results so far: existence - solution exists with given properties
• How to compute solution?
  • Ideally: distributed solution easily embodied in protocol
  • Should reveal insight into existing protocol
Solving the network problem
$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$
where $p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$
• left-hand side: change in bandwidth allocation at s
• $k\,w_s$ term: linear increase
• $k\,x_s(t) \sum_{l \in L(s)} p_l(t)$ term: multiplicative decrease
• $p_l(t)$: congestion "signal", a function of aggregate rate at link l, fed back to s
Solving the network problem
$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$
Results:
• converges to solution of relaxation of network problem
• $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
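A small numerical sketch can make the convergence result concrete. The code below integrates the rate equation with forward Euler steps on a two-link, three-source network (source 1 on both links, sources 2 and 3 on one link each). Everything not on the slides is an assumption chosen for illustration: the price function g_l(y) = (y / c_l)^4, the gain k = 1, the step size, the starting rates, and the function names.

```python
def link_prices(x, routes, capacity):
    """p_l = g_l(aggregate rate on l), with the assumed g_l(y) = (y / c_l)**4."""
    y = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
    return {l: (y[l] / capacity[l]) ** 4 for l in capacity}

def path_price(x, routes, capacity, s):
    """Sum of link prices along the path of source s."""
    p = link_prices(x, routes, capacity)
    return sum(p[l] for l in routes[s])

def simulate_primal(routes, capacity, w, k=1.0, dt=0.01, steps=5000):
    """Euler integration of dx_s/dt = k * (w_s - x_s * path price of s)."""
    x = {s: 0.1 for s in routes}                  # arbitrary positive start
    for _ in range(steps):
        p = link_prices(x, routes, capacity)      # prices from current rates
        for s in routes:
            q = sum(p[l] for l in routes[s])      # price fed back to source s
            x[s] += dt * k * (w[s] - x[s] * q)
    return x
```

With unit capacities and unit weights, the rates settle where each source's rate times its path price equals its weight, and the two-link source ends up with half the rate of the single-link sources, the same 1:2 ratio as the proportionally fair allocation. The aggregate rate on a link can slightly exceed its capacity here, since the price function only penalizes overload rather than forbidding it.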
Source Algorithm
$\dot{x}_r = k_r(x_r)\left(U_r'(x_r) - q_r\right)$
• Source needs only its path price
• $k_r(\cdot)$: nonnegative, nondecreasing function
• Above algorithm converges to unique solution for any initial condition
• $q_r$ interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is $U_r(x_r) = w_r \log x_r$
then a controller that implements it is given by $\dot{x}_r = k_r(x_r)\left(\frac{w_r}{x_r} - q_r\right)$
Pricing interpretation
• Can network choose pricing scheme to achieve fair resource allocation?
• Suppose network charges price $q_r$ ($/bit), where $q_r = \sum p_l$
• User's strategy: spend $w_r$ ($/sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$
then
$U(x) = -\frac{1}{Tx}$
interpretation: minimize (weighted) delay
Is AIMD special?
Consider a window control as follows:
• cwnd += a·cwnd^n when no loss
• cwnd -= b·cwnd^m when loss, where n < m
Expected change in congestion window
Expected change in rate per unit time
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n. For stability
Motivation
• Congestion Control: maximize user utility
  • Given routing $R_{li}$, how to adapt end rate $x_i$?
• Traffic Engineering: minimize network congestion
  • Given traffic $x_i$, how to perform routing $R_{li}$?
Congestion Control Model
$\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$, var. $x$
• objective: aggregate utility; source rate $x_i$, utility $U_i(x_i)$
• constraints: capacity constraints
• Users are indexed by i
Congestion control provides fair rate allocation amongst users
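Traffic Engineering Model
$\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$, var. $R$
• objective: aggregate cost; link utilization $u_l$, cost $f(u_l)$
• Links are indexed by l
Traffic engineering avoids bottlenecks in the network
[Figure: cost $f(u_l)$ versus link utilization $u_l$, with $u_l = 1$ marked]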
Model of Internet Reality
• Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$ (produces $x_i$)
• Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$ (produces $R_{li}$)
System Properties
• Convergence
• Does it achieve some objective?
• Benchmark: $\max \sum_i U_i(x_i)$ s.t. $Rx \le c$, var. $x, R$
  • Utility gap between the joint system and benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Delay-based TCP Vegas
• Uses delay as a signal of congestion
• Idea: try to keep a small constant number of packets at bottleneck queue
  • Expected = W/BaseRTT
  • Actual = W/CurRTT
  • Diff = Expected - Actual
  • Try to keep Diff between fixed thresholds 1 and 3
• More recent FAST TCP based on Vegas
• Delay-based TCP not widely used today
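The Vegas rule can be sketched as a once-per-RTT window update. This is a simplified illustration, not the Vegas implementation: the function name, the one-segment adjustment, and the conversion of Diff from a rate into a packet count (multiplying by BaseRTT so that it can be compared with the thresholds 1 and 3) are assumptions of the sketch.

```python
def vegas_update(cwnd, base_rtt, cur_rtt, alpha=1.0, beta=3.0):
    """Return the new window after one RTT, given the latest RTT sample."""
    expected = cwnd / base_rtt            # rate if nothing were queued
    actual = cwnd / cur_rtt               # rate actually achieved
    # Assumed conversion: extra packets this flow keeps in the bottleneck queue.
    diff = (expected - actual) * base_rtt
    if diff < alpha:
        return cwnd + 1.0                 # too little queued: speed up
    if diff > beta:
        return cwnd - 1.0                 # too much queued: slow down
    return cwnd                           # within the target band: hold
```

For example, with a window of 10 segments and BaseRTT of 100 ms, a measured RTT of 200 ms corresponds to about 5 queued packets, so the window shrinks by one.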
TCP-Friendliness
• Can we try MyFavNew TCP? Well, is it TCP-friendly?
• Any alternative congestion control scheme needs to coexist with TCP in FIFO queues in the best-effort Internet, or be isolated from TCP
• To co-exist with TCP, it must impose the same long-term load on the network
  • No greater long-term throughput as a function of packet loss and delay, so TCP doesn't suffer
  • Not significantly less long-term throughput, or it's not too useful
TCP friendly rate control (TFRC)
• Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
  • If transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate
  • Otherwise increase the transmission rate
• E.g., DCCP (Datagram Congestion Control Protocol) for unreliable congestion control
• Q: how to measure/use loss rate and RTT?
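The rate rule above can be sketched as follows. To keep the example short, the simple square-root throughput model (rate = segment size × sqrt(3/2) / (RTT × sqrt(p))) is swapped in for the fuller model with timeout terms that a real TFRC implementation would use. The function names and the increase step of one segment per RTT are assumptions of the sketch.

```python
import math

def tcp_model_rate(seg_bytes, rtt_s, loss_rate):
    """Assumed simple TCP model: throughput in bytes per second."""
    return seg_bytes * math.sqrt(1.5) / (rtt_s * math.sqrt(loss_rate))

def tfrc_step(rate, seg_bytes, rtt_s, loss_rate):
    """One adjustment of the sending rate (bytes per second)."""
    target = tcp_model_rate(seg_bytes, rtt_s, loss_rate)
    if rate > target:
        return target                     # above the model: drop to its rate
    # Otherwise increase; the step size here is an illustrative choice.
    return min(rate + seg_bytes / rtt_s, target)
```

The open question on the slide still applies: the sketch takes the loss rate and RTT as given and says nothing about how to measure or smooth them.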
High speed TCP
TCP in high speed networks
• Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput
• Requires window size W = 83333 in-flight segments
• Throughput in terms of loss rate: requires $p = 2 \cdot 10^{-10}$, or equivalently at most one drop every couple of hours
• New versions of TCP for high-speed networks needed
TCP's long recovery delay
• More than an hour to recover from a loss or timeout
[Figure: window versus time; labels ~41000 packets, ~60000 RTTs, ~100 minutes]
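The numbers in this example can be reproduced with a few lines of arithmetic. The sketch assumes the simple steady-state model W = sqrt(3/2) / sqrt(p) for the window as a function of loss rate, and an additive increase of one segment per RTT after a halving; both are assumptions of the example, as are the function names.

```python
def required_window(rate_bps, rtt_s, seg_bytes):
    """Segments that must be in flight to sustain rate_bps."""
    return rate_bps * rtt_s / (8 * seg_bytes)

def loss_rate_for_window(w):
    """Invert the assumed model W = sqrt(3/2) / sqrt(p)."""
    return 1.5 / (w * w)

rtt = 0.100
w = required_window(10e9, rtt, 1500)          # about 83,333 segments
p = loss_rate_for_window(w)                   # about 2e-10
# One loss per 1/p packets, sent at w / rtt packets per second.
secs_between_drops = (1.0 / p) / (w / rtt)
# After halving, W/2 RTTs of one-segment-per-RTT growth to get back to W.
recovery_minutes = (w / 2) * rtt / 60
```

Under these assumptions the time between drops comes out at roughly an hour and a half, and recovery from a single halving takes about 69 minutes, consistent with "more than an hour".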
High-speed TCP
• Proposals: Scalable TCP, HSTCP, FAST, CUBIC
• General idea is to use superlinear window increase
• Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
• Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
• additive increase, multiplicative decrease
• increments, decrements depend on window size
Scalable TCP (STCP) [T. Kelly]
• multiplicative increase, multiplicative decrease
• W ← W + a per ACK
• W ← W - b·W per window with loss
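The two STCP rules can be sketched directly, together with a small helper that counts how many RTTs it takes to regain the pre-loss window. The constants a = 0.01 and b = 0.125 are values commonly quoted for Scalable TCP but are not given on the slide, so they are assumptions here, as are the function names and the approximation that about W ACKs arrive per RTT.

```python
def stcp_on_ack(w, a=0.01):
    """Window update per ACK: W <- W + a."""
    return w + a

def stcp_on_loss(w, b=0.125):
    """Window update per window with loss: W <- W - b*W."""
    return w - b * w

def rtts_to_recover(w, a=0.01, b=0.125):
    """RTTs until the window returns to w after one loss event."""
    target = w
    w = stcp_on_loss(w, b)
    rtts = 0
    while w < target:
        # About w ACKs arrive per RTT, each adding a, so the window
        # grows by a factor of (1 + a) per RTT.
        w += a * w
        rtts += 1
    return rtts
```

Because the increase is multiplicative per RTT, the recovery time does not depend on the window size: with these constants it is 14 RTTs whether the window is 1,000 or 100,000 segments.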
STCP dynamics
[Figure: STCP dynamics. From 1st PFLDnet Workshop, Tom Kelly]
Active Queue Management
Router Queue Management
• normally, packets dropped only when queue overflows: "drop-tail" queueing
[Figure: router, Internet, router; inside a router, packets P1 to P6 wait in an FCFS queue served by the scheduler]
The case against drop-tail queue management
• Large queues in routers are "a bad thing"
  • Delay: end-to-end latency dominated by length of queues at switches in network
• Allowing queues to overflow is "a bad thing"
  • Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  • Utilization: connections can synchronize their response to congestion
[Figure: FCFS queue at the scheduler holding P1 to P4; P5 and P6 shown outside the queue]
Idea: early random packet drop
• When queue length exceeds threshold, drop packets with queue-length-dependent probability
  • probabilistic packet drop: flows see same loss rate
  • problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
[Figure: FCFS queue with packets P1 to P6 at the scheduler]
Random early detection (RED) packet drop
• Use exponential average of queue length to determine when to drop
  • avoid overly penalizing short-term bursts
  • react to longer-term trends
• Tie drop prob. to weighted avg. queue length
  • avoids over-reaction to mild overload conditions
[Figure: average queue length over time against min threshold, max threshold and max queue length; regions: no drop, probabilistic early drop, forced drop]
[Figure: drop probability (up to max_p, then 100%) versus weighted average queue length, with min and max thresholds marked]
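The RED rule described above (an exponential average of the queue length, with the drop probability tied to that average) can be sketched as two small functions. The linear ramp between the thresholds follows the usual description of RED; the averaging weight and the function names are assumptions of this sketch, and refinements such as spacing out drops are left out.

```python
def update_average(avg, queue_len, weight=0.002):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - weight) * avg + weight * queue_len

def drop_probability(avg, min_th, max_th, max_p):
    """Drop probability as a function of the average queue length."""
    if avg < min_th:
        return 0.0                        # no drop
    if avg >= max_th:
        return 1.0                        # forced drop
    # Probabilistic early drop: rises linearly from 0 to max_p.
    return max_p * (avg - min_th) / (max_th - min_th)
```

With a small weight, a short burst barely moves the average (a jump from an empty queue to 100 packets raises it by only 0.2), which is how RED avoids over-penalizing bursts.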
RED summary: why random drop?
• Provide gentle transition from no-drop to all-drop
  • Provide "gentle" early warning
• Avoid synchronized loss bursts among sources
• Provide same loss rate to all sessions
  • With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
• Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
• Gains over drop-tail FCFS not that significant
• Still not widely deployed ...
Why is randomization important?
• Synchronization of periodic routing updates
• Periodic losses observed in end-end Internet traffic
source: Floyd, Jacobson 1994
Router update operation
[State diagram, labels in order:]
• prepare own routing update (time TC)
• receive update from neighbor: process (time TC2)
• wait
• receive update from neighbor: process
• <ready>: send update (time Td to arrive at dest), start_timer (uniform Tp ± Tr)
• timeout or link-fail update
Time spent in state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
• 20 (simulated) routers broadcasting updates to each other
• x-axis: time until routing update sent, relative to start of round
• By t = 100000, all router rounds are of length 120
• synchronization, or lack thereof, depends on system parameters
Avoiding synchronization
• Choose random timer component Tr large (e.g., several multiples of TC)
• Add enough randomization to avoid synchronization
[State diagram: prepare own routing update (time TC); receive update from neighbor: process (time TC2); wait; receive update from neighbor: process; <ready>: send update (time Td to arrive), start_timer (uniform Tp ± Tr)]
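The randomized timer itself is a one-liner; the sketch below only shows the start_timer step, with the delay drawn uniformly from [Tp - Tr, Tp + Tr]. The function name is hypothetical, and the sketch does not model the coupling between routers that causes synchronization in the first place.

```python
import random

def next_timer(tp, tr, rng=random):
    """Delay until the next update, drawn uniformly from [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)
```

With tr = 0 every router waits exactly tp, so routers that have fallen into step stay in step; a larger tr spreads their next updates across a window of width 2·tr.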
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport: TCP Nice
What are background transfers?
• Data that humans are not waiting for
• Non-deadline-critical
• Unlimited demand
Examples:
• Prefetched traffic on the Web
• File system backup
• Large-scale data distribution services
• Background software updates
• Media file sharing
Desired Properties
• Utilization of spare network capacity
• No interference with regular transfers
  • Self-interference: applications hurt their own performance
  • Cross-interference: applications hurt other applications' performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP
TCP does network resource management Need flow prioritization
Alternative router prioritization + More responsive simple one bit priority Hard to deploy
Question Can end-to-end congestion control achieve non-
interference and utilization
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal Congestion incr queue lengths incr RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control Receiver and network unchanged TCP friendly
TCP Nice
Basic algorithm 1 Early Detection thresh queue length incr in RTT 2 Multiplicative decrease on early congestion 3 Allow cwnd lt 10 (despite no loss)
per-ack operation if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++
per-round operation if(numCong gt fW) W W2 else hellip AIMD congestion control
Nice the works
Non-interference getting out of the way in time Utilization maintaining a small queue
pkts
minRTT = τ13 maxRTT = τ+Βmicro13
B
tB Add Mul +
micro
Reno
Nice Add Add Add
Mul +
Mul +
Network Conditions
01
1
10
100
1e3
1 10 100 Fore
grou
nd D
ocum
ent L
aten
cy (s
ec)
Spare Capacity
Reno
Vegas
V0
Nice
Router Prio
Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity
Scalability
01
1
10
100
1e3
1 10 100
Doc
umen
t Lat
ency
(sec
)
Num BG flows
Vegas
V0
Nice
Router Prio
Reno
W lt 1 allows Nice to scale to any number of background flows
Utilization
0
2e4
4e4
6e4
8e4
1 10 100
BG
Thr
ough
put (
KB
)
Num BG flows
Router Prio
Vegas
V0
Reno
Nice
Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation
How to model the interaction between TCP and the network Recall PFTK like models assumed network
conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem How to allocate resources (eg bandwidth) to
optimize some objective function Maybe not possible to obtain exact optimality but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
c1 c2
Model Network Links l each of capacity cl Sources s (L(s) Us(xs))
L(s) - links used by source s Us(xs) - utility if source rate = xs
x1
x2 x3
121 cxx le+ 231 cxx le+
Us(xs)
xs
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose network knows vector ws chosen by users Network wants to maximize logarithmic utility function
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
network problem
Solution existence
There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that Ws solves user
problem Xs solves the
network problem Xs is the unique
solution to the system problem
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
0 wsubject to
w Umax
s
ss
ge
minus⎟⎟⎠
⎞⎜⎜⎝
⎛s
s
wp
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0
Proportional Fairness
Vector of rates xs proportionally fair if feasible and for any other feasible vector xs
0
leminus
sumisinSs s
ss
xxx
Result if wr=1 then Xs solves the network problem IFF it is proportionally fair
Similar result exists for the case that wr not equal 1
Max-min Fairness
Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
Minimum potential delay fairness
Rates xr are minimum potential delay fair if Ur (xr) = -wrxr
Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays
Max-min Fairness
rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
What is corresponding utility function
α
α
α minus=
minus
infinrarr 1lim)(
1r
rrxxU
Solving the network problem Results so far existence - solution exists
with given properties How to compute solution
Ideally distributed solution easily embodied in protocol
Should reveal insight into existing protocol
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
congestion ldquosignalrdquo function of aggregate rate at link l fed back to s
change in bandwidth
allocation at s
linear increase
multiplicative decrease
⎟⎟⎠
⎞⎜⎜⎝
⎛= sum
isin
)()()(txgtp
sLlsllwhere
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
Results converges to solution of relaxation of network
problem xs(t)Σpl(t) converges to ws
Interpretation TCP-like algorithm to iteratively solves optimal rate allocation
Source Algorithm
Source needs only its path price
kr() nonnegative nondecreasing function Above algorithm converges to unique
solution for any initial condition qr interpreted as lossmarking probability euro
˙ x r = kr (xr )(Ur (xr ) minus qr)
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can network choose pricing scheme to achieve fair resource allocation
Suppose network charges price qr ($bit) where qr=sum pl
Userrsquos strategy spend wr ($sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
then
interpretation minimize (weighted) delay
pTpTp
x 2)1(2asymp
minus=
TxxU 1)( minus=
Is AIMD special
Consider a window control as follows: cwnd += a·cwnd^n when no loss; cwnd -= b·cwnd^m when loss, where n < m.
Expected change in congestion window
Expected change in rate per unit time
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n. For stability
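To make the (n, m) family concrete, here is a small sketch of the expected window drift and the window at which it vanishes. It assumes the update is applied per packet and that each packet is lost independently with probability p; neither assumption is spelled out in the slide, and the function names are illustrative.

```python
def expected_drift(w, a, n, b, m, p):
    """Expected per-packet window change when each packet is lost with probability p."""
    return (1.0 - p) * a * w ** n - p * b * w ** m


def equilibrium_window(a, n, b, m, p):
    """Window at which the expected drift is zero (requires n < m)."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))


# Reno-style AIMD applied per ACK: cwnd += 1/cwnd, cwnd -= cwnd/2  (n = -1, m = 1).
aimd = equilibrium_window(a=1.0, n=-1, b=0.5, m=1, p=0.01)
# A multiplicative-increase choice: cwnd += a per ACK, cwnd -= b*cwnd  (n = 0, m = 1).
mimd = equilibrium_window(a=0.01, n=0, b=0.125, m=1, p=0.01)
print(aimd, mimd)
```

With n = -1 and m = 1 the equilibrium reduces to sqrt(2(1-p)/p), the same square-root dependence on p as in the simplified Reno model, while n = 0 and m = 1 gives a window proportional to 1/p.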
Motivation
Congestion Control: maximize user utility. Given routing R_li, how to adapt the end rate x_i?
Traffic Engineering: minimize network congestion. Given traffic x_i, how to perform routing R_li?
Congestion Control Model
maximize Σ_i U_i(x_i) (aggregate utility)
subject to Σ_i R_li x_i ≤ c_l (capacity constraints)
variables: x
Source rate: x_i; utility: U_i(x_i)
Users are indexed by i.
Congestion control provides fair rate allocation amongst users.
Traffic Engineering Model
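minimize Σ_l f(u_l) (aggregate cost)
subject to u_l = Σ_i R_li x_i / c_l
variables: R
Link utilization: u_l; cost: f(u_l)
Links are indexed by l.
Traffic engineering avoids bottlenecks in the network.
[Figure: cost f(u_l) against link utilization u_l, with u_l = 1 marked]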
Model of Internet Reality
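Congestion control produces x_i; traffic engineering produces R_li.
Congestion Control: maximize Σ_i U_i(x_i) subject to Σ_i R_li x_i ≤ c_l
Traffic Engineering: minimize Σ_l f(u_l) subject to u_l = Σ_i R_li x_i / c_l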
System Properties
Convergence
Does it achieve some objective?
Benchmark: maximize Σ_i U_i(x_i) subject to Rx ≤ c, variables x, R
Utility gap between the joint system and the benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
TCP friendly rate control (TFRC)
Use a model of TCP's throughput, as a function of the loss rate and RTT, directly in a congestion control algorithm.
If the transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate.
Otherwise, increase the transmission rate.
E.g., DCCP (Datagram Congestion Control Protocol), for congestion control without reliability.
Q: how to measure/use loss rate and RTT?
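The rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the throughput model is the simplified square-root formula (segment size over RTT, times sqrt(3/(2p))) rather than a full PFTK-style model, and the one-segment-per-RTT increase step is a hypothetical choice, not the behaviour of any particular TFRC implementation.

```python
import math


def tcp_model_rate(seg_bytes, rtt_s, loss_rate):
    """Simplified square-root TCP throughput model, in bytes per second."""
    return seg_bytes * math.sqrt(1.5) / (rtt_s * math.sqrt(loss_rate))


def tfrc_step(current_rate, seg_bytes, rtt_s, loss_rate):
    """One control step: clamp down to the model rate, otherwise probe upward."""
    target = tcp_model_rate(seg_bytes, rtt_s, loss_rate)
    if current_rate > target:
        return target
    # Hypothetical increase rule: one more segment per RTT, capped at the model rate.
    return min(target, current_rate + seg_bytes / rtt_s)


# A sender above the model rate is pulled down; one below it ramps up gradually.
print(tfrc_step(1e6, 1500, 0.1, 0.01))
print(tfrc_step(1000.0, 1500, 0.1, 0.01))
```

The sketch takes the loss rate and RTT as given; how to estimate them smoothly is exactly the open question the slide raises.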
High speed TCP
TCP in high speed networks
Example: 1500-byte segments, 100 ms RTT, want 10 Gbps throughput.
Requires window size W = 83333 in-flight segments.
In terms of loss rate, this throughput requires p = 2·10^-10, or equivalently at most one drop every couple of hours.
New versions of TCP for high-speed networks are needed!
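The numbers in this example can be reproduced with a short calculation. The sketch below assumes the common square-root approximation W ≈ 1.22 / sqrt(p) for standard TCP; the constant 1.22 and the helper names are assumptions, not taken from the slides.

```python
def required_window(rate_bps, rtt_s, seg_bytes):
    """Segments that must be in flight to sustain rate_bps over one RTT."""
    return rate_bps * rtt_s / (seg_bytes * 8)


def required_loss_rate(window_segments, c=1.22):
    """Invert W = c / sqrt(p) to get the loss rate that allows this window."""
    return (c / window_segments) ** 2


def seconds_between_losses(loss_rate, window_segments, rtt_s):
    """Average time between drops when sending window_segments per RTT."""
    packets_per_second = window_segments / rtt_s
    return 1.0 / (loss_rate * packets_per_second)


w = required_window(10e9, 0.1, 1500)
p = required_loss_rate(w)
gap = seconds_between_losses(p, w, 0.1)
print(round(w), p, gap / 3600)
```

Under these assumptions the window comes out at about 83333 segments, the tolerable loss rate at roughly 2·10^-10, and the gap between drops at between one and two hours, in line with the figures quoted above.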
TCP's long recovery delay
More than an hour to recover from a loss or timeout
[Figure: TCP window over time after a loss; annotations: ~41000 packets, ~60000 RTTs, ~100 minutes]
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use a superlinear window increase.
Particularly useful in high bandwidth-delay product regimes.
Alternate choices of response functions
Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
additive increase, multiplicative decrease
increments and decrements depend on window size
Scalable TCP (STCP) [T. Kelly]
multiplicative increase, multiplicative decrease
W ← W + a per ACK
W ← W - bW per window with loss
STCP dynamics
[Figure: STCP window dynamics. From 1st PFLDnet Workshop, Tom Kelly]
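The two STCP update rules are easy to simulate. The sketch below counts how many RTTs the window needs to climb back to its pre-loss value, assuming roughly one ACK per in-flight segment per RTT. The defaults a = 0.01 and b = 0.125 are the values usually quoted for Scalable TCP and should be treated as assumptions here; the Reno comparison assumes the usual halve-then-add-one-per-RTT behaviour.

```python
def stcp_on_ack(w, a=0.01):
    """W <- W + a for every ACK."""
    return w + a


def stcp_on_loss(w, b=0.125):
    """W <- W - b*W once per window that saw a loss."""
    return w - b * w


def stcp_recovery_rtts(w_before_loss, a=0.01, b=0.125):
    """RTTs needed to regain the pre-loss window after a single loss event."""
    w = stcp_on_loss(w_before_loss, b)
    rtts = 0
    while w < w_before_loss:
        for _ in range(int(w)):  # roughly one ACK per in-flight segment per RTT
            w = stcp_on_ack(w, a)
        rtts += 1
    return rtts


def reno_recovery_rtts(w_before_loss):
    """Reno halves the window and then regains one segment per RTT."""
    return w_before_loss / 2


print(stcp_recovery_rtts(1000), stcp_recovery_rtts(83333), reno_recovery_rtts(83333))
```

Because each RTT multiplies the window by about (1 + a), the STCP recovery time is essentially independent of the window size, whereas Reno's grows linearly with it.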
Active Queue Management
Router Queue Management
Normally, packets are dropped only when the queue overflows: "drop-tail" queueing.
[Figure: packets P1 to P6 in a router queue served by an FCFS scheduler, on the path between the Internet and the next router]
The case against drop-tail queue management
Large queues in routers are "a bad thing"
Delay: end-to-end latency is dominated by the length of queues at switches in the network.
Allowing queues to overflow is "a bad thing"
Fairness: connections transmitting at high rates can starve connections transmitting at low rates.
Utilization: connections can synchronize their response to congestion.
Idea: early random packet drop
When the queue length exceeds a threshold, drop packets with a queue-length-dependent probability.
Probabilistic packet drop: flows see the same loss rate.
Problem: bursty traffic (a burst arrives when the queue is near the threshold) can be over-penalized.
Random early detection (RED) packet drop
Use an exponential average of the queue length to determine when to drop: avoids overly penalizing short-term bursts; reacts to longer-term trends.
Tie the drop probability to the weighted average queue length: avoids over-reaction to mild overload conditions.
[Figure: average queue length over time against the min and max thresholds. Below the min threshold: no drop. Between the thresholds: probabilistic early drop. Above the max threshold: forced drop.]
[Figure: drop probability versus weighted average queue length. Zero below min, rising to max_p at max, then 100%.]
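A compact sketch of the RED drop decision described above. It assumes a linear ramp from 0 to max_p between the two thresholds and an EWMA weight of 0.002; both are illustrative choices, and the sketch leaves out refinements of the full algorithm such as spacing drops by a packet count or handling idle periods.

```python
import random


def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """No drop below min_th, forced drop at or above max_th, linear ramp in between."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


class RedQueue:
    def __init__(self, min_th, max_th, max_p, weight=0.002, rng=None):
        self.min_th, self.max_th, self.max_p = min_th, max_th, max_p
        self.weight = weight  # EWMA gain: small values ignore short bursts
        self.avg = 0.0
        self.rng = rng or random.Random()

    def should_drop(self, queue_len):
        """Update the average with the instantaneous queue length, then decide."""
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        p = red_drop_probability(self.avg, self.min_th, self.max_th, self.max_p)
        return self.rng.random() < p


red = RedQueue(min_th=5, max_th=15, max_p=0.1)
burst_drops = [red.should_drop(100) for _ in range(10)]
print(burst_drops, red.avg)
```

A ten-packet burst that briefly fills the queue barely moves the average, so none of those packets is dropped; only sustained congestion pushes the average past the min threshold.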
RED summary: why random drop?
Provide a gentle transition from no-drop to all-drop: a "gentle" early warning.
Avoid synchronized loss bursts among sources.
Provide the same loss rate to all sessions: with tail-drop, low-sending-rate sessions can be completely starved.
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
[State diagram:
prepare own routing update (time T_C); receiving an update from a neighbor while preparing: process it (time T_C2)
<ready>: send update (time T_d to arrive at dest), start_timer (uniform T_p +/- T_r)
wait; receiving an update from a neighbor while waiting: process it
timeout or link fail: prepare a new update]
Time spent in a state depends on messages received from others (weak coupling between routers' processing).
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update sent, relative to start of round
By t = 100000, all router rounds are of length 120!
Synchronization, or lack thereof, depends on system parameters.
Avoiding synchronization
Choose the random timer component T_r large (e.g., several multiples of T_C).
Add enough randomization to avoid synchronization.
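The timer rule amounts to drawing each update interval uniformly from T_p +/- T_r, with T_r set to a few multiples of the processing time T_C. A minimal sketch, in which the function names and the numeric values are purely illustrative:

```python
import random


def jitter_width(t_c, multiple=5):
    """Pick T_r as several multiples of the per-update processing time T_C."""
    return multiple * t_c


def next_update_delay(t_p, t_r, rng=random):
    """Draw the next timer value uniformly from [T_p - T_r, T_p + T_r]."""
    return rng.uniform(t_p - t_r, t_p + t_r)


t_r = jitter_width(t_c=0.1)
delays = [next_update_delay(120.0, t_r) for _ in range(5)]
print(t_r, delays)
```

Each router drawing its own independent delay is what keeps the weak coupling between routers from pulling their timers into lockstep.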
Randomization
Takeaway message: randomization makes a system simple and robust.
Background transport: TCP Nice
What are background transfers?
Data that humans are not waiting for
Non-deadline-critical
Unlimited demand
Examples: prefetched traffic on the Web; file system backup; large-scale data distribution services; background software updates; media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
Self-interference: applications hurt their own performance
Cross-interference: applications hurt other applications' performance
TCP Nice
Goal: abstraction of free infinite bandwidth
Applications say what they want; OS manages resources and scheduling
Self-tuning transport layer
Reduces risk of interference with foreground traffic
Significant utilization of spare capacity by background traffic
Simplifies application design
Why change TCP?
TCP does network resource management
Need flow prioritization
Alternative: router prioritization
+ More responsive, simple one-bit priority
- Hard to deploy
Question: can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal: congestion → increased queue lengths → increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control: receiver and network unchanged
TCP friendly
TCP Nice
Basic algorithm
1. Early detection: threshold on queue length, detected as an increase in RTT
2. Multiplicative decrease on early congestion
3. Allow cwnd < 1.0 (despite no loss)
Per-ack operation: if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
Per-round operation: if (numCong > f * W) W ← W/2, else ... AIMD congestion control
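The per-ack and per-round operations can be sketched as a small sender-side class. This is a hedged illustration rather than the TCP Nice implementation: the defaults threshold = 0.2 and fraction = 0.5, the class and method names, and the plain add-one-per-round fallback are all assumptions.

```python
class NiceSender:
    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold  # how far above minRTT counts as congestion
        self.fraction = fraction    # f: share of the window that must see it
        self.cwnd = cwnd
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack: count ACKs whose RTT signals a building queue."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self, loss=False):
        """Per-round: halve on early congestion or loss, else additive increase."""
        if loss or self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0  # may fall below 1.0: less than one packet per RTT
        else:
            self.cwnd += 1.0
        self.num_cong = 0


sender = NiceSender(cwnd=8.0)
sender.on_ack(0.10)  # establishes minRTT
sender.on_ack(0.20)  # establishes maxRTT, counts as one congestion signal
sender.on_round_end()
after_quiet_round = sender.cwnd
for _ in range(5):
    sender.on_ack(0.19)
sender.on_round_end()
after_congested_round = sender.cwnd
print(after_quiet_round, after_congested_round)
```

A window below 1.0 would be realized by sending one packet every 1/cwnd round-trip times, which is what lets many background flows share a link without crowding out foreground traffic.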
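Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = τ, maxRTT = τ + B/μ
[Figure: queue occupancy (pkts) over time for Reno and Nice, marking buffer size B, threshold t·B, service rate μ, and the additive (Add) and multiplicative (Mul) phases of each protocol]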
Network Conditions
[Figure: foreground document latency (sec, log scale from 0.1 to 1e3) versus spare capacity (1 to 100) for Reno, Vegas, V0, Nice, and Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity.
Scalability
[Figure: document latency (sec, log scale from 0.1 to 1e3) versus number of background flows (1 to 100) for Vegas, V0, Nice, Router Prio, and Reno]
W < 1 allows Nice to scale to any number of background flows.
Utilization
[Figure: background throughput (KB, 0 to 8e4) versus number of background flows (1 to 100) for Router Prio, Vegas, V0, Reno, and Nice]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from foreground traffic.
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network? Recall that PFTK-like models assumed network conditions are not affected by (a single) TCP flow.
Optimization-based approach towards congestion control
Resource allocation as an optimization problem: how to allocate resources (e.g., bandwidth) to optimize some objective function
Maybe not possible to obtain exact optimality, but...
optimization framework as a means to explicitly steer the network towards a desirable operating point
practical congestion control as distributed asynchronous implementations of an optimization algorithm
systematic approach towards protocol design
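Model
Network: links l, each of capacity c_l
Sources s: (L(s), U_s(x_s))
L(s): links used by source s
U_s(x_s): utility if source rate = x_s
[Figure: two links in series with capacities c_1 and c_2; source 1 (rate x_1) crosses both links, source 2 (rate x_2) uses only the first, source 3 (rate x_3) uses only the second]
x_1 + x_2 ≤ c_1, x_1 + x_3 ≤ c_2
[Figure: U_s(x_s) versus x_s, an example utility function for an elastic application]
Q: What are possible allocations with, say, unit-capacity links?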
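One way to explore the question is to compare a few allocations numerically. The sketch below assumes unit capacities and considers only allocations that fill both links, which loses nothing for the three objectives tried since each is nondecreasing in x_2 and x_3. The grid search and the choice of objectives are illustrative.

```python
import math


def allocation(x1, c1=1.0, c2=1.0):
    """Sources 2 and 3 take whatever source 1 leaves on links 1 and 2."""
    return (x1, c1 - x1, c2 - x1)


def best_allocation(objective, steps=3000):
    """Grid search over x1 in [0, 1] for the allocation maximizing objective."""
    candidates = (allocation(i / steps) for i in range(steps + 1))
    return max(candidates, key=objective)


def log_utility(x):
    """Sum of log rates; minus infinity if any source gets nothing."""
    return sum(math.log(r) for r in x) if min(x) > 0 else -math.inf


proportional = best_allocation(log_utility)  # maximizes the sum of log rates
max_min = best_allocation(min)               # maximizes the smallest rate
max_throughput = best_allocation(sum)        # maximizes the total rate
print(proportional, max_min, max_throughput)
```

The three objectives pick three different points: the log objective gives the two-hop source about 1/3, the max-min objective gives every source 1/2, and maximizing total throughput starves the two-hop source entirely.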
Optimization Problem
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$
"system" problem
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical: must know all utility functions; impractical for a large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
The user view
User can choose the amount to pay per unit time, w_s
Would like allocated bandwidth x_s in proportion to w_s: x_s = w_s / p_s
p_s could be viewed as the charge per unit flow for user s
$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
U_s(w_s / p_s): user's utility; w_s: cost
user problem
The network view
Suppose the network knows the vector {w_s} chosen by the users
Network wants to maximize a logarithmic utility function
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
network problem
Solution existence
There exist prices p_s, source rates x_s, and amounts to pay per unit time w_s = p_s x_s such that
w_s solves the user problem:
$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
{x_s} solves the network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
{x_s} is the unique solution to the system problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$
Proportional Fairness
A vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s^*},
$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$
Result: if w_r = 1, then {x_s} solves the network problem if and only if it is proportionally fair.
A similar result exists for the case that w_r is not equal to 1.
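The definition can be checked directly on a small example. The sketch below assumes the two-link, three-source network with unit capacities used earlier in this lecture and tests a candidate allocation against a handful of feasible alternatives. Passing against a finite list is only a necessary condition, not a proof of proportional fairness, and the alternatives chosen here are arbitrary.

```python
def proportional_gain(x, other):
    """Sum over sources of (other_s - x_s) / x_s: the aggregate proportional change."""
    return sum((o - r) / r for r, o in zip(x, other))


def is_proportionally_fair(x, alternatives, tol=1e-12):
    """True if no listed alternative yields a positive aggregate proportional change."""
    return all(proportional_gain(x, alt) <= tol for alt in alternatives)


candidate = (1 / 3, 2 / 3, 2 / 3)
alternatives = [(0.5, 0.5, 0.5), (0.0, 1.0, 1.0), (0.2, 0.8, 0.8), (0.3, 0.3, 0.3)]
print(is_proportionally_fair(candidate, alternatives))
print(is_proportionally_fair((0.5, 0.5, 0.5), [candidate]))
```

Moving from the equal split (1/2, 1/2, 1/2) to the candidate gives a positive aggregate proportional change, so the equal split fails the test, while the candidate passes against every alternative listed.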
TCP in high speed networks
Example 1500 byte segments 100ms RTT want 10 Gbps throughput
Requires window size W = 83333 in-flight segments Throughput in terms of loss rate
13 p = 210-10 or equivalently at most one drop every couple hours
New versions of TCP for high-speed networks needed
TCPrsquos long recovery delay
More than an hour to recover from a loss or timeout
~41000 packets
~60000 RTTs ~100 minutes
High-speed TCP
Proposals Scalable TCP HSTCP FAST CUBIC General idea is to use superlinear window
increase Particularly useful in high bandwidth-delay
product regimes
Alternate choices of response functions
Scalable TCP - S = 015p
Q Whatever happened to TCP-friendly
High speed TCP [Floyd]
additive increase multiplicative decrease
increments decrements depend on window size
Scalable TCP (STCP) [T Kelly]
multiplicative increase multiplicative decrease
W larr W + a per ACK W larr W ndash b W per window with loss
STCP dynamics
From 1st PFLDnet Workshop Tom Kelly13
Active Queue Management
Router Queue Management
normally packets dropped only when queue overflows ldquodrop-tailrdquo queueing
router Internet
P113P213P313P413P513P613FCFS13
Scheduler13
router
The case against drop-tail queue management
Large queues in routers are ldquoa bad thingrdquo Delay end-to-end latency dominated by length
of queues at switches in network Allowing queues to overflow is ldquoa bad thingrdquo
Fairness connections transmitting at high rates can starve connections transmitting at low rates
Utilization connections can synchronize their response to congestion
P113P213P313P413FCFS
Scheduler P513P613
Idea early random packet drop
When queue length exceeds threshold drop packets with queue length dependent probability probabilistic packet drop flows see same loss
rate problem bursty traffic (burst arrives when
queue is near threshold) can be over penalized
P113P213P313P413P513P613FCFS
Scheduler
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts react to longer term trends
Tie drop prob to weighted avg queue length avoids over-reaction to mild overload conditions
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
Random early detection (RED) packet drop
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
10013
Drop probability
maxp13
Weighted AverageQueue Length
min13 max13
RED summary why random drop
Provide gentle transition from no-drop to all-drop Provide ldquogentlerdquo early warning Avoid synchronized loss bursts among
sources Provide same loss rate to all sessions
With tail-drop low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed hellip
Why randomization important
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source Floyd Jacobson 1994
Router update operation
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail
update
time spent in state depends on msgs
received from others (weak coupling
between routers processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization Choose random
timer component Tr large (eg several multiples of TC)
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough
randomization to avoid
synchronization
Randomization
Takeaway message randomization makes a system simple and
robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP?
TCP does network resource management
Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: Can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal
  Congestion → increased queue lengths → increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm
  1. Early detection → threshold queue length, increase in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)
per-ack operation:
  if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++;
per-round operation:
  if (numCong > f * W) W ← W / 2;
  else { … AIMD congestion control }
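The per-ack and per-round operations above can be sketched as follows. This is only an illustration: the class name, the default values of threshold and fraction (f), and the plain additive increase in the else branch are assumptions, and loss handling is omitted.

```python
from dataclasses import dataclass

@dataclass
class NiceSender:
    """Per-ack / per-round bookkeeping following the slide's pseudocode."""
    cwnd: float = 1.0
    min_rtt: float = float("inf")
    max_rtt: float = 0.0
    threshold: float = 0.2   # assumed value; fraction of the RTT range
    fraction: float = 0.5    # assumed value for f
    num_cong: int = 0

    def on_ack(self, cur_rtt: float) -> None:
        # Track the RTT range seen so far, then count "early congestion" acks.
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self) -> None:
        # Halve the window if more than f * W acks were delayed;
        # otherwise fall back to additive increase (loss handling omitted).
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0      # may drop below 1.0 by design
        else:
            self.cwnd += 1.0
        self.num_cong = 0
```

Letting cwnd fall below 1.0 corresponds to step 3 of the basic algorithm.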
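Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = τ, maxRTT = τ + B/μ
[Figure: queue length (pkts) vs. time t, with buffer size B and service rate μ, comparing Reno and Nice through their additive-increase (Add) and multiplicative-decrease (Mul) phases]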
Network Conditions
[Figure: foreground document latency (sec, log scale) vs. spare capacity; curves for Reno, Vegas, V0, Nice, Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec, log scale) vs. number of background (BG) flows; curves for Vegas, V0, Nice, Router Prio, Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: background (BG) throughput (KB) vs. number of BG flows; curves for Router Prio, Vegas, V0, Reno, Nice]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from foreground (FG) traffic
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
  Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function
  Maybe not possible to obtain exact optimality, but:
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
    systematic approach towards protocol design
Model
Network: links l, each of capacity c_l
Sources s: (L(s), U_s(x_s))
  L(s) - links used by source s
  U_s(x_s) - utility if source rate = x_s
[Figure: two links with capacities c_1 and c_2, carrying source rates x_1, x_2, x_3]
$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$
[Figure: U_s(x_s) vs. x_s, example utility function for elastic application]
Q: What are possible allocations with, say, unit-capacity links?
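One way to explore the question is sketched below for unit-capacity links, assuming both links are kept full so that x_2 = x_3 = 1 - x_1. The helper names and the grid search are illustrative; the three candidates shown are examples, not an exhaustive list of feasible allocations.

```python
import math

def allocation(x1):
    """Rates (x1, x2, x3) that fill both unit-capacity links."""
    return (x1, 1.0 - x1, 1.0 - x1)

def total_rate(x):
    return sum(x)

def log_utility(x):
    return sum(math.log(v) for v in x)

# Scan x1 over a grid and pick the allocation with the largest sum of logs.
grid = [i / 1000 for i in range(1, 1000)]
best_x1 = max(grid, key=lambda x1: log_utility(allocation(x1)))

candidates = {
    "max throughput": (0.0, 1.0, 1.0),      # starves source 1
    "equal rates": allocation(0.5),
    "max sum of logs": allocation(best_x1),
}
```

The grid search lands near x_1 = 1/3, which is where the derivative of log x_1 + 2 log(1 - x_1) vanishes.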
Optimization Problem
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
"system" problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$
The user view
User can choose amount to pay per unit time, w_s
Would like allocated bandwidth x_s in proportion to w_s:
$$x_s = \frac{w_s}{p_s}$$
p_s could be viewed as charge per unit flow for user s
user problem:
$$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
first term: user's utility; second term: cost
The network view
Suppose network knows vector {w_s}, chosen by users
Network wants to maximize logarithmic utility function
network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
Solution existence
There exist prices p_s, source rates x_s, and amount-to-pay-per-unit-time w_s = p_s x_s such that
  {w_s} solves the user problem
  {x_s} solves the network problem
  {x_s} is the unique solution to the system problem
network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
user problem:
$$\max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
system problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$
Proportional Fairness
Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s^*},
$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$
Result: if w_s = 1, then {x_s} solves the network problem if and only if it is proportionally fair
A similar result exists for the case that w_s is not equal to 1
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s for some s, then there exists p such that x_p ≤ x_s and y_p < x_p
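A max-min fair allocation can be computed by progressive filling: raise all rates together and freeze a source as soon as one of its links saturates. The sketch below is a standard textbook procedure rather than something taken from the slides; the function name and the route/capacity encoding are assumptions, and every source is assumed to use at least one link.

```python
def max_min_fair(routes, capacity):
    """Progressive filling: raise all unfrozen rates equally until a link fills.

    routes[s] is the list of links used by source s; capacity[l] is c_l.
    """
    n = len(routes)
    rate = [0.0] * n
    frozen = [False] * n
    residual = list(capacity)
    while not all(frozen):
        # Count unfrozen sources on each link.
        load = [0] * len(capacity)
        for s in range(n):
            if not frozen[s]:
                for l in routes[s]:
                    load[l] += 1
        # Largest equal increment that every loaded link can still carry.
        step = min(residual[l] / load[l]
                   for l in range(len(capacity)) if load[l] > 0)
        for s in range(n):
            if not frozen[s]:
                rate[s] += step
                for l in routes[s]:
                    residual[l] -= step
        # Freeze every source that crosses a saturated link.
        for s in range(n):
            if not frozen[s] and any(residual[l] <= 1e-12 for l in routes[s]):
                frozen[s] = True
    return rate
```

For example, `max_min_fair([[0, 1], [0], [1]], [1.0, 2.0])` freezes the first two sources when link 0 fills and then lets the third source take the remaining capacity of link 1.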
Minimum potential delay fairness
Rates {x_r} are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
Interpretation: if w_r is the file size, then w_r / x_r is the transfer time; the optimization problem is to minimize the sum of transfer delays
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s for some s, then there exists p such that x_p ≤ x_s and y_p < x_p
What is the corresponding utility function?
$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
Solving the network problem
Results so far: existence - solution exists with given properties
How to compute solution?
  Ideally: distributed solution easily embodied in protocol
  Should reveal insight into existing protocol
Solving the network problem
$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$
  left-hand side: change in bandwidth allocation at s
  k w_s term: linear increase
  k x_s(t) Σ p_l(t) term: multiplicative decrease
where
$$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$
  p_l(t): congestion "signal", a function of the aggregate rate at link l, fed back to s
Solving the network problem
$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$
Results
  converges to solution of relaxation of network problem
  $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
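The convergence claim can be checked numerically with a forward-Euler integration of the rate equation. This is a sketch under assumptions: the price function p_l = (y_l / c_l)^beta is a hypothetical choice of g_l, and the gain k, step dt, and starting rates are arbitrary illustrative values.

```python
def simulate_primal(routes, capacity, w, k=0.1, dt=0.01, steps=20000, beta=4):
    """Euler integration of dx_s/dt = k * (w_s - x_s * sum_{l in L(s)} p_l).

    The price function p_l = (y_l / c_l) ** beta is an assumed choice of g_l,
    where y_l is the aggregate rate on link l.
    """
    x = [0.1] * len(routes)
    q = [0.0] * len(routes)
    for _ in range(steps):
        # Aggregate rate on each link.
        y = [0.0] * len(capacity)
        for s, links in enumerate(routes):
            for l in links:
                y[l] += x[s]
        p = [(y[l] / capacity[l]) ** beta for l in range(len(capacity))]
        q = [sum(p[l] for l in links) for links in routes]   # path prices
        x = [x[s] + dt * k * (w[s] - x[s] * q[s]) for s in range(len(routes))]
    return x, q

# Source 0 crosses both links; sources 1 and 2 use one link each.
x, q = simulate_primal([[0, 1], [0], [1]], [1.0, 1.0], [1.0, 1.0, 1.0])
```

At the end of the run, x[s] * q[s] is close to w[s] for every source, and the two-link source ends up with the smallest rate because it pays the price of both links.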
Source Algorithm
$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$
Source needs only its path price q_r
k_r(·): nonnegative, nondecreasing function
Above algorithm converges to unique solution for any initial condition
q_r interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can network choose pricing scheme to achieve fair resource allocation?
Suppose network charges price q_r ($/bit), where q_r = Σ p_l
User's strategy: spend w_r ($/sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$
then
$$U(x) = -\frac{1}{T x}$$
interpretation: minimize (weighted) delay
Is AIMD special?
Consider a window control as follows:
  cwnd += a * cwnd^n when no loss
  cwnd -= b * cwnd^m when loss
  where n < m
Expected change in congestion window
Expected change in rate per unit time
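The generalized window rule can be sketched as below. The expected-change expression assumes each ack independently signals a loss with probability p, an assumption made here for illustration; the function names and the AIMD parameter values are likewise hypothetical.

```python
def update_cwnd(cwnd, loss, a, b, n, m):
    """Apply the rule once: += a*cwnd**n on no loss, -= b*cwnd**m on loss."""
    if loss:
        return cwnd - b * cwnd ** m
    return cwnd + a * cwnd ** n

def expected_change(cwnd, p, a, b, n, m):
    """Mean window change per ack if each ack signals loss with probability p."""
    return (1 - p) * a * cwnd ** n - p * b * cwnd ** m

def equilibrium_cwnd(p, a, b, n, m):
    """Window at which the expected change is zero (requires n < m)."""
    return (a * (1 - p) / (b * p)) ** (1.0 / (m - n))

# AIMD as a special case (assumed parameters): n = -1, m = 1, a = 1, b = 0.5.
w_star = equilibrium_cwnd(0.01, 1.0, 0.5, -1, 1)
```

With n = -1 and m = 1 the per-ack increase is a / cwnd, which adds up to about a per round trip, and a loss removes the fraction b of the window.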
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n
For stability
Motivation
Congestion Control: maximize user utility
  Given routing R_li, how to adapt end rate x_i?
Traffic Engineering: minimize network congestion
  Given traffic x_i, how to perform routing R_li?
Congestion Control Model
$$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l, \quad \text{var. } x$$
objective: aggregate utility
Source rate: x_i
Utility: U_i(x_i)
constraints: capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
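Traffic Engineering Model
$$\min \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l, \quad \text{var. } R$$
Link utilization: u_l
Cost: f(u_l)
objective: aggregate cost
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
[Figure: cost f(u_l) vs. link utilization u_l, with u_l = 1 marked]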
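The link utilization and aggregate cost in the model can be evaluated directly. In this sketch the cost f = exp is only an assumed convex, increasing function (the slides do not specify f), and the routing matrix and rates are made-up numbers.

```python
import math

def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [sum(R_l[i] * x[i] for i in range(len(x))) / c_l
            for R_l, c_l in zip(R, c)]

def aggregate_cost(u, f=math.exp):
    """sum_l f(u_l); f = exp is an assumed convex, increasing cost."""
    return sum(f(u_l) for u_l in u)

# Two links, three users; R[l][i] is the fraction of user i's traffic on link l.
R = [[1.0, 1.0, 0.0],
     [1.0, 0.0, 1.0]]
u = link_utilization(R, x=[0.2, 0.3, 0.6], c=[1.0, 1.0])
cost = aggregate_cost(u)
```

Traffic engineering would then search over R to lower this cost for fixed x, while congestion control adjusts x for fixed R.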
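Model of Internet Reality
Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$
[Figure: the two problems coupled in a loop through x_i and R_li]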
System Properties
Convergence
Does it achieve some objective?
Benchmark
  Utility gap between the joint system and benchmark
  $\max \sum_i U_i(x_i)$ s.t. $Rx \le c$, var. $x, R$
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
TCP's long recovery delay
More than an hour to recover from a loss or timeout
[Figure: window recovery after a loss: ~41000 packets; ~60000 RTTs, ~100 minutes]
High-speed TCP
Proposals: Scalable TCP, HSTCP, FAST, CUBIC
General idea is to use superlinear window increase
Particularly useful in high bandwidth-delay product regimes
Alternate choices of response functions
Scalable TCP: S = 0.15/p
Q: Whatever happened to TCP-friendly?
High speed TCP [Floyd]
  additive increase, multiplicative decrease
  increments, decrements depend on window size
Scalable TCP (STCP) [T. Kelly]
  multiplicative increase, multiplicative decrease
  W ← W + a per ACK
  W ← W - b W per window with loss
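The STCP rules above can be sketched in a few lines. The default values a = 0.01 and b = 0.125 are assumptions chosen for illustration, as are the function names.

```python
def stcp_on_ack(w, a=0.01):
    """Per-ACK increase: W <- W + a (a = 0.01 is an assumed default)."""
    return w + a

def stcp_on_loss(w, b=0.125):
    """Per-loss-window decrease: W <- W - b * W (b = 0.125 is assumed)."""
    return w - b * w

def grow_one_rtt(w, a=0.01):
    """One loss-free RTT: roughly W acks arrive, so W grows by about a * W."""
    acks = int(w)
    for _ in range(acks):
        w = stcp_on_ack(w, a)
    return w
```

Because about W acks arrive per round trip, a constant per-ACK increment yields growth proportional to W per RTT, which is why the scheme counts as multiplicative increase.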
STCP dynamics
From 1st PFLDnet Workshop, Tom Kelly
Active Queue Management
Router Queue Management
normally, packets dropped only when queue overflows: "drop-tail" queueing
[Figure: packets P1-P6 in a FCFS queue ahead of the scheduler at a router, between a neighboring router and the Internet]
The case against drop-tail queue management
Large queues in routers are "a bad thing"
  Delay: end-to-end latency dominated by length of queues at switches in network
Allowing queues to overflow is "a bad thing"
  Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  Utilization: connections can synchronize their response to congestion
[Figure: FCFS queue and scheduler with packets P1-P6]
Idea: early random packet drop
When queue length exceeds threshold, drop packets with queue-length-dependent probability
  probabilistic packet drop: flows see same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
[Figure: FCFS queue and scheduler with packets P1-P6]
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop
  avoid overly penalizing short-term bursts
  react to longer-term trends
Tie drop prob. to weighted avg. queue length
  avoids over-reaction to mild overload conditions
[Figure: average queue length vs. time, up to the max queue length; the min and max thresholds separate three regions: no drop, probabilistic early drop, forced drop]
Random early detection (RED) packet drop
[Figure: average queue length vs. time, up to the max queue length; the min and max thresholds separate three regions: no drop, probabilistic early drop, forced drop]
[Figure: drop probability vs. weighted average queue length: zero below min, rising to max_p at max, then 100%]
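The drop-probability curve can be sketched as follows. The linear ramp between the two thresholds and the averaging weight of 0.002 are assumptions made for illustration (the figure only fixes the endpoints); all function and parameter names are hypothetical.

```python
import random

def update_average(avg, queue_len, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg + weight * queue_len

def drop_probability(avg, min_th, max_th, max_p):
    """0 below min_th, linear ramp up to max_p at max_th, then forced drop."""
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)

def should_drop(avg, min_th, max_th, max_p, rng=random):
    """Randomized drop decision for one arriving packet."""
    return rng.random() < drop_probability(avg, min_th, max_th, max_p)
```

A small averaging weight makes the average slow to move, so a short burst barely shifts it while sustained load pushes it past the thresholds.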
RED summary: why random drop?
Provide gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed …
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
Source: Floyd, Jacobson 1994
Alternate choices of response functions
Scalable TCP - S = 015p
Q Whatever happened to TCP-friendly
High speed TCP [Floyd]
  additive increase, multiplicative decrease
  increments, decrements depend on window size
Scalable TCP (STCP) [T Kelly]
  multiplicative increase, multiplicative decrease
  W ← W + a per ACK; W ← W − bW per window with loss
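A small sketch of the STCP update rules may help show why a constant per-ACK increment is multiplicative per round trip. The defaults a = 0.01 and b = 0.125 are the values commonly quoted for Scalable TCP; the slides do not state them, so treat them, and the function names, as assumptions.

```python
def stcp_on_ack(w, a=0.01):
    """Per-ACK increase: W <- W + a."""
    return w + a


def stcp_on_loss(w, b=0.125):
    """Per-loss-window decrease: W <- W - b*W."""
    return w - b * w


def loss_free_round(w, a=0.01):
    """Apply the per-ACK rule once for each of the (roughly) W ACKs in one RTT."""
    for _ in range(int(w)):
        w = stcp_on_ack(w, a)
    return w
```

Because about W ACKs arrive per round trip, a loss-free round grows the window by roughly a * W, that is by a factor of (1 + a), regardless of how large W already is.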
STCP dynamics
From 1st PFLDnet Workshop, Tom Kelly
Active Queue Management
Router Queue Management
normally, packets dropped only when queue overflows: "drop-tail" queueing
[Figure: router connected to the Internet; packets P1 to P6 wait in a FCFS queue ahead of the scheduler]
The case against drop-tail queue management
Large queues in routers are "a bad thing"
  Delay: end-to-end latency dominated by length of queues at switches in network
Allowing queues to overflow is "a bad thing"
  Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  Utilization: connections can synchronize their response to congestion
[Figure: FCFS queue holding packets P1 to P4 ahead of the scheduler, with P5 and P6 arriving]
Idea early random packet drop
When queue length exceeds threshold, drop packets with queue-length-dependent probability
  probabilistic packet drop: flows see same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
[Figure: FCFS queue holding packets P1 to P6 ahead of the scheduler]
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop
  avoid overly penalizing short-term bursts
  react to longer-term trends
Tie drop prob. to weighted avg. queue length
  avoids over-reaction to mild overload conditions
[Figure: average queue length over time against the min and max thresholds: no drop below the min threshold, probabilistic early drop between the thresholds, forced drop above the max threshold]
[Figure: drop probability versus weighted average queue length: zero below min, rising to maxp at max, 100% above max]
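The RED rule described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the ramp between the thresholds is taken to be linear (the usual textbook form), the EWMA gain of 0.002 is a hypothetical default, and refinements of real RED implementations such as spacing drops by packet count are omitted. The class and method names are invented for the example.

```python
import random


class RedQueue:
    """Simplified RED: EWMA of queue length plus a piecewise drop probability."""

    def __init__(self, min_th, max_th, max_p, weight=0.002):
        self.min_th = min_th
        self.max_th = max_th
        self.max_p = max_p
        self.weight = weight  # EWMA gain (hypothetical default)
        self.avg = 0.0

    def update_average(self, queue_len):
        """Exponentially weighted moving average of the instantaneous queue."""
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0  # no drop
        if self.avg >= self.max_th:
            return 1.0  # forced drop
        # probabilistic early drop: linear ramp from 0 to max_p
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def should_drop(self, queue_len, rng=random):
        """Update the average with the current queue length, then flip a coin."""
        self.update_average(queue_len)
        return rng.random() < self.drop_probability()
```

Because the decision uses the averaged queue rather than the instantaneous one, a short burst moves the drop probability only slightly, which is the behaviour the slide asks for.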
RED summary: why random drop?
Provide gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
timeout, or link fail update: prepare own routing update (time $T_C$)
  receive update from neighbor: process (time $T_{C2}$)
<ready>: send update (time $T_d$ to arrive at dest), start_timer (uniform: $T_p \pm T_r$)
wait
  receive update from neighbor: process
time spent in state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update sent, relative to start of round
By t = 100000, all router rounds are of length 120
synchronization, or lack thereof, depends on system parameters
Avoiding synchronization
Choose random timer component $T_r$ large (e.g., several multiples of $T_C$)
[Figure: router update state diagram as before; after <ready>, send update (time $T_d$ to arrive), start_timer (uniform: $T_p \pm T_r$)]
Add enough randomization to avoid synchronization
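The timer rule "uniform: T_p ± T_r" is simple to write down. The sketch below is illustrative only: the function names and the default multiple of 10 are assumptions, chosen to reflect the advice that T_r should be several multiples of T_C.

```python
import random


def next_update_delay(t_p, t_r, rng=random):
    """Draw the next routing-update timer uniformly from [t_p - t_r, t_p + t_r]."""
    return rng.uniform(t_p - t_r, t_p + t_r)


def pick_t_r(t_c, multiple=10):
    """Hypothetical rule of thumb: make T_r several multiples of T_C."""
    return multiple * t_c
```

A router would call next_update_delay each time it reaches the ready state, so that the small processing delays caused by neighbors' updates are swamped by the random component instead of pulling the routers into step.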
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport TCP Nice
What are background transfers?
  Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
Examples
  Prefetched traffic on the Web
  File system backup
  Large-scale data distribution services
  Background software updates
  Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
  Self-interference: applications hurt their own performance
  Cross-interference: applications hurt other applications' performance
TCP Nice
Goal: abstraction of free infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP?
TCP does network resource management
  Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: Can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
  Uses increasing RTT as congestion signal
  Congestion → incr. queue lengths → incr. RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm
  1. Early Detection: thresh queue length → incr in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)
per-ack operation: if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++;
per-round operation: if (numCong > f * W) W = W / 2; else ... AIMD congestion control
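The per-ack and per-round operations above can be sketched as a small class. This is an illustration under assumptions, not the Nice implementation: the defaults threshold = 0.2 and fraction = 0.5 are hypothetical, the class name is invented, and the AIMD branch is reduced to a plain additive increase with loss handling left out.

```python
class NiceWindow:
    """Sketch of Nice's early-congestion detection on top of a window W."""

    def __init__(self, w=2.0, threshold=0.2, fraction=0.5):
        self.w = w
        self.threshold = threshold  # hypothetical value
        self.fraction = fraction    # hypothetical value (f in the slide)
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ACK: count ACKs whose RTT signals a queue build-up."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round: halve W on early congestion, else additive increase."""
        if self.num_cong > self.fraction * self.w:
            self.w = self.w / 2  # may drop below 1.0
        else:
            self.w += 1.0  # stand-in for the AIMD branch; loss handling omitted
        self.num_cong = 0
        return self.w
```

Note that nothing clamps the window at 1.0, which is what lets many background flows keep backing off when foreground traffic needs the link.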
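Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = τ, maxRTT = τ + B/μ
[Figure: bottleneck queue (pkts) with buffer B, threshold t·B, and service rate μ; queue occupancy under Reno and under Nice through additive-increase (Add) and multiplicative-decrease (Mul) phases]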
Network Conditions
[Figure: foreground document latency (sec, log scale) versus spare capacity, for Reno, Vegas, V0, Nice, and Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec, log scale) versus number of background flows, for Vegas, V0, Nice, Router Prio, and Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: background throughput (KB) versus number of background flows, for Router Prio, Vegas, V0, Reno, and Nice]
Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
  Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function
Maybe not possible to obtain exact optimality, but...
  optimization framework as means to explicitly steer network towards desirable operating point
  practical congestion control as distributed asynchronous implementations of optimization algorithm
  systematic approach towards protocol design
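Model
Network: links l, each of capacity $c_l$
Sources s: $(L(s), U_s(x_s))$
  $L(s)$: links used by source s
  $U_s(x_s)$: utility if source rate = $x_s$
[Figure: two links in series with capacities $c_1$ and $c_2$; source 1 crosses both links, source 2 uses the first link, source 3 uses the second]
$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$
[Figure: example utility function $U_s(x_s)$ versus $x_s$ for an elastic application]
Q: What are possible allocations with, say, unit capacity links?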
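As a worked illustration of the question, three candidate allocations for unit-capacity links are listed below. The choice of these three objectives is an assumption made for the example; the fairness notions behind the second and third are the ones defined later in the lecture.

```latex
% Unit capacities: c_1 = c_2 = 1, so x_1 + x_2 \le 1 and x_1 + x_3 \le 1.
\begin{align*}
\text{maximize } \textstyle\sum_s x_s:\quad
  & (x_1, x_2, x_3) = (0, 1, 1), & \textstyle\sum_s x_s &= 2 \\
\text{equal rates:}\quad
  & (x_1, x_2, x_3) = (\tfrac12, \tfrac12, \tfrac12), & \textstyle\sum_s x_s &= \tfrac32 \\
\text{maximize } \textstyle\sum_s \log x_s:\quad
  & (x_1, x_2, x_3) = (\tfrac13, \tfrac23, \tfrac23), & \textstyle\sum_s x_s &= \tfrac53
\end{align*}
% Third row: both links are full, so x_2 = x_3 = 1 - x_1 and we maximize
% \log x_1 + 2\log(1 - x_1); setting the derivative to zero gives
% 1/x_1 = 2/(1 - x_1), hence x_1 = 1/3.
```

The three rows trade total throughput against the rate given to the two-link source, which is the tension the utility framework is meant to make explicit.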
Optimization Problem
$\max_{x_s \ge 0} \sum_s U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l,\ \forall l \in L$ ("system" problem)
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
The user view
User can choose amount to pay per unit time, $w_s$
Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$
$p_s$ could be viewed as charge per unit flow for user s
$\max\ U_s(w_s / p_s) - w_s$ subject to $w_s \ge 0$ (user problem)
  $U_s(w_s / p_s)$: user's utility; $w_s$: cost
The network view
Suppose network knows vector $\{w_s\}$ chosen by users
Network wants to maximize logarithmic utility function
$\max_{x_s \ge 0} \sum_s w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$ (network problem)
Solution existence
There exist prices $p_s$, source rates $x_s$, and amount-to-pay-per-unit-time $w_s = p_s x_s$ such that
  $\{w_s\}$ solves the user problem
  $\{x_s\}$ solves the network problem
  $\{x_s\}$ is the unique solution to the system problem
User problem: $\max\ U_s(w_s / p_s) - w_s$ subject to $w_s \ge 0$
Network problem: $\max_{x_s \ge 0} \sum_s w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
System problem: $\max_{x_s \ge 0} \sum_s U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l,\ \forall l \in L$
Proportional Fairness
Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$:
$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$
Result: if $w_s = 1$, then $\{x_s\}$ solves the network problem if and only if it is proportionally fair
Similar result exists for the case that $w_s$ is not equal to 1
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
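One standard way to compute a max-min fair allocation is progressive filling: raise all rates together and freeze a source as soon as one of its links saturates. The sketch below is an illustration of that technique, which the slides do not describe; the function name and the representation of routes as lists of link indices are assumptions, and every source is assumed to use at least one link.

```python
def max_min_fair(routes, capacity, eps=1e-12):
    """Progressive filling. routes[s] lists the link indices used by source s."""
    n_links = len(capacity)
    rates = [0.0] * len(routes)
    remaining = list(capacity)
    active = set(range(len(routes)))
    while active:
        # number of still-growing sources on each link
        counts = [sum(1 for s in active if l in routes[s]) for l in range(n_links)]
        # largest equal increment before some link saturates
        inc = min(remaining[l] / counts[l] for l in range(n_links) if counts[l] > 0)
        for s in active:
            rates[s] += inc
        for l in range(n_links):
            remaining[l] -= inc * counts[l]
        saturated = {l for l in range(n_links)
                     if counts[l] > 0 and remaining[l] <= eps}
        # freeze every source that crosses a saturated link
        active = {s for s in active if not saturated.intersection(routes[s])}
    return rates
```

For the three-source, two-link example with unit capacities, max_min_fair([[0, 1], [0], [1]], [1.0, 1.0]) gives every source rate 0.5: no source can be increased without decreasing one that is already no larger.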
Minimum potential delay fairness
Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; optimization problem is to minimize sum of transfer delays
Max-min Fairness
What is the corresponding utility function?
$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
Solving the network problem
Results so far: existence, i.e., a solution exists with given properties
How to compute solution?
  Ideally: distributed solution easily embodied in protocol
  Should reveal insight into existing protocol
Solving the network problem
$\frac{d}{dt} x_s(t) = k \Bigl( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \Bigr)$
  left-hand side: change in bandwidth allocation at s
  $k\, w_s$: linear increase
  $k\, x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease
where $p_l(t) = g_l\Bigl( \sum_{s:\, l \in L(s)} x_s(t) \Bigr)$
  congestion "signal": function of aggregate rate at link l, fed back to s
Solving the network problem
$\frac{d}{dt} x_s(t) = k \Bigl( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \Bigr)$
Results
  converges to solution of "relaxation" of network problem
  $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
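The convergence claim can be tried out with a forward-Euler integration of the rate equation on the three-source, two-link example. This is a rough sketch under assumptions: the congestion signal g_l(y) = (y / c_l)^beta, the gain k, the step size, and the function name are all hypothetical choices made for illustration, and the slides do not specify g_l.

```python
def kelly_primal(routes, capacity, w, k=0.1, dt=0.01, steps=20000, beta=8.0):
    """Euler steps of dx_s/dt = k * (w_s - x_s * sum of p_l over l in L(s))."""
    n_links = len(capacity)
    x = [0.1] * len(routes)
    p = [0.0] * n_links
    for _ in range(steps):
        # aggregate rate on each link
        y = [sum(x[s] for s, links in enumerate(routes) if l in links)
             for l in range(n_links)]
        # hypothetical congestion signal g_l(y) = (y / c_l) ** beta
        p = [(y[l] / capacity[l]) ** beta for l in range(n_links)]
        x = [x[s] + dt * k * (w[s] - x[s] * sum(p[l] for l in routes[s]))
             for s in range(len(routes))]
    return x, p


routes = [[0, 1], [0], [1]]
x, p = kelly_primal(routes, capacity=[1.0, 1.0], w=[1.0, 1.0, 1.0])
```

At the end of the run, x_s times its path price is close to w_s for every source, and the two-link source settles at about half the rate of the single-link sources, close to the proportionally fair split. Each source only needs the sum of prices along its own path, which is what makes the scheme distributed.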
Source Algorithm
Source needs only its path price
$\dot{x}_r = k_r(x_r)\,\bigl(U_r'(x_r) - q_r\bigr)$
$k_r(\cdot)$: nonnegative, nondecreasing function
Above algorithm converges to the unique solution for any initial condition
$q_r$: interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can network choose pricing scheme to achieve fair resource allocation
Suppose network charges price qr ($bit) where qr=sum pl
Userrsquos strategy spend wr ($sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
then
interpretation minimize (weighted) delay
pTpTp
x 2)1(2asymp
minus=
TxxU 1)( minus=
Is AIMD special
Consider a window control as follows cwnd += acwnd^n when no loss cwnd -= bcwnd^m when loss where nltm
Expected change in congestion window
Expected change in rate per unit time
MIMD (nm)
Consider the controller
where
Then at equilibrium
Where α = m-n For stability
Motivation
Congestion Control maximize user
utility
Traffic Engineering minimize network
congestion Given routing Rli how to adapt end rate xi
Given traffic xi how to perform routing Rli
Congestion Control Model
max sum i Ui(xi) st sumi Rlixi le cl var x
aggregate utility
Source rate xi
Utility Ui(xi)
capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
min suml f(ul) st ul =sumi Rlixicl var R Link Utilization ul
Cost f(ul)
aggregate cost
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
ul = 1
Model of Internet Reality
xi Rli
Congestion Control max sumi Ui(xi) st sumi Rlixi le cl
Traffic Engineering min suml f(ul)
st ul =sumi Rlixicl
System Properties
Convergence Does it achieve some objective Benchmark
Utility gap between the joint system and benchmark
max sumi Ui(xi) st Rx le c Var x R
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Scalable TCP (STCP) [T Kelly]
multiplicative increase multiplicative decrease
W larr W + a per ACK W larr W ndash b W per window with loss
STCP dynamics
From 1st PFLDnet Workshop Tom Kelly13
Active Queue Management
Router Queue Management
normally packets dropped only when queue overflows ldquodrop-tailrdquo queueing
router Internet
P113P213P313P413P513P613FCFS13
Scheduler13
router
The case against drop-tail queue management
Large queues in routers are ldquoa bad thingrdquo Delay end-to-end latency dominated by length
of queues at switches in network Allowing queues to overflow is ldquoa bad thingrdquo
Fairness connections transmitting at high rates can starve connections transmitting at low rates
Utilization connections can synchronize their response to congestion
P113P213P313P413FCFS
Scheduler P513P613
Idea early random packet drop
When queue length exceeds threshold drop packets with queue length dependent probability probabilistic packet drop flows see same loss
rate problem bursty traffic (burst arrives when
queue is near threshold) can be over penalized
P113P213P313P413P513P613FCFS
Scheduler
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts react to longer term trends
Tie drop prob to weighted avg queue length avoids over-reaction to mild overload conditions
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
Random early detection (RED) packet drop
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
10013
Drop probability
maxp13
Weighted AverageQueue Length
min13 max13
RED summary why random drop
Provide gentle transition from no-drop to all-drop Provide ldquogentlerdquo early warning Avoid synchronized loss bursts among
sources Provide same loss rate to all sessions
With tail-drop low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed hellip
Why randomization important
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source Floyd Jacobson 1994
Router update operation
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail
update
time spent in state depends on msgs
received from others (weak coupling
between routers processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization Choose random
timer component Tr large (eg several multiples of TC)
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough
randomization to avoid
synchronization
Randomization
Takeaway message randomization makes a system simple and
robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP
TCP does network resource management Need flow prioritization
Alternative router prioritization + More responsive simple one bit priority Hard to deploy
Question Can end-to-end congestion control achieve non-
interference and utilization
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal Congestion incr queue lengths incr RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control Receiver and network unchanged TCP friendly
TCP Nice
Basic algorithm 1 Early Detection thresh queue length incr in RTT 2 Multiplicative decrease on early congestion 3 Allow cwnd lt 10 (despite no loss)
per-ack operation if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++
per-round operation if(numCong gt fW) W W2 else hellip AIMD congestion control
Nice the works
Non-interference getting out of the way in time Utilization maintaining a small queue
pkts
minRTT = τ13 maxRTT = τ+Βmicro13
B
tB Add Mul +
micro
Reno
Nice Add Add Add
Mul +
Mul +
Network Conditions
01
1
10
100
1e3
1 10 100 Fore
grou
nd D
ocum
ent L
aten
cy (s
ec)
Spare Capacity
Reno
Vegas
V0
Nice
Router Prio
Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity
Scalability
01
1
10
100
1e3
1 10 100
Doc
umen
t Lat
ency
(sec
)
Num BG flows
Vegas
V0
Nice
Router Prio
Reno
W lt 1 allows Nice to scale to any number of background flows
Utilization
0
2e4
4e4
6e4
8e4
1 10 100
BG
Thr
ough
put (
KB
)
Num BG flows
Router Prio
Vegas
V0
Reno
Nice
Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation
How to model the interaction between TCP and the network Recall PFTK like models assumed network
conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem How to allocate resources (eg bandwidth) to
optimize some objective function Maybe not possible to obtain exact optimality but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
c1 c2
Model Network Links l each of capacity cl Sources s (L(s) Us(xs))
L(s) - links used by source s Us(xs) - utility if source rate = xs
x1
x2 x3
121 cxx le+ 231 cxx le+
Us(xs)
xs
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose network knows vector ws chosen by users Network wants to maximize logarithmic utility function
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
network problem
Solution existence
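There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
  $\{w_s\}$ solves the user problem
  $\{x_s\}$ solves the network problem
  $\{x_s\}$ is the unique solution to the system problem
Network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
User problem:
$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
System problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$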
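A brief sketch of why the three problems can share one solution, assuming each $U_s$ is differentiable and strictly concave and that the optimum has $x_s > 0$. The multipliers $\mu_l$ are notation introduced here for the link constraints; they do not appear on the slides.

```latex
% System problem: Lagrangian with multipliers \mu_l \ge 0 on the link constraints
\mathcal{L}(x,\mu) = \sum_{s} U_s(x_s) - \sum_{l \in L} \mu_l \Big( \sum_{s \in S(l)} x_s - c_l \Big)

% Stationarity in x_s defines the path price p_s
U_s'(x_s) = \sum_{l \in L(s)} \mu_l =: p_s

% User problem: stationarity in w_s, using x_s = w_s / p_s
\frac{1}{p_s}\, U_s'\!\left(\frac{w_s}{p_s}\right) = 1
\iff U_s'(x_s) = p_s

% Network problem: stationarity in x_s with the same multipliers
\frac{w_s}{x_s} = \sum_{l \in L(s)} \mu_l = p_s
\iff w_s = p_s x_s
```

Under these assumptions the same rates, prices, and payments satisfy the optimality conditions of all three problems at once, which is the content of the existence statement.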
Proportional Fairness
Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,
$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$
Result: if $w_r = 1$, then $\{x_s\}$ solves the network problem if and only if it is proportionally fair
A similar result exists for the case that $w_r \ne 1$
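The definition can be spot-checked numerically. The sketch below assumes the two-link, three-source example with unit capacities (x1 + x2 <= 1, x1 + x3 <= 1) and tests the candidate (1/3, 2/3, 2/3) against randomly drawn feasible vectors. The sampling scheme is an illustrative choice, and a random search supports the claim without proving it.

```python
import random

def pf_sum(x, y):
    """Sum over sources of (y_s - x_s) / x_s for candidate x and alternative y."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))

def random_feasible(rng):
    # Draw x1 first, then x2 and x3 within what each unit-capacity link has left.
    x1 = rng.uniform(0.0, 1.0)
    return (x1, rng.uniform(0.0, 1.0 - x1), rng.uniform(0.0, 1.0 - x1))

rng = random.Random(0)
x_pf = (1 / 3, 2 / 3, 2 / 3)
worst = max(pf_sum(x_pf, random_feasible(rng)) for _ in range(10_000))
print("largest aggregate proportional change found:", worst)

# Equal rates fail the test: moving to x_pf gives a positive aggregate change.
print("equal rates vs x_pf:", pf_sum((0.5, 0.5, 0.5), x_pf))
```

For this candidate the sum equals 1.5 (y1 + y2) + 1.5 (y1 + y3) - 3, which cannot exceed zero when both link constraints hold.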
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$
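One common way to compute a max-min fair allocation is progressive filling: raise all rates together and freeze every source that crosses a link once that link saturates. The slides do not name this procedure. The function and argument names are hypothetical, and the sketch assumes every source uses at least one link.

```python
def max_min_rates(routes, capacity):
    """routes[s] = set of links used by source s; capacity[l] = capacity of link l."""
    rate = {s: 0.0 for s in routes}
    residual = dict(capacity)
    active = set(routes)
    while active:
        # Number of still-growing sources on each link.
        load = {l: sum(1 for s in active if l in routes[s]) for l in residual}
        # Largest equal increment before some link saturates.
        step = min(residual[l] / n for l, n in load.items() if n > 0)
        for s in active:
            rate[s] += step
        for l, n in load.items():
            residual[l] -= step * n
        saturated = {l for l, n in load.items() if n > 0 and residual[l] <= 1e-12}
        # Sources crossing a saturated link are frozen at their current rate.
        active = {s for s in active if not (routes[s] & saturated)}
    return rate

routes = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}}
print(max_min_rates(routes, {"a": 1.0, "b": 1.0}))
print(max_min_rates(routes, {"a": 1.0, "b": 2.0}))
```

With unit capacities every source ends at rate 0.5. When the second link has capacity 2, source 3 keeps growing after link a saturates and ends at 1.5.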
Minimum potential delay fairness
Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; the optimization problem is to minimize the sum of transfer delays
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$
What is the corresponding utility function?
$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
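To see the limit at work, the sketch below uses the finite-alpha utility x^(1-alpha)/(1-alpha) on the two-link example with unit capacities (x1 + x2 <= 1, x1 + x3 <= 1). This is an assumption-laden illustration: both links are taken to be fully used, so x2 = x3 = 1 - x1, and setting the derivative of the utility sum to zero gives x1^(-alpha) = 2 (1 - x1)^(-alpha), that is x1 = 1 / (1 + 2^(1/alpha)). The function name is hypothetical.

```python
def alpha_fair_example(alpha):
    """Rates (x1, x2, x3) maximizing the alpha-fair utility sum on the two-link example."""
    x1 = 1.0 / (1.0 + 2.0 ** (1.0 / alpha))
    return (x1, 1.0 - x1, 1.0 - x1)

for alpha in (1, 2, 10, 100):
    print(alpha, [round(v, 4) for v in alpha_fair_example(alpha)])
```

At alpha = 1 the closed form gives x1 = 1/3, the allocation for logarithmic utility. At alpha = 2 it matches the minimum potential delay utility with unit weights. As alpha grows, x1 approaches 1/2, the max-min fair rate.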
Solving the network problem
Results so far: existence (a solution exists with the given properties)
How to compute the solution?
  Ideally: distributed solution, easily embodied in protocol
  Should reveal insight into existing protocol
Solving the network problem
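$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$
  left-hand side: change in bandwidth allocation at $s$
  $k\,w_s$: linear increase
  $-k\,x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease
where
$$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$
  $p_l(t)$: congestion "signal", a function of the aggregate rate at link $l$, fed back to $s$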
Solving the network problem
$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$
Results:
  converges to the solution of a relaxation of the network problem
  $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
Interpretation: TCP-like algorithm that iteratively solves the optimal rate allocation
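A minimal numerical sketch of these dynamics, using forward-Euler steps on the two-link, three-source example. The price function (load/capacity)^beta, the gain k, the step size, and the starting rates are all assumptions made for illustration. The slides only require the price to be some function g_l of the aggregate link rate.

```python
def simulate_primal(routes, capacity, w, k=0.1, dt=0.01, steps=20_000, beta=8.0):
    """Euler integration of dx_s/dt = k * (w_s - x_s * sum of link prices on s's path)."""
    x = {s: 0.1 for s in routes}
    p = {l: 0.0 for l in capacity}
    for _ in range(steps):
        # Aggregate rate on each link.
        y = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
        # Hypothetical price function g_l: grows steeply as load nears capacity.
        p = {l: (y[l] / capacity[l]) ** beta for l in capacity}
        for s in routes:
            q = sum(p[l] for l in routes[s])  # path price seen by source s
            x[s] += dt * k * (w[s] - x[s] * q)
    return x, p

routes = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}}
capacity = {"a": 1.0, "b": 1.0}
w = {1: 1.0, 2: 1.0, 3: 1.0}
x, p = simulate_primal(routes, capacity, w)
for s in routes:
    q = sum(p[l] for l in routes[s])
    print(s, round(x[s], 3), "rate * path price =", round(x[s] * q, 3))
```

At the fixed point each source's rate times its path price equals its w_s, so with equal weights the two-link source settles at half the rate of the single-link sources. Because the price function is smooth rather than a hard constraint, the link load can end slightly above capacity, which is the sense in which the result solves a relaxation.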
Source Algorithm
$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$
Source needs only its path price $q_r$
$k_r(\cdot)$: nonnegative, nondecreasing function
Above algorithm converges to the unique solution for any initial condition
$q_r$ interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price $q_r$ (\$/bit), where $q_r = \sum p_l$
User's strategy: spend $w_r$ (\$/sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$
then
$$U(x) = -\frac{1}{Tx}$$
interpretation: minimize (weighted) delay
Is AIMD special
Consider a window control as follows:
  cwnd += a * cwnd^n when no loss
  cwnd -= b * cwnd^m when loss, where n < m
Expected change in congestion window
Expected change in rate per unit time
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n. For stability
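The window rule can be explored numerically. The sketch below assumes each packet is lost independently with probability p, so the expected per-packet change is a * w^n * (1 - p) - b * w^m * p. This is a standard back-of-the-envelope model and not necessarily the expression on the slide, which did not survive in this transcript. Function names are hypothetical.

```python
def expected_window_change(w, p, a, b, n, m):
    """Expected per-packet window change under per-packet loss probability p."""
    return a * w ** n * (1.0 - p) - b * w ** m * p

def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero (requires n < m)."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))

# AIMD as a special case: +1/cwnd per ACK (n = -1), halve on loss (m = 1, b = 1/2).
p = 0.01
w_star = equilibrium_window(p, a=1.0, b=0.5, n=-1, m=1)
print("equilibrium window:", round(w_star, 2))
print("expected change there:", expected_window_change(w_star, p, 1.0, 0.5, -1, 1))
```

For the AIMD parameters the equilibrium window is sqrt(2 (1 - p) / p). Dividing by the round-trip time T gives the familiar inverse-square-root dependence of rate on loss probability. The exponent 1 / (m - n) is where the difference m - n enters.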
Motivation
Congestion control: maximize user utility
  Given routing $R_{li}$, how to adapt end rate $x_i$?
Traffic engineering: minimize network congestion
  Given traffic $x_i$, how to perform routing $R_{li}$?
Congestion Control Model
$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$
  objective: aggregate utility
  $x_i$: source rate
  $U_i(x_i)$: utility
  constraints: capacity constraints
Users are indexed by $i$
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
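$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l$$
  $u_l$: link utilization
  $f(u_l)$: cost
  objective: aggregate cost
Links are indexed by $l$
Traffic engineering avoids bottlenecks in the network
[Figure: cost $f(u_l)$ versus link utilization, with $u_l = 1$ marked]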
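A small sketch of the traffic engineering objective, assuming R[l][i] is the fraction of flow i routed over link l. The cost function 1 / (1 - u) is a hypothetical choice that grows without bound as utilization approaches 1. The slides do not specify f.

```python
def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [sum(r_li * x_i for r_li, x_i in zip(row, x)) / c_l
            for row, c_l in zip(R, c)]

def total_cost(u, f=lambda ul: 1.0 / (1.0 - ul)):
    # Hypothetical convex link cost; only valid for utilizations below 1.
    return sum(f(ul) for ul in u)

# One flow of rate 0.8 and two parallel unit-capacity links.
x, c = [0.8], [1.0, 1.0]
R_single = [[1.0], [0.0]]   # everything on the first link
R_split = [[0.5], [0.5]]    # traffic split evenly

for name, R in (("single path", R_single), ("even split", R_split)):
    u = link_utilization(R, x, c)
    print(name, "utilization:", u, "cost:", round(total_cost(u), 3))
```

The even split lowers the aggregate cost because no link runs close to full utilization, which is the sense in which traffic engineering avoids bottlenecks.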
Model of Internet Reality
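[Figure: congestion control and traffic engineering coupled through the rates $x_i$ and the routing $R_{li}$]
Congestion control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
Traffic engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$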
System Properties
Convergence
Does it achieve some objective?
Benchmark: utility gap between the joint system and the benchmark
$$\max_{x,\,R} \sum_i U_i(x_i) \quad \text{s.t.} \quad Rx \le c$$
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
STCP dynamics
From 1st PFLDnet Workshop, Tom Kelly
Active Queue Management
Router Queue Management
normally, packets dropped only when queue overflows: "drop-tail" queueing
[Figure: packets P1 to P6 waiting in a FCFS queue in front of the scheduler at a router attached to the Internet]
The case against drop-tail queue management
Large queues in routers are "a bad thing"
  Delay: end-to-end latency dominated by length of queues at switches in network
Allowing queues to overflow is "a bad thing"
  Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  Utilization: connections can synchronize their response to congestion
[Figure: FCFS queue holding P1 to P4 at the scheduler while arriving packets P5 and P6 find it full]
Idea early random packet drop
When queue length exceeds threshold, drop packets with queue-length-dependent probability
  probabilistic packet drop: flows see same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
[Figure: packets P1 to P6 in a FCFS queue at the scheduler]
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop
  avoid overly penalizing short-term bursts
  react to longer-term trends
Tie drop probability to weighted average queue length
  avoids over-reaction to mild overload conditions
[Figure: average queue length versus time against the min and max thresholds: no drop below the min threshold, probabilistic early drop between the thresholds, forced drop above the max threshold]
[Figure: drop probability versus weighted average queue length, with min and max marked on the queue-length axis and max_p and 100% marked on the probability axis]
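The drop decision described on these slides can be sketched as follows. The linear ramp between the thresholds follows the usual RED formulation. The class name, parameter names, and default values are illustrative assumptions, and refinements of real RED implementations (such as spacing drops by a packet count) are left out.

```python
import random

class RedQueue:
    """Drop decision only; parameter names and defaults are illustrative."""

    def __init__(self, min_th=5.0, max_th=15.0, max_p=0.1, weight=0.002, rng=None):
        self.min_th, self.max_th, self.max_p = min_th, max_th, max_p
        self.weight = weight          # gain of the exponential average
        self.avg = 0.0                # weighted average queue length
        self.rng = rng or random.Random()

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0                # no drop
        if self.avg >= self.max_th:
            return 1.0                # forced drop
        # Probabilistic early drop: linear ramp from 0 up to max_p.
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def on_arrival(self, queue_len):
        """Update the average with the instantaneous queue length; True means drop."""
        self.avg = (1.0 - self.weight) * self.avg + self.weight * queue_len
        return self.rng.random() < self.drop_probability()

q = RedQueue(rng=random.Random(0))
print("short burst dropped:", q.on_arrival(100), "average now:", round(q.avg, 3))
```

Because the decision uses the slowly moving average instead of the instantaneous length, a single short burst barely moves the average and is not penalized.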
RED summary why random drop
Provide gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
[Figure: state diagram of a router's periodic update process]
  timeout or link fail: prepare own routing update (time $T_C$)
  while preparing: receive update from neighbor, process (time $T_{C2}$)
  <ready>: send update (time $T_d$ to arrive at dest), then start_timer (uniform $T_p \pm T_r$)
  wait: receive update from neighbor, process
time spent in state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update sent, relative to start of round
By t = 100000, all router rounds are of length 120
Synchronization, or lack thereof, depends on system parameters
Avoiding synchronization
Choose random timer component $T_r$ large (e.g., several multiples of $T_C$)
[Figure: the router update state diagram, with the timer set by start_timer (uniform $T_p \pm T_r$)]
Add enough randomization to avoid synchronization
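The timer step can be sketched as below, assuming the delay is drawn uniformly from [Tp - Tr, Tp + Tr]. The function name and the numeric values are hypothetical and chosen only to show the spread that the random component introduces.

```python
import random

def next_update_delay(tp, tr, rng=random):
    """Delay until the next routing update: uniform in [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)

rng = random.Random(1)
# With tr = 0 every router waits exactly tp, so routers that fire together stay together.
print([next_update_delay(121.0, 0.0, rng) for _ in range(3)])
# A larger tr spreads the next firing times apart.
print([round(next_update_delay(121.0, 10.0, rng), 1) for _ in range(3)])
```

This shows only the jittered timer itself. Whether a given Tr is large enough to break up synchronization depends on the other system parameters, as the simulation slide notes.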
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport: TCP Nice
What are background transfers?
Data that humans are not waiting for
  Non-deadline-critical
  Unlimited demand
Examples:
  Prefetched traffic on the Web
  File system backup
  Large-scale data distribution services
  Background software updates
  Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
  Self-interference
    • applications hurt their own performance
  Cross-interference
    • applications hurt other applications' performance
TCP Nice
Goal: abstraction of free infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP
TCP does network resource management
  Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: Can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
  Uses increasing RTT as congestion signal
  Congestion leads to increased queue lengths, which lead to increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm:
  1. Early detection: threshold queue length, signalled by increase in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)
per-ack operation:
  if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
per-round operation:
  if (numCong > f * W) W = W / 2
  else ... AIMD congestion control
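The per-ack and per-round rules can be sketched as a small class. Several details are assumptions made for illustration: the values of threshold and f, tracking minRTT and maxRTT as running extremes, and an additive increase of one packet per round. Loss handling from the underlying AIMD control is omitted.

```python
class NiceWindow:
    """Sketch of the early-congestion rule; not a complete TCP implementation."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold    # position between minRTT and maxRTT
        self.fraction = fraction      # f: share of the window that must see delay
        self.cwnd = cwnd
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, rtt):
        # Track the RTT extremes seen so far (an assumption of this sketch).
        self.min_rtt = min(self.min_rtt, rtt)
        self.max_rtt = max(self.max_rtt, rtt)
        if rtt > self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt):
            self.num_cong += 1        # this ACK signals early congestion

    def on_round_end(self):
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0          # multiplicative decrease; may fall below 1
        else:
            self.cwnd += 1.0          # additive increase, as in AIMD
        self.num_cong = 0

w = NiceWindow(cwnd=1.0)
for rtt in (0.1, 0.3):
    w.on_ack(rtt)
w.on_round_end()
print("window after a congested round:", w.cwnd)
```

A window below one packet would in practice mean sending one packet every few round-trip times. This sketch only tracks the window value.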
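Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = $\tau$, maxRTT = $\tau + B/\mu$
[Figure: queue occupancy (pkts) versus time at a bottleneck with buffer B and service rate µ, comparing the additive-increase and multiplicative-decrease phases of Reno and Nice]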
Network Conditions
[Figure: foreground document latency (sec) versus spare capacity, log-log axes, for Reno, Vegas, V0, Nice, and Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec) versus number of background flows, log-log axes, for Vegas, V0, Nice, Router Prio, and Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: background throughput (KB) versus number of background flows for Router Prio, Vegas, V0, Reno, and Nice]
Nice utilizes 50-80% of spare capacity w/o stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation
How to model the interaction between TCP and the network Recall PFTK like models assumed network
conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem How to allocate resources (eg bandwidth) to
optimize some objective function Maybe not possible to obtain exact optimality but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
c1 c2
Model Network Links l each of capacity cl Sources s (L(s) Us(xs))
L(s) - links used by source s Us(xs) - utility if source rate = xs
x1
x2 x3
121 cxx le+ 231 cxx le+
Us(xs)
xs
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose network knows vector ws chosen by users Network wants to maximize logarithmic utility function
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
network problem
Solution existence
There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that Ws solves user
problem Xs solves the
network problem Xs is the unique
solution to the system problem
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
0 wsubject to
w Umax
s
ss
ge
minus⎟⎟⎠
⎞⎜⎜⎝
⎛s
s
wp
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0
Proportional Fairness
Vector of rates xs proportionally fair if feasible and for any other feasible vector xs
0
leminus
sumisinSs s
ss
xxx
Result if wr=1 then Xs solves the network problem IFF it is proportionally fair
Similar result exists for the case that wr not equal 1
Max-min Fairness
Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
Minimum potential delay fairness
Rates xr are minimum potential delay fair if Ur (xr) = -wrxr
Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays
Max-min Fairness
rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
What is corresponding utility function
α
α
α minus=
minus
infinrarr 1lim)(
1r
rrxxU
Solving the network problem Results so far existence - solution exists
with given properties How to compute solution
Ideally distributed solution easily embodied in protocol
Should reveal insight into existing protocol
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
congestion ldquosignalrdquo function of aggregate rate at link l fed back to s
change in bandwidth
allocation at s
linear increase
multiplicative decrease
⎟⎟⎠
⎞⎜⎜⎝
⎛= sum
isin
)()()(txgtp
sLlsllwhere
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
Results converges to solution of relaxation of network
problem xs(t)Σpl(t) converges to ws
Interpretation TCP-like algorithm to iteratively solves optimal rate allocation
Source Algorithm
Source needs only its path price
kr() nonnegative nondecreasing function Above algorithm converges to unique
solution for any initial condition qr interpreted as lossmarking probability euro
˙ x r = kr (xr )(Ur (xr ) minus qr)
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can network choose pricing scheme to achieve fair resource allocation
Suppose network charges price qr ($bit) where qr=sum pl
Userrsquos strategy spend wr ($sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
then
interpretation minimize (weighted) delay
pTpTp
x 2)1(2asymp
minus=
TxxU 1)( minus=
Is AIMD special
Consider a window control as follows cwnd += acwnd^n when no loss cwnd -= bcwnd^m when loss where nltm
Expected change in congestion window
Expected change in rate per unit time
MIMD (nm)
Consider the controller
where
Then at equilibrium
Where α = m-n For stability
Motivation
Congestion Control maximize user
utility
Traffic Engineering minimize network
congestion Given routing Rli how to adapt end rate xi
Given traffic xi how to perform routing Rli
Congestion Control Model
max sum i Ui(xi) st sumi Rlixi le cl var x
aggregate utility
Source rate xi
Utility Ui(xi)
capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
min suml f(ul) st ul =sumi Rlixicl var R Link Utilization ul
Cost f(ul)
aggregate cost
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
ul = 1
Model of Internet Reality
xi Rli
Congestion Control max sumi Ui(xi) st sumi Rlixi le cl
Traffic Engineering min suml f(ul)
st ul =sumi Rlixicl
System Properties
Convergence Does it achieve some objective Benchmark
Utility gap between the joint system and benchmark
max sumi Ui(xi) st Rx le c Var x R
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Active Queue Management
Router Queue Management
normally packets dropped only when queue overflows ldquodrop-tailrdquo queueing
router Internet
P113P213P313P413P513P613FCFS13
Scheduler13
router
The case against drop-tail queue management
Large queues in routers are ldquoa bad thingrdquo Delay end-to-end latency dominated by length
of queues at switches in network Allowing queues to overflow is ldquoa bad thingrdquo
Fairness connections transmitting at high rates can starve connections transmitting at low rates
Utilization connections can synchronize their response to congestion
P113P213P313P413FCFS
Scheduler P513P613
Idea early random packet drop
When queue length exceeds threshold drop packets with queue length dependent probability probabilistic packet drop flows see same loss
rate problem bursty traffic (burst arrives when
queue is near threshold) can be over penalized
P113P213P313P413P513P613FCFS
Scheduler
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts react to longer term trends
Tie drop prob to weighted avg queue length avoids over-reaction to mild overload conditions
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
Random early detection (RED) packet drop
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
10013
Drop probability
maxp13
Weighted AverageQueue Length
min13 max13
RED summary why random drop
Provide gentle transition from no-drop to all-drop Provide ldquogentlerdquo early warning Avoid synchronized loss bursts among
sources Provide same loss rate to all sessions
With tail-drop low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed hellip
Why randomization important
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source Floyd Jacobson 1994
Router update operation
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail
update
time spent in state depends on msgs
received from others (weak coupling
between routers processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization Choose random
timer component Tr large (eg several multiples of TC)
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough
randomization to avoid
synchronization
Randomization
Takeaway message randomization makes a system simple and
robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP
TCP does network resource management Need flow prioritization
Alternative router prioritization + More responsive simple one bit priority Hard to deploy
Question Can end-to-end congestion control achieve non-
interference and utilization
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal Congestion incr queue lengths incr RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control Receiver and network unchanged TCP friendly
TCP Nice
Basic algorithm 1 Early Detection thresh queue length incr in RTT 2 Multiplicative decrease on early congestion 3 Allow cwnd lt 10 (despite no loss)
per-ack operation if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++
per-round operation if(numCong gt fW) W W2 else hellip AIMD congestion control
Nice the works
Non-interference getting out of the way in time Utilization maintaining a small queue
pkts
minRTT = τ13 maxRTT = τ+Βmicro13
B
tB Add Mul +
micro
Reno
Nice Add Add Add
Mul +
Mul +
Network Conditions
01
1
10
100
1e3
1 10 100 Fore
grou
nd D
ocum
ent L
aten
cy (s
ec)
Spare Capacity
Reno
Vegas
V0
Nice
Router Prio
Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity
Scalability
01
1
10
100
1e3
1 10 100
Doc
umen
t Lat
ency
(sec
)
Num BG flows
Vegas
V0
Nice
Router Prio
Reno
W lt 1 allows Nice to scale to any number of background flows
Utilization
0
2e4
4e4
6e4
8e4
1 10 100
BG
Thr
ough
put (
KB
)
Num BG flows
Router Prio
Vegas
V0
Reno
Nice
Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation
How to model the interaction between TCP and the network Recall PFTK like models assumed network
conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem How to allocate resources (eg bandwidth) to
optimize some objective function Maybe not possible to obtain exact optimality but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
c1 c2
Model Network Links l each of capacity cl Sources s (L(s) Us(xs))
L(s) - links used by source s Us(xs) - utility if source rate = xs
x1
x2 x3
121 cxx le+ 231 cxx le+
Us(xs)
xs
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose network knows vector ws chosen by users Network wants to maximize logarithmic utility function
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
network problem
Solution existence
There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that Ws solves user
problem Xs solves the
network problem Xs is the unique
solution to the system problem
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
0 wsubject to
w Umax
s
ss
ge
minus⎟⎟⎠
⎞⎜⎜⎝
⎛s
s
wp
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0
Proportional Fairness
Vector of rates xs proportionally fair if feasible and for any other feasible vector xs
0
leminus
sumisinSs s
ss
xxx
Result if wr=1 then Xs solves the network problem IFF it is proportionally fair
Similar result exists for the case that wr not equal 1
Max-min Fairness
Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
Minimum potential delay fairness
Rates xr are minimum potential delay fair if Ur (xr) = -wrxr
Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays
Max-min Fairness
rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
What is corresponding utility function
α
α
α minus=
minus
infinrarr 1lim)(
1r
rrxxU
Solving the network problem Results so far existence - solution exists
with given properties How to compute solution
Ideally distributed solution easily embodied in protocol
Should reveal insight into existing protocol
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
congestion ldquosignalrdquo function of aggregate rate at link l fed back to s
change in bandwidth
allocation at s
linear increase
multiplicative decrease
⎟⎟⎠
⎞⎜⎜⎝
⎛= sum
isin
)()()(txgtp
sLlsllwhere
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
Results converges to solution of relaxation of network
problem xs(t)Σpl(t) converges to ws
Interpretation TCP-like algorithm to iteratively solves optimal rate allocation
Source Algorithm
Source needs only its path price
kr() nonnegative nondecreasing function Above algorithm converges to unique
solution for any initial condition qr interpreted as lossmarking probability euro
˙ x r = kr (xr )(Ur (xr ) minus qr)
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can network choose pricing scheme to achieve fair resource allocation
Suppose network charges price qr ($bit) where qr=sum pl
Userrsquos strategy spend wr ($sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
then
interpretation minimize (weighted) delay
pTpTp
x 2)1(2asymp
minus=
TxxU 1)( minus=
Is AIMD special
Consider a window control as follows cwnd += acwnd^n when no loss cwnd -= bcwnd^m when loss where nltm
Expected change in congestion window
Expected change in rate per unit time
MIMD (nm)
Consider the controller
where
Then at equilibrium
Where α = m-n For stability
Motivation
Congestion Control maximize user
utility
Traffic Engineering minimize network
congestion Given routing Rli how to adapt end rate xi
Given traffic xi how to perform routing Rli
Congestion Control Model
max sum i Ui(xi) st sumi Rlixi le cl var x
aggregate utility
Source rate xi
Utility Ui(xi)
capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
min suml f(ul) st ul =sumi Rlixicl var R Link Utilization ul
Cost f(ul)
aggregate cost
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
ul = 1
Model of Internet Reality
xi Rli
Congestion Control max sumi Ui(xi) st sumi Rlixi le cl
Traffic Engineering min suml f(ul)
st ul =sumi Rlixicl
System Properties
Convergence Does it achieve some objective Benchmark
Utility gap between the joint system and benchmark
max sumi Ui(xi) st Rx le c Var x R
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Router Queue Management
normally packets dropped only when queue overflows ldquodrop-tailrdquo queueing
router Internet
P113P213P313P413P513P613FCFS13
Scheduler13
router
The case against drop-tail queue management
Large queues in routers are ldquoa bad thingrdquo Delay end-to-end latency dominated by length
of queues at switches in network Allowing queues to overflow is ldquoa bad thingrdquo
Fairness connections transmitting at high rates can starve connections transmitting at low rates
Utilization connections can synchronize their response to congestion
P113P213P313P413FCFS
Scheduler P513P613
Idea early random packet drop
When queue length exceeds threshold drop packets with queue length dependent probability probabilistic packet drop flows see same loss
rate problem bursty traffic (burst arrives when
queue is near threshold) can be over penalized
P113P213P313P413P513P613FCFS
Scheduler
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop avoid overly penalizing short-term bursts react to longer term trends
Tie drop prob to weighted avg queue length avoids over-reaction to mild overload conditions
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
Random early detection (RED) packet drop
Max threshold
Min threshold
Average queue length
Forced drop
Probabilistic early drop
No drop
Time
Drop probability Max
queue length
10013
Drop probability
maxp13
Weighted AverageQueue Length
min13 max13
RED summary why random drop
Provide gentle transition from no-drop to all-drop Provide ldquogentlerdquo early warning Avoid synchronized loss bursts among
sources Provide same loss rate to all sessions
With tail-drop low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed hellip
Why randomization important
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source Floyd Jacobson 1994
Router update operation
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail
update
time spent in state depends on msgs
received from others (weak coupling
between routers processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization Choose random
timer component Tr large (eg several multiples of TC)
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough
randomization to avoid
synchronization
Randomization
Takeaway message randomization makes a system simple and
robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP
TCP does network resource management Need flow prioritization
Alternative router prioritization + More responsive simple one bit priority Hard to deploy
Question Can end-to-end congestion control achieve non-
interference and utilization
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal Congestion incr queue lengths incr RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control Receiver and network unchanged TCP friendly
TCP Nice
Basic algorithm 1 Early Detection thresh queue length incr in RTT 2 Multiplicative decrease on early congestion 3 Allow cwnd lt 10 (despite no loss)
per-ack operation if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++
per-round operation if(numCong gt fW) W W2 else hellip AIMD congestion control
Nice the works
Non-interference getting out of the way in time Utilization maintaining a small queue
pkts
minRTT = τ13 maxRTT = τ+Βmicro13
B
tB Add Mul +
micro
Reno
Nice Add Add Add
Mul +
Mul +
Network Conditions
[Figure: foreground document latency (sec) vs. spare capacity, logarithmic axes; curves for Reno, Vegas, V0, Nice, Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec) vs. number of background (BG) flows, logarithmic axes; curves for Vegas, V0, Nice, Router Prio, Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: background (BG) throughput (KB) vs. number of BG flows; curves for Router Prio, Vegas, V0, Reno, Nice]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from foreground (FG) traffic
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
  Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function?
  Maybe not possible to obtain exact optimality, but...
    optimization framework as means to explicitly steer network towards desirable operating point
    practical congestion control as distributed asynchronous implementations of optimization algorithm
    systematic approach towards protocol design
Model
Network: links l, each of capacity c_l
Sources s: (L(s), U_s(x_s))
  L(s): links used by source s
  U_s(x_s): utility if source rate = x_s
[Figure: two links with capacities c_1 and c_2, carrying source rates x_1, x_2, x_3]
$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$
[Figure: U_s(x_s) vs. x_s, example utility function for elastic application]
Q: What are possible allocations with, say, unit-capacity links?
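One way to explore the question is to search the feasible set for a few candidate objectives. The sketch below assumes unit capacities and the constraints x1 + x2 <= 1 and x1 + x3 <= 1, and it assumes sources 2 and 3 take whatever capacity source 1 leaves (reasonable for any increasing objective). The function names and the grid size are illustrative choices.

```python
import math


def log_utility(x):
    """Sum of log rates; minus infinity if any source is starved."""
    return sum(math.log(v) for v in x) if min(x) > 0 else float("-inf")


def best_allocation(objective, steps=300):
    """Grid search over x1; x2 and x3 fill the capacity that x1 leaves."""
    best_x, best_value = None, float("-inf")
    for i in range(steps + 1):
        x1 = i / steps
        x = (x1, 1.0 - x1, 1.0 - x1)
        value = objective(x)
        if value > best_value:
            best_x, best_value = x, value
    return best_x


max_throughput = best_allocation(sum)          # maximize x1 + x2 + x3
proportional = best_allocation(log_utility)    # maximize sum of log x_s
max_min = best_allocation(min)                 # maximize the smallest rate

for name, x in [("max throughput", max_throughput),
                ("sum of logs", proportional),
                ("max-min", max_min)]:
    print(name, tuple(round(v, 3) for v in x))
```

The three objectives pick three different points: the two-link source is starved under maximum throughput, gets a third of a link under the log objective, and gets half under max-min.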
Optimization Problem
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
"system" problem:
$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$
The user view
User can choose amount to pay per unit time, w_s
Would like allocated bandwidth x_s in proportion to w_s: $x_s = w_s / p_s$
  p_s could be viewed as charge per unit flow for user s
user problem:
$\max_{w_s \ge 0} \; U_s\left(\frac{w_s}{p_s}\right) - w_s$
  first term: user's utility; second term: cost
The network view
Suppose network knows vector {w_s}, chosen by users
Network wants to maximize logarithmic utility function
network problem:
$\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
Solution existence
There exist prices p_s, source rates x_s, and amount-to-pay-per-unit-time w_s = p_s x_s such that
  {w_s} solves the user problem
  {x_s} solves the network problem
  {x_s} is the unique solution to the system problem
user problem: $\max_{w_s \ge 0} \; U_s\left(\frac{w_s}{p_s}\right) - w_s$
network problem: $\max_{x_s \ge 0} \sum_{s} w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l$
system problem: $\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$
Proportional Fairness
Vector of rates {x_s} is proportionally fair if it is feasible and, for any other feasible vector {x_s^*},
$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$
Result: if w_s = 1, then {x_s} solves the network problem iff it is proportionally fair
Similar result exists for the case that w_s is not equal to 1
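The definition can be checked numerically on a small case. The sketch below assumes the three-source, two-link network with unit capacities (constraints x1 + x2 <= 1 and x1 + x3 <= 1) and takes (1/3, 2/3, 2/3), the maximizer of the sum of logs there, as the candidate. The helper names and the random sampling scheme are illustrative.

```python
import random


def aggregate_proportional_change(x, x_other):
    """Sum over sources of (x*_s - x_s) / x_s."""
    return sum((xo - xs) / xs for xs, xo in zip(x, x_other))


def random_feasible(rng):
    """Random point with x >= 0, x1 + x2 <= 1 and x1 + x3 <= 1."""
    x1 = rng.random()
    return (x1, rng.random() * (1.0 - x1), rng.random() * (1.0 - x1))


rng = random.Random(0)
x_pf = (1 / 3, 2 / 3, 2 / 3)      # candidate proportionally fair vector
x_mm = (0.5, 0.5, 0.5)            # max-min fair vector, for contrast

# No sampled feasible vector improves on x_pf in aggregate proportional terms.
worst_case = max(
    aggregate_proportional_change(x_pf, random_feasible(rng))
    for _ in range(10_000)
)

# Moving from x_mm to x_pf yields a positive aggregate proportional change,
# so x_mm does not satisfy the definition.
gain_over_max_min = aggregate_proportional_change(x_mm, x_pf)
```

Sampling cannot prove the inequality, but here it also follows directly: the sum equals 3 x1* + 1.5 (x2* + x3*) - 3, which is at most 0 whenever both capacity constraints hold.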
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s for some s, then there exists p such that x_p ≤ x_s and y_p < x_p
Minimum potential delay fairness
Rates {x_r} are minimum potential delay fair if U_r(x_r) = -w_r / x_r
Interpretation: if w_r is file size, then w_r / x_r is transfer time; optimization problem is to minimize sum of transfer delays
Max-min Fairness
Rates {x_r} are max-min fair if, for any other feasible rates {y_r}: if y_s > x_s for some s, then there exists p such that x_p ≤ x_s and y_p < x_p
What is corresponding utility function?
$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
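A max-min fair allocation can also be computed directly rather than through the limiting utility. The sketch below uses progressive filling, a standard technique that does not appear on the slide: all rates grow equally until some link saturates, the sources on that link are frozen, and the rest continue. The function name and the dictionary-based data layout are assumptions, and every route is assumed to use at least one link.

```python
def max_min_rates(routes, capacity):
    """Progressive filling: raise all unfrozen rates equally until a link fills."""
    rates = {s: 0.0 for s in routes}
    active = set(routes)
    residual = dict(capacity)
    while active:
        # Largest equal increment that every active source can still receive.
        step = min(
            residual[l] / sum(1 for s in active if l in routes[s])
            for l in residual
            if any(l in routes[s] for s in active)
        )
        for s in active:
            rates[s] += step
        for l in residual:
            residual[l] -= step * sum(1 for s in active if l in routes[s])
        # Freeze every source that crosses a link with no capacity left.
        saturated = {l for l in residual if residual[l] <= 1e-12}
        active = {s for s in active if not (routes[s] & saturated)}
    return rates


routes = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}}
equal_links = max_min_rates(routes, {"a": 1.0, "b": 1.0})
unequal_links = max_min_rates(routes, {"a": 1.0, "b": 2.0})
```

With unequal links, source 3 keeps growing after link a saturates, which shows that max-min fairness does not force equal rates, only that no rate can increase without reducing a smaller or equal one.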
Solving the network problem
Results so far: existence - solution exists with given properties
How to compute solution?
  Ideally: distributed solution easily embodied in protocol
  Should reveal insight into existing protocol
Solving the network problem
$\frac{d}{dt} x_s(t) = k \Big( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \Big)$
  d/dt x_s(t): change in bandwidth allocation at s
  k w_s: linear increase
  k x_s(t) Σ p_l(t): multiplicative decrease
where $p_l(t) = g_l\Big( \sum_{s:\, l \in L(s)} x_s(t) \Big)$
  p_l(t): congestion "signal", a function of aggregate rate at link l, fed back to s
Solving the network problem
$\frac{d}{dt} x_s(t) = k \Big( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \Big)$
Results:
  converges to solution of relaxation of network problem
  x_s(t) Σ p_l(t) converges to w_s
Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
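The convergence claim can be illustrated by integrating the rate equation numerically. This is a minimal sketch under stated assumptions: forward-Euler integration, a hypothetical price function g_l(y) = (y / c_l)^beta, the three-source two-link example network, and illustrative values for k, dt, steps and beta. None of these choices come from the slides.

```python
def simulate_primal(routes, capacity, weights, k=0.1, dt=0.01,
                    steps=20_000, beta=8.0):
    """Euler steps of dx_s/dt = k * (w_s - x_s * sum of p_l over L(s))."""
    x = {s: 0.1 for s in routes}
    price = {l: 0.0 for l in capacity}
    for _ in range(steps):
        # Aggregate rate on each link from the sources whose route uses it.
        load = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
        # Hypothetical g_l: price rises steeply as load approaches capacity.
        price = {l: (load[l] / capacity[l]) ** beta for l in capacity}
        for s in routes:
            path_price = sum(price[l] for l in routes[s])
            x[s] += dt * k * (weights[s] - x[s] * path_price)
    return x, price


routes = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}}
capacity = {"a": 1.0, "b": 1.0}
weights = {1: 1.0, 2: 1.0, 3: 1.0}
x, price = simulate_primal(routes, capacity, weights)
spend = {s: x[s] * sum(price[l] for l in routes[s]) for s in routes}
```

At the fixed point each source's rate times its path price equals its weight. Because the price function only penalizes load instead of enforcing a hard cap, link loads settle slightly above capacity here, which is the sense in which the dynamics solve a relaxation of the network problem.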
Source Algorithm
$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$
Source needs only its path price
k_r(·): nonnegative, nondecreasing function
Above algorithm converges to unique solution for any initial condition
q_r interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
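As a sketch of one such pair, assume the weighted logarithmic utility used in the network problem. Substituting it into a source algorithm of the form $\dot{x}_r = k_r(x_r)(U_r'(x_r) - q_r)$ gives:

```latex
U_r(x_r) = w_r \log x_r
\quad\Longrightarrow\quad
U_r'(x_r) = \frac{w_r}{x_r},
\qquad
\dot{x}_r = k_r(x_r)\left(\frac{w_r}{x_r} - q_r\right)
```

With the particular (assumed) gain $k_r(x_r) = \kappa x_r$ this becomes $\dot{x}_r = \kappa (w_r - x_r q_r)$, the linear-increase, multiplicative-decrease form.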
Pricing interpretation
Can network choose pricing scheme to achieve fair resource allocation?
Suppose network charges price q_r ($/bit), where q_r = Σ_l p_l
User's strategy: spend w_r ($/sec) to maximize
Optimal User Strategy
equivalently
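A sketch of the user's optimization, assuming it mirrors the user problem with the path price q_r in place of p_s, and assuming a concave, differentiable U_r with an interior optimum:

```latex
\max_{w_r \ge 0} \; U_r\!\left(\frac{w_r}{q_r}\right) - w_r
\quad\Longrightarrow\quad
U_r'\!\left(\frac{w_r}{q_r}\right) = q_r
\quad\Longleftrightarrow\quad
U_r'(x_r) = q_r, \qquad x_r = \frac{w_r}{q_r}
```

The first-order condition comes from differentiating with respect to w_r: $U_r'(w_r/q_r)/q_r - 1 = 0$.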
Simplified TCP-Reno
suppose $x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$
then $U(x) = -\frac{1}{Tx}$
interpretation: minimize (weighted) delay
Is AIMD special?
Consider a window control as follows:
  cwnd += a * cwnd^n when no loss
  cwnd -= b * cwnd^m when loss
  where n < m
Expected change in congestion window
Expected change in rate per unit time
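The expected window change can be sketched under the assumption that the rule is applied once per ack and that each ack independently signals loss with probability p. That loss model and the function names are assumptions, not taken from the slide.

```python
import math


def expected_window_change(w, p, a, b, n, m):
    """Expected per-ack change: gain with prob. (1 - p), cut with prob. p."""
    return (1.0 - p) * a * w ** n - p * b * w ** m


def equilibrium_window(p, a, b, n, m):
    """Window where the expected change is zero: w^(m-n) = a(1-p) / (b p)."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))


# AIMD as a special case: +1/cwnd per ack (n = -1), halve on loss (m = 1).
p = 0.02
w_aimd = equilibrium_window(p, a=1.0, b=0.5, n=-1, m=1)
drift_at_equilibrium = expected_window_change(w_aimd, p, 1.0, 0.5, -1, 1)
reno_window = math.sqrt(2.0 * (1.0 - p) / p)
```

For the AIMD parameters the equilibrium window is sqrt(2(1 - p) / p), which is the numerator of the simplified TCP-Reno rate once divided by the round-trip time T.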
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n
For stability
Motivation
Congestion Control: maximize user utility
  Given routing R_li, how to adapt end rate x_i?
Traffic Engineering: minimize network congestion
  Given traffic x_i, how to perform routing R_li?
Congestion Control Model
$\max \sum_i U_i(x_i)$ (aggregate utility) s.t. $\sum_i R_{li} x_i \le c_l$ (capacity constraints), var. x
Source rate x_i
Utility U_i(x_i)
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
$\min \sum_l f(u_l)$ (aggregate cost) s.t. $u_l = \sum_i R_{li} x_i / c_l$, var. R
Link utilization u_l
Cost f(u_l)
[Figure: cost f(u_l) vs. link utilization u_l, with u_l = 1 marked]
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
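The traffic engineering objective can be illustrated by evaluating link utilizations for two candidate routings. In this sketch the quadratic cost f(u) = u^2 is a placeholder for whatever convex cost is intended, fractional entries R_li are taken to mean traffic splitting, and the two-parallel-link example is hypothetical.

```python
def link_utilization(routing, rates, capacity):
    """u_l = sum_i R_li * x_i / c_l; routing[l][i] is flow i's share on link l."""
    return {
        l: sum(share * rates[i] for i, share in routing[l].items()) / capacity[l]
        for l in routing
    }


def total_cost(utilization, f=lambda u: u * u):
    """Aggregate cost sum_l f(u_l); the quadratic f is only a placeholder."""
    return sum(f(u) for u in utilization.values())


rates = {"flow": 1.0}
capacity = {"a": 1.0, "b": 1.0}
single_path = {"a": {"flow": 1.0}, "b": {"flow": 0.0}}
split_path = {"a": {"flow": 0.5}, "b": {"flow": 0.5}}

cost_single = total_cost(link_utilization(single_path, rates, capacity))
cost_split = total_cost(link_utilization(split_path, rates, capacity))
```

With a convex cost, spreading the flow over both links is cheaper than loading one link to u_l = 1, which is the bottleneck-avoidance behavior the model is meant to capture.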
Model of Internet Reality
Congestion Control (sets x_i): $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
Traffic Engineering (sets R_li): $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$
System Properties
Convergence
Does it achieve some objective?
Benchmark: utility gap between the joint system and benchmark
$\max \sum_i U_i(x_i)$ s.t. $Rx \le c$, var. x, R
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
The case against drop-tail queue management
Large queues in routers are "a bad thing"
  Delay: end-to-end latency dominated by length of queues at switches in network
Allowing queues to overflow is "a bad thing"
  Fairness: connections transmitting at high rates can starve connections transmitting at low rates
  Utilization: connections can synchronize their response to congestion
[Figure: packets P1-P6 at an FCFS scheduler]
Idea: early random packet drop
When queue length exceeds threshold, drop packets with queue-length-dependent probability
  probabilistic packet drop: flows see same loss rate
  problem: bursty traffic (burst arrives when queue is near threshold) can be over-penalized
[Figure: packets P1-P6 at an FCFS scheduler]
Random early detection (RED) packet drop
Use exponential average of queue length to determine when to drop
  avoid overly penalizing short-term bursts
  react to longer-term trends
Tie drop probability to weighted average queue length
  avoids over-reaction to mild overload conditions
[Figure: average queue length vs. time, up to max queue length; no drop below min threshold, probabilistic early drop between min and max thresholds, forced drop above max threshold]
[Figure: drop probability vs. weighted average queue length; marks min and max on the queue-length axis, max_p and 100% on the probability axis]
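The two ingredients, the exponential average and the threshold-based drop probability, can be sketched as follows. The linear ramp between the thresholds follows the usual RED description and is an assumption here, as are the function names, the parameter names and the default averaging weight.

```python
def update_average(avg, queue_len, weight=0.002):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - weight) * avg + weight * queue_len


def drop_probability(avg, min_th, max_th, max_p):
    """No drop below min_th, linear ramp up to max_p, forced drop from max_th."""
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)


# A single large burst barely moves the average, so it is not penalized.
avg_after_burst = update_average(0.0, 100.0)

# Sustained load moves the average into the probabilistic-drop region.
avg = 0.0
for _ in range(2000):
    avg = update_average(avg, 10.0)
p_sustained = drop_probability(avg, min_th=5.0, max_th=15.0, max_p=0.1)
```

A small averaging weight is what separates short bursts from longer-term trends: the burst above leaves the average far below the minimum threshold, while the sustained queue of 10 packets pushes it into the early-drop region.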
RED summary: why random drop?
Provide gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
[State diagram]
  prepare own routing update (time T_C)
  receive update from neighbor, process (time T_C2)
  wait
    receive update from neighbor, process
  <ready>: send update (time T_d to arrive at dest), start_timer (uniform T_p ± T_r)
  timeout, or link fail update
time spent in state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update sent, relative to start of round
By t = 100000, all router rounds are of length 120
synchronization, or lack thereof, depends on system parameters
Avoiding synchronization
Choose random timer component T_r large (e.g., several multiples of T_C)
Add enough randomization to avoid synchronization
[State diagram]
  prepare own routing update (time T_C)
  receive update from neighbor, process (time T_C2)
  wait
    receive update from neighbor, process
  <ready>: send update (time T_d to arrive), start_timer (uniform T_p ± T_r)
Randomization
Takeaway message: randomization makes a system simple and robust
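The randomized timer itself is a one-line draw. The sketch below assumes the timer is sampled uniformly from [T_p - T_r, T_p + T_r] each time an update is sent. The function name and the numeric values are illustrative, with T_r set to a few multiples of a hypothetical processing time T_C.

```python
import random


def next_update_delay(tp, tr, rng):
    """Draw the next timer value uniformly from [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)


rng = random.Random(1)
tc = 0.1                      # hypothetical per-update processing time T_C
tp, tr = 120.0, 10 * tc       # random component: several multiples of T_C
delays = [next_update_delay(tp, tr, rng) for _ in range(1000)]
spread = max(delays) - min(delays)
```

Each router draws its own delay independently, so timers that happen to align in one round are unlikely to stay aligned in the next.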
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal: abstraction of free infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP?
TCP does network resource management
Need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: Can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal
  Congestion -> increased queue lengths -> increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm
1. Early detection: threshold queue length -> increase in RTT
2. Multiplicative decrease on early congestion
3. Allow cwnd < 1.0 (despite no loss)
per-ack operation:
  if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
per-round operation:
  if (numCong > f * W) W = W / 2
  else { ... AIMD congestion control }
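The per-ack and per-round rules can be sketched as a small Python class. This is a toy model, not a TCP implementation: the class name, the default values threshold = 0.2 and f = 0.5, and the plain "+1 per round" stand-in for the AIMD branch are assumptions made for illustration, and loss handling is omitted.

```python
class NiceSender:
    """Toy model of the Nice per-ack and per-round rules (no real networking)."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold    # assumed value of the RTT threshold
        self.fraction = fraction      # assumed value of f
        self.cwnd = cwnd              # window W in packets; may fall below 1
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack operation: count acks whose RTT indicates a growing queue."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round operation: halve W on early congestion, else add one."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0
        else:
            self.cwnd += 1.0          # additive-increase part of AIMD only
        self.num_cong = 0


sender = NiceSender(cwnd=4.0)
for rtt in (0.1, 0.3, 0.3, 0.3):      # one round with a building queue
    sender.on_ack(rtt)
sender.on_round_end()
print(sender.cwnd)
```

Because the halving is applied to a floating-point window with no lower clamp at one packet, repeated congested rounds can push W below 1, which matches item 3 of the basic algorithm.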
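Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
[Figure: queue length (pkts) over time for Reno and for Nice, annotated with additive (Add) and multiplicative (Mul) phases; buffer size B, threshold t·B, service rate μ]
minRTT = τ, maxRTT = τ + B/μ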
Network Conditions
[Figure: foreground document latency (sec) vs. spare capacity, log-scale axes; curves for Reno, Vegas, V0, Nice, Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec) vs. number of BG flows, log-scale axes; curves for Vegas, V0, Nice, Router Prio, Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: BG throughput (KB) vs. number of BG flows; curves for Router Prio, Vegas, V0, Reno, Nice]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
  Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function?
  Maybe not possible to obtain exact optimality, but optimization framework serves as means to explicitly steer network towards desirable operating point
Practical congestion control as distributed asynchronous implementations of optimization algorithm
Systematic approach towards protocol design
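Model
Network: links l, each of capacity $c_l$
Sources s: $(L(s), U_s(x_s))$
  $L(s)$ - links used by source s
  $U_s(x_s)$ - utility if source rate = $x_s$
[Figure: two links with capacities c1 and c2; source 1 (rate x1) uses both links, source 2 (rate x2) uses link 1, source 3 (rate x3) uses link 2]
$x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$
[Figure: $U_s(x_s)$ vs. $x_s$, example utility function for elastic application]
Q: What are possible allocations with, say, unit capacity links?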
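To make the question concrete, the following Python sketch checks a few candidate allocations against the two constraints x1 + x2 <= c1 and x1 + x3 <= c2 with unit capacities. The candidate vectors and their labels are illustrative assumptions, not allocations singled out by the slides.

```python
from fractions import Fraction as F

# Routing from the model: source 1 uses both links,
# source 2 uses link 1 only, source 3 uses link 2 only.
# Sources are stored at indices 0, 1, 2.
LINKS = {"link1": (0, 1), "link2": (0, 2)}

def is_feasible(x, capacity=1):
    """True if the total rate on every link is at most `capacity`."""
    return all(sum(x[s] for s in users) <= capacity for users in LINKS.values())

candidates = {
    "favor one-link flows": (F(0), F(1), F(1)),
    "equal rates": (F(1, 2), F(1, 2), F(1, 2)),
    "in between": (F(1, 3), F(2, 3), F(2, 3)),
    "overloaded": (F(1, 2), F(2, 3), F(1, 2)),
}
for name, x in candidates.items():
    rates = ", ".join(str(v) for v in x)
    print(f"{name}: ({rates}) feasible={is_feasible(x)} total={sum(x)}")
```

The first three candidates are all feasible yet trade total throughput against the rate given to the two-link source, which is why an explicit objective is needed to choose among them.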
Optimization Problem
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
$$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$
("system" problem)
The user view
User can choose amount to pay per unit time, $w_s$
Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$
$p_s$ could be viewed as charge per unit flow for user s
$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
(first term: user's utility; second term: cost)
("user" problem)
The network view
Suppose network knows vector $\{w_s\}$, chosen by users
Network wants to maximize logarithmic utility function
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
("network" problem)
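Solution existence
There exist prices $p_s$, source rates $x_s$, and amount-to-pay-per-unit-time $w_s = p_s x_s$ such that
  $\{w_s\}$ solves the user problem
  $\{x_s\}$ solves the network problem
  $\{x_s\}$ is the unique solution to the system problem
Network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
User problem:
$$\max_{w_s} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0$$
System problem:
$$\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$$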
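As a sanity check of this decomposition, the Python sketch below verifies one candidate solution on a three-source, two-link network (source 1 on both links, sources 2 and 3 on one link each). The unit capacities, the choice U_s(x) = log x, and the candidate rates and link prices are all assumptions made for this example; the code checks first-order conditions with exact fractions rather than solving the problems.

```python
from fractions import Fraction as F

x = [F(1, 3), F(2, 3), F(2, 3)]        # candidate source rates
link_price = [F(3, 2), F(3, 2)]        # candidate per-link prices
routes = [(0, 1), (0,), (1,)]          # links used by each source

# Path price p_s: sum of link prices along the route of source s.
p = [sum(link_price[l] for l in r) for r in routes]
# Amount paid per unit time: w_s = p_s * x_s.
w = [ps * xs for ps, xs in zip(p, x)]

# System problem with U_s = log: U_s'(x_s) = 1/x_s must equal p_s,
# and every link with a positive price must be fully used.
system_ok = all(1 / xs == ps for xs, ps in zip(x, p))
links_full = all(
    sum(x[s] for s, r in enumerate(routes) if l in r) == 1 for l in range(2)
)
# User problem: maximizing log(w / p_s) - w over w gives w = 1.
user_ok = all(ws == 1 for ws in w)
# Network problem: stationarity requires w_s / x_s = p_s.
network_ok = all(ws / xs == ps for ws, xs, ps in zip(w, x, p))

print(p, w, system_ok, links_full, user_ok, network_ok)
```

All four checks hold for this candidate, so the same rates satisfy the optimality conditions of the user, network, and system problems at once.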
Proportional Fairness
Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,
$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$
Result: if $w_r = 1$, then $\{x_s\}$ solves the network problem iff it is proportionally fair
Similar result exists for the case that $w_r$ is not equal to 1
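The proportional fairness condition can be probed numerically. The sketch below assumes a network with constraints x1 + x2 <= 1 and x1 + x3 <= 1, samples random feasible vectors, and evaluates the sum of proportional changes relative to two candidates; the candidates (1/3, 2/3, 2/3) and (1/2, 1/2, 1/2) and the sampling setup are assumptions for illustration. Random sampling can refute the condition but does not prove it.

```python
import random

def is_feasible(x):
    """Constraints of the assumed network: x1 + x2 <= 1 and x1 + x3 <= 1."""
    return min(x) >= 0 and x[0] + x[1] <= 1 and x[0] + x[2] <= 1

def proportional_change(x, other):
    """Sum over sources of (other_s - x_s) / x_s."""
    return sum((o - xi) / xi for xi, o in zip(x, other))

rng = random.Random(0)
samples = [tuple(rng.random() for _ in range(3)) for _ in range(20000)]
samples = [y for y in samples if is_feasible(y)]

x_pf = (1 / 3, 2 / 3, 2 / 3)   # candidate proportionally fair rates
x_eq = (1 / 2, 1 / 2, 1 / 2)   # equal rates, for comparison

worst_pf = max(proportional_change(x_pf, y) for y in samples)
worst_eq = max(proportional_change(x_eq, y) for y in samples)
print(worst_pf, worst_eq)
```

No sampled vector yields a positive sum against (1/3, 2/3, 2/3), while the equal-rate vector is beaten by allocations that give more to the one-link sources, so it is not proportionally fair.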
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
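One standard way to compute max-min fair rates is progressive filling (water-filling), which the slides do not describe; it is shown here only as a sketch. The function name, the dictionary-based inputs, and the example capacities are assumptions, and every source is assumed to use at least one link.

```python
def max_min_rates(routes, capacity, eps=1e-12):
    """Progressive filling: raise all unfrozen rates equally until a link fills."""
    rates = {s: 0.0 for s in routes}
    active = set(routes)
    residual = dict(capacity)
    while active:
        # Largest equal increment that every link with active users can absorb.
        step = min(
            residual[l] / sum(1 for s in active if l in routes[s])
            for l in residual
            if any(l in routes[s] for s in active)
        )
        for s in active:
            rates[s] += step
        for l in residual:
            residual[l] -= step * sum(1 for s in active if l in routes[s])
        # Freeze every source that crosses a saturated link.
        saturated = {l for l in residual if residual[l] <= eps}
        active = {s for s in active if not (set(routes[s]) & saturated)}
    return rates


routes = {1: ("a", "b"), 2: ("a",), 3: ("b",)}
print(max_min_rates(routes, {"a": 1.0, "b": 1.0}))
print(max_min_rates(routes, {"a": 1.0, "b": 2.0}))
```

With unequal capacities, the source on the larger link keeps growing after the shared bottleneck fills, while the two sources on the bottleneck stay equal.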
Minimum potential delay fairness
Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; optimization problem is to minimize sum of transfer delays
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
What is the corresponding utility function?
$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
Solving the network problem
Results so far: existence - solution exists with given properties
How to compute solution?
  Ideally: distributed solution, easily embodied in protocol
  Should reveal insight into existing protocol
Solving the network problem
$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$
where
$$p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right)$$
$\frac{d}{dt} x_s(t)$: change in bandwidth allocation at s
$k\, w_s$: linear increase
$k\, x_s(t) \sum_{l \in L(s)} p_l(t)$: multiplicative decrease
$p_l(t)$: congestion "signal", function of aggregate rate at link l, fed back to s
Solving the network problem
$$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$$
Results
  converges to solution of "relaxation" of network problem
  $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
Interpretation: TCP-like algorithm to iteratively solve optimal rate allocation
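The convergence claim can be explored with a forward-Euler simulation of the rate equation dx_s/dt = k (w_s - x_s * sum of p_l over the links of s). This is a sketch under stated assumptions: the price function g_l(y) = (y / c_l)^beta, the step size, the starting rates, and the three-source, two-link topology are all chosen for illustration and are not taken from the slides.

```python
def simulate_primal(routes, w, capacity, k=1.0, dt=0.01, steps=5000, beta=4):
    """Euler steps of dx_s/dt = k * (w_s - x_s * path price of s)."""
    x = [0.1] * len(routes)
    n_links = len(capacity)
    path_price = [0.0] * len(routes)
    for _ in range(steps):
        # Aggregate rate per link, then the assumed price g_l(y) = (y / c_l)**beta.
        load = [
            sum(x[s] for s, r in enumerate(routes) if l in r)
            for l in range(n_links)
        ]
        price = [(load[l] / capacity[l]) ** beta for l in range(n_links)]
        path_price = [sum(price[l] for l in r) for r in routes]
        x = [xs + dt * k * (ws - xs * q) for xs, ws, q in zip(x, w, path_price)]
    return x, path_price


# Source 0 uses links 0 and 1; sources 1 and 2 use one link each.
rates, prices = simulate_primal([(0, 1), (0,), (1,)], [1.0, 1.0, 1.0], [1.0, 1.0])
print(rates)
print([r * q for r, q in zip(rates, prices)])   # should approach w_s
```

At the fixed point each source pays w_s = x_s q_s, so with equal w_s the source that crosses two equally priced links settles at half the rate of the one-link sources.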
Source Algorithm
$$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$$
Source needs only its path price
$k_r(\cdot)$: nonnegative, nondecreasing function
Above algorithm converges to unique solution for any initial condition
$q_r$ interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is $U_r(x_r) = w_r \log x_r$
then a controller that implements it is given by $\dot{x}_r = k_r \left( w_r - x_r q_r \right)$
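Pricing interpretation
Can network choose pricing scheme to achieve fair resource allocation?
Suppose network charges price $q_r$ ($/bit), where $q_r = \sum_l p_l$ (sum over the links on the path of r)
User's strategy: spend $w_r$ ($/sec) to maximize $U_r(w_r / q_r) - w_r$
Optimal User Strategy
$U_r'(w_r / q_r) = q_r$
equivalently
$U_r'(x_r) = q_r$, with $x_r = w_r / q_r$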
Simplified TCP-Reno
suppose
$$x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$$
then
$$U(x) = -\frac{1}{T x}$$
interpretation: minimize (weighted) delay
Is AIMD special?
Consider a window control as follows:
  cwnd += a * cwnd^n when no loss
  cwnd -= b * cwnd^m when loss
  where n < m
Expected change in congestion window
Expected change in rate per unit time
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n
For stability
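The window rule cwnd += a * cwnd^n (no loss), cwnd -= b * cwnd^m (loss) can be explored numerically. The sketch below is a derivation under assumptions, not a reconstruction of the slide's missing equations: it assumes the update is applied once per ack and that each ack independently signals loss with probability p, and the function names are illustrative.

```python
def expected_change(w, p, a, b, n, m):
    """Expected per-ack window change: (1 - p) * a * w^n - p * b * w^m."""
    return (1 - p) * a * w ** n - p * b * w ** m

def equilibrium_window(p, a, b, n, m):
    """Window where the expected change is zero; requires n < m and 0 < p < 1."""
    return (a * (1 - p) / (b * p)) ** (1.0 / (m - n))

# AIMD as a special case: +1/cwnd per ack (n = -1), halve on loss (m = 1, b = 1/2).
w_star = equilibrium_window(p=0.01, a=1.0, b=0.5, n=-1, m=1)
print(w_star, expected_change(w_star, 0.01, 1.0, 0.5, -1, 1))
```

Under these assumptions the AIMD equilibrium window is sqrt(2(1 - p)/p), and dividing by the round-trip time T gives the rate expression used for simplified TCP-Reno.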
Motivation
Congestion Control: maximize user utility
  Given routing $R_{li}$, how to adapt end rate $x_i$?
Traffic Engineering: minimize network congestion
  Given traffic $x_i$, how to perform routing $R_{li}$?
Congestion Control Model
$$\max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l$$
(objective: aggregate utility; constraints: capacity constraints; variable: x)
Source rate $x_i$
Utility $U_i(x_i)$
Users are indexed by i
Congestion control provides fair rate allocation amongst users
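Traffic Engineering Model
$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l$$
(objective: aggregate cost; variable: R)
Link utilization $u_l$
Cost $f(u_l)$
[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]
Links are indexed by l
Traffic engineering avoids bottlenecks in the network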
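The traffic engineering objective can be evaluated for a given routing with a few lines of Python. The slides do not specify the cost function f, so f(u) = 1 / (1 - u) is an assumption here, as are the two-link, two-user example and the function names; R[l][i] is taken to be the fraction of user i's traffic carried on link l.

```python
def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [
        sum(R[l][i] * x[i] for i in range(len(x))) / c[l] for l in range(len(c))
    ]

def te_cost(R, x, c):
    """Aggregate cost with the assumed f(u) = 1 / (1 - u), valid for u < 1."""
    return sum(1.0 / (1.0 - u) for u in link_utilization(R, x, c))

x = [0.6, 0.3]                            # fixed source rates
c = [1.0, 1.0]                            # link capacities
single_path = [[1.0, 1.0], [0.0, 0.0]]    # both users routed over link 0
split = [[1.0, 0.0], [0.0, 1.0]]          # one user per link

print(link_utilization(single_path, x, c), te_cost(single_path, x, c))
print(link_utilization(split, x, c), te_cost(split, x, c))
```

Because the assumed cost grows sharply as utilization approaches 1, the routing that spreads the two users across both links has the lower aggregate cost.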
Model of Internet Reality
[Figure labels: $x_i$, $R_{li}$]
Congestion Control: $\max \sum_i U_i(x_i)$, s.t. $\sum_i R_{li} x_i \le c_l$
Traffic Engineering: $\min \sum_l f(u_l)$, s.t. $u_l = \sum_i R_{li} x_i / c_l$
System Properties
Convergence
Does it achieve some objective?
Benchmark: $\max \sum_i U_i(x_i)$, s.t. $Rx \le c$, var. x, R
Utility gap between the joint system and benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Random early detection (RED) packet drop
[Figure: average queue length over time, with min threshold, max threshold, and max queue length marked; regions: no drop, probabilistic early drop, forced drop]
[Figure: drop probability vs. weighted average queue length; zero below min, rising to max_p at max, then 100%]
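The drop-probability curve can be written as a short Python function. This sketch follows the basic RED rule (linear ramp between the two thresholds, exponentially weighted average of the queue length); the function and parameter names and the example values are assumptions, and refinements such as spacing drops by the count of packets since the last drop are omitted.

```python
def update_average(avg_q, sample_q, weight):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - weight) * avg_q + weight * sample_q

def red_drop_probability(avg_q, min_th, max_th, max_p):
    """Drop probability as a function of the weighted average queue length."""
    if avg_q < min_th:
        return 0.0                     # no drop
    if avg_q >= max_th:
        return 1.0                     # forced drop
    # Probabilistic early drop: linear ramp from 0 up to max_p.
    return max_p * (avg_q - min_th) / (max_th - min_th)


for q in (2, 5, 10, 14, 15, 20):
    print(q, red_drop_probability(q, min_th=5, max_th=15, max_p=0.1))
```

The small averaging weight lets short bursts pass without drops, while the ramp marks flows roughly in proportion to their share of arriving packets.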
RED summary: why random drop?
Provide gentle transition from no-drop to all-drop
  Provide "gentle" early warning
Avoid synchronized loss bursts among sources
Provide same loss rate to all sessions
  With tail-drop, low-sending-rate sessions can be completely starved
Random early detection (RED) today
Many (5) parameters: nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed ...
Random early detection (RED) today
Many (5) parameters nontrivial to tune (at least for HTTP traffic)
Gains over drop-tail FCFS not that significant
Still not widely deployed hellip
Why randomization important
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
source Floyd Jacobson 1994
Router update operation
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive at dest)
start_timer (uniform Tp +- Tr)
timeout or link fail
update
time spent in state depends on msgs
received from others (weak coupling
between routers processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis time until routing update sent relative to start of round
By t=100000 all router rounds are of length 120
synchronization or lack thereof depends on system parameters
Avoiding synchronization Choose random
timer component Tr large (eg several multiples of TC)
prepare own routing
update (time TC)
receive update from neighbor process (time TC2)
wait
receive update from neighbor process
ltreadygt send update (time Td to arrive) start_timer (uniform Tp +- Tr) Add enough
randomization to avoid
synchronization
Randomization
Takeaway message randomization makes a system simple and
robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP
TCP does network resource management Need flow prioritization
Alternative router prioritization + More responsive simple one bit priority Hard to deploy
Question Can end-to-end congestion control achieve non-
interference and utilization
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal Congestion incr queue lengths incr RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control Receiver and network unchanged TCP friendly
TCP Nice
Basic algorithm 1 Early Detection thresh queue length incr in RTT 2 Multiplicative decrease on early congestion 3 Allow cwnd lt 10 (despite no loss)
per-ack operation if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++
per-round operation if(numCong gt fW) W W2 else hellip AIMD congestion control
Nice the works
Non-interference getting out of the way in time Utilization maintaining a small queue
pkts
minRTT = τ13 maxRTT = τ+Βmicro13
B
tB Add Mul +
micro
Reno
Nice Add Add Add
Mul +
Mul +
Network Conditions
01
1
10
100
1e3
1 10 100 Fore
grou
nd D
ocum
ent L
aten
cy (s
ec)
Spare Capacity
Reno
Vegas
V0
Nice
Router Prio
Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity
Scalability
01
1
10
100
1e3
1 10 100
Doc
umen
t Lat
ency
(sec
)
Num BG flows
Vegas
V0
Nice
Router Prio
Reno
W lt 1 allows Nice to scale to any number of background flows
Utilization
0
2e4
4e4
6e4
8e4
1 10 100
BG
Thr
ough
put (
KB
)
Num BG flows
Router Prio
Vegas
V0
Reno
Nice
Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: Given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as an optimization problem:
  how to allocate resources (e.g., bandwidth) to optimize some objective function
  it may not be possible to obtain exact optimality, but...
Optimization framework as a means to explicitly steer the network towards a desirable operating point
Practical congestion control as distributed asynchronous implementations of an optimization algorithm
Systematic approach towards protocol design
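Model
Network: links $l$, each of capacity $c_l$
Sources $s$: $(L(s), U_s(x_s))$
  $L(s)$ - links used by source $s$
  $U_s(x_s)$ - utility if source rate = $x_s$
[Figure: two links with capacities $c_1$, $c_2$; source 1 (rate $x_1$) uses both links, sources 2 and 3 (rates $x_2$, $x_3$) use one link each]
\[ x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2 \]
[Figure: $U_s(x_s)$ vs. $x_s$, example utility function for an elastic application]
Q: What are possible allocations with, say, unit capacity links?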
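One way to explore the question above is to write down a few feasible allocations for the two-link example with $c_1 = c_2 = 1$ and compare them. The sketch below is illustrative only; the function names and the three candidate allocations are choices made here, and logarithmic utility is assumed for the last candidate.

```python
import math


def feasible(x, c1=1.0, c2=1.0, eps=1e-9):
    """Check x = (x1, x2, x3) against x1 + x2 <= c1 and x1 + x3 <= c2."""
    x1, x2, x3 = x
    return min(x) >= 0 and x1 + x2 <= c1 + eps and x1 + x3 <= c2 + eps


def log_utility(x):
    """Sum of log rates (assumed utility; requires all rates > 0)."""
    return sum(math.log(v) for v in x)


def best_x1_for_log_utility(steps=999):
    """Grid search over x1 with both links full, i.e. x2 = x3 = 1 - x1."""
    grid = [k / (steps + 1) for k in range(1, steps + 1)]
    return max(grid, key=lambda x1: log_utility((x1, 1 - x1, 1 - x1)))


candidates = {
    "maximum total rate": (0.0, 1.0, 1.0),      # starves the two-link source
    "equal rates": (0.5, 0.5, 0.5),
    "maximum sum of logs": (1 / 3, 2 / 3, 2 / 3),
}

for name, x in candidates.items():
    print(name, x, "feasible:", feasible(x), "total rate:", sum(x))
print("grid optimum for x1 under log utility:", best_x1_for_log_utility())
```

The grid search lands next to $x_1 = 1/3$, which follows from maximizing $\log x_1 + 2\log(1 - x_1)$; the three candidates show that "best" depends on the objective chosen.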
Optimization Problem
\[ \max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \quad \forall l \in L \qquad \text{("system" problem)} \]
Maximize system utility (note: all sources "equal")
Constraint: bandwidth used is less than capacity
Centralized solution to the optimization is impractical:
  must know all utility functions
  impractical for a large number of sources
Can we view congestion control as distributed asynchronous algorithms to solve this problem?
The user view
User can choose amount to pay per unit time, $w_s$
Would like allocated bandwidth $x_s$ in proportion to $w_s$: $x_s = w_s / p_s$
$p_s$ could be viewed as the charge per unit flow for user $s$
\[ \max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0 \qquad \text{(user problem)} \]
The first term is the user's utility, the second term is the cost
The network view
Suppose the network knows the vector $w_s$ chosen by users
Network wants to maximize a logarithmic utility function
\[ \max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \qquad \text{(network problem)} \]
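Solution existence
There exist prices $p_s$, source rates $x_s$, and amount-to-pay-per-unit-time $w_s = p_s x_s$ such that
  $w_s$ solves the user problem
  $x_s$ solves the network problem
  $x_s$ is the unique solution to the system problem
User problem:
\[ \max \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \quad \text{subject to} \quad w_s \ge 0 \]
Network problem:
\[ \max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \]
System problem:
\[ \max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \quad \forall l \in L \]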
Proportional Fairness
Vector of rates $x_s$ is proportionally fair if it is feasible and, for any other feasible vector $x_s^*$,
\[ \sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0 \]
Result: if $w_s = 1$, then $x_s$ solves the network problem if and only if it is proportionally fair
A similar result exists for the case that $w_s$ is not equal to 1
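The inequality in the definition can be evaluated numerically for a candidate pair of rate vectors. The sketch below is a hypothetical helper, not part of the lecture; it uses a network with three sources in which source 1 shares one unit-capacity link with source 2 and another with source 3.

```python
def proportional_change(x, y):
    """Sum over sources of (y_s - x_s) / x_s, the aggregate proportional change."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))


log_optimal = (1 / 3, 2 / 3, 2 / 3)   # maximizes the sum of log rates here
equal_rates = (0.5, 0.5, 0.5)         # another feasible allocation

# Moving away from the log-optimal point never yields a positive sum ...
print(proportional_change(log_optimal, equal_rates))
# ... whereas the equal-rate point can be improved on in this sense.
print(proportional_change(equal_rates, log_optimal))
```

A positive value for some feasible alternative is enough to show that a vector is not proportionally fair; showing that a vector is proportionally fair requires the inequality for every feasible alternative, which this two-point check does not establish.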
Max-min Fairness
Rates $x_r$ are max-min fair if, for any other feasible rates $y_r$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$
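A common way to compute a max-min fair allocation is progressive filling: raise all rates together and freeze the sources on each link as it fills. The slides do not give an algorithm, so the sketch below is an assumption-laden illustration; `routes`, `capacity`, and the function name are hypothetical.

```python
def max_min_rates(routes, capacity, eps=1e-12):
    """Progressive filling.

    routes:   dict mapping source -> list of links it uses
    capacity: dict mapping link -> capacity
    """
    rates = {s: 0.0 for s in routes}
    residual = dict(capacity)
    active = set(routes)
    while active:
        # Active sources on each link, and the equal share each could still get.
        users = {l: [s for s in active if l in routes[s]] for l in residual}
        shares = {l: residual[l] / len(u) for l, u in users.items() if u}
        if not shares:
            break
        inc = min(shares.values())          # the tightest link sets the step
        for s in active:
            rates[s] += inc
        for l in shares:
            residual[l] -= inc * len(users[l])
        saturated = {l for l in shares if residual[l] <= eps}
        # Sources crossing a saturated link are frozen at their current rate.
        active = {s for s in active if not saturated.intersection(routes[s])}
    return rates


routes = {1: ["a", "b"], 2: ["a"], 3: ["b"]}
print(max_min_rates(routes, {"a": 1.0, "b": 1.0}))
print(max_min_rates(routes, {"a": 1.0, "b": 2.0}))
```

With unequal capacities the source that only uses the larger link keeps growing after the shared bottleneck fills, which is the behaviour the definition describes: no rate can be raised without lowering a rate that is already no larger.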
Minimum potential delay fairness
Rates $x_r$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
Interpretation: if $w_r$ is the file size, then $w_r / x_r$ is the transfer time; the optimization problem is to minimize the sum of transfer delays
Max-min Fairness
Rates $x_r$ are max-min fair if, for any other feasible rates $y_r$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$
What is the corresponding utility function?
\[ U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha} \]
Solving the network problem
Results so far: existence - a solution exists with the given properties
How to compute the solution?
Ideally: a distributed solution easily embodied in a protocol
Should reveal insight into existing protocols
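Solving the network problem
\[ \frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right) \]
where
\[ p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right) \]
Left-hand side: change in bandwidth allocation at $s$
$k\,w_s$ term: linear increase
$k\,x_s(t) \sum_{l \in L(s)} p_l(t)$ term: multiplicative decrease
$p_l(t)$: congestion "signal", a function of the aggregate rate at link $l$, fed back to $s$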
Solving the network problem
\[ \frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right) \]
Results:
  converges to the solution of a "relaxation" of the network problem
  $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
Interpretation: a TCP-like algorithm that iteratively solves for the optimal rate allocation
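The convergence claim can be checked numerically by integrating the rate equation with a forward-Euler step. This is a sketch under assumptions: the slides leave the price function $g_l$ unspecified, so the power form (load / capacity)^beta used here, the gain `k`, the step size, and all names are illustrative choices.

```python
def path_prices(x, routes, capacity, beta=4):
    """q_s = sum over links l in L(s) of p_l, with p_l = (load_l / c_l) ** beta."""
    load = {l: sum(x[s] for s in routes if l in routes[s]) for l in capacity}
    price = {l: (load[l] / capacity[l]) ** beta for l in capacity}
    return {s: sum(price[l] for l in routes[s]) for s in routes}


def simulate_primal(routes, capacity, w, k=0.1, dt=0.01, steps=20000):
    """Euler integration of dx_s/dt = k * (w_s - x_s * q_s)."""
    x = {s: 0.1 for s in routes}
    for _ in range(steps):
        q = path_prices(x, routes, capacity)
        x = {s: max(x[s] + dt * k * (w[s] - x[s] * q[s]), 1e-9) for s in routes}
    return x, path_prices(x, routes, capacity)


routes = {1: ["a", "b"], 2: ["a"], 3: ["b"]}
capacity = {"a": 1.0, "b": 1.0}
w = {1: 1.0, 2: 1.0, 3: 1.0}
x, q = simulate_primal(routes, capacity, w)
for s in routes:
    print(s, round(x[s], 4), round(x[s] * q[s], 6))   # x_s * q_s approaches w_s
```

Because the price function only penalizes load rather than enforcing the capacity constraint, the resulting rates can slightly exceed link capacity, which is one way to read "relaxation" in the result above.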
Source Algorithm
\[ \dot{x}_r = k_r(x_r)\left( U_r'(x_r) - q_r \right) \]
Source needs only its path price
$k_r(\cdot)$: nonnegative, nondecreasing function
Above algorithm converges to the unique solution for any initial condition
$q_r$ interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price $q_r$ ($/bit), where $q_r = \sum p_l$ (sum of the link prices along the path)
User's strategy: spend $w_r$ ($/sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
\[ x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}} \]
then
\[ U(x) = -\frac{1}{Tx} \]
interpretation: minimize (weighted) delay
Is AIMD special?
Consider a window control as follows:
  cwnd += a * cwnd^n when no loss
  cwnd -= b * cwnd^m when loss
where n < m
Expected change in congestion window
Expected change in rate per unit time
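The balance between the two update rules can be illustrated numerically. The sketch below assumes, beyond what the slide states, that the updates are applied once per ACK and that each ACK independently signals loss with probability `p`; the function names are hypothetical.

```python
def equilibrium_window(a, b, n, m, p):
    """Window at which the mean increase (1-p)*a*w**n equals the mean decrease p*b*w**m."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))


def iterate_mean_window(a, b, n, m, p, w0=10.0, step=0.1, rounds=20000):
    """Follow the expected per-ACK drift of the window from w0 (deterministic)."""
    w = w0
    for _ in range(rounds):
        drift = (1.0 - p) * a * w ** n - p * b * w ** m
        w = max(w + step * drift, 1e-6)
    return w


# AIMD expressed per ACK: +1/cwnd without loss (n = -1), -cwnd/2 on loss (m = 1).
print(equilibrium_window(a=1.0, b=0.5, n=-1, m=1, p=0.01))
print(iterate_mean_window(a=1.0, b=0.5, n=-1, m=1, p=0.01))
```

For these AIMD parameters the balance point is $\sqrt{2(1-p)/p}$, and the exponent $1/(m-n)$ shows why the rule requires n < m for the balance point to decrease as the loss probability grows.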
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n. For stability
Motivation
Congestion Control: maximize user utility
  given routing $R_{li}$, how to adapt end rate $x_i$?
Traffic Engineering: minimize network congestion
  given traffic $x_i$, how to perform routing $R_{li}$?
Congestion Control Model
\[ \max \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l, \qquad \text{var. } x \]
Objective: aggregate utility, with source rate $x_i$ and utility $U_i(x_i)$
Constraints: capacity constraints
Users are indexed by $i$
Congestion control provides fair rate allocation amongst users
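Traffic Engineering Model
\[ \min \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l, \qquad \text{var. } R \]
Objective: aggregate cost, with link utilization $u_l$ and cost $f(u_l)$
Links are indexed by $l$
Traffic engineering avoids bottlenecks in the network
[Figure: cost $f(u_l)$ vs. link utilization $u_l$, with $u_l = 1$ marked]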
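The utilization constraint and the aggregate cost are easy to evaluate for a given routing matrix. The following is a minimal sketch: the routing matrix, rates, and the cost function $f(u) = u / (1 - u)$ are assumptions chosen for illustration, since the slides do not specify $f$.

```python
def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [sum(r_li * x_i for r_li, x_i in zip(row, x)) / c_l
            for row, c_l in zip(R, c)]


def aggregate_cost(u, f):
    """Sum of the per-link costs f(u_l)."""
    return sum(f(u_l) for u_l in u)


def example_cost(u):
    """Assumed convex cost that blows up as utilization approaches 1."""
    return u / (1.0 - u)


R = [[1, 1, 0],    # link 0 carries users 0 and 1
     [1, 0, 1]]    # link 1 carries users 0 and 2
x = [0.2, 0.5, 0.3]
c = [1.0, 1.0]

u = link_utilization(R, x, c)
print(u, aggregate_cost(u, example_cost))
```

Traffic engineering would vary `R` for fixed `x` to reduce this cost, while congestion control varies `x` for fixed `R`.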
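Model of Internet Reality
Congestion control and traffic engineering interact through $x_i$ and $R_{li}$:
Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$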
System Properties
Convergence
Does it achieve some objective?
Benchmark: utility gap between the joint system and the benchmark
\[ \max \sum_i U_i(x_i) \quad \text{s.t.} \quad Rx \le c, \qquad \text{var. } x, R \]
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Why is randomization important?
Synchronization of periodic routing updates
Periodic losses observed in end-end Internet traffic
Source: Floyd, Jacobson 1994
Router update operation
  on timeout or link failure: prepare own routing update (time TC)
  receive update from neighbor, process (time TC2)
  wait
  receive update from neighbor, process
  <ready>: send update (time Td to arrive at dest)
  start_timer (uniform: Tp +- Tr)
Time spent in a state depends on msgs received from others (weak coupling between routers' processing)
Router synchronization
20 (simulated) routers broadcasting updates to each other
x-axis: time until routing update sent, relative to start of round
By t = 100000, all router rounds are of length 120
Synchronization, or lack thereof, depends on system parameters
Avoiding synchronization
Choose random timer component Tr large (e.g., several multiples of TC)
  prepare own routing update (time TC)
  receive update from neighbor, process (time TC2)
  wait
  receive update from neighbor, process
  <ready>: send update (time Td to arrive)
  start_timer (uniform: Tp +- Tr)
Add enough randomization to avoid synchronization
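The start_timer (uniform: Tp +- Tr) step can be sketched as drawing each router's next timer interval uniformly around the nominal period. The values used for Tp and Tr below, and the function name, are hypothetical and only serve the illustration.

```python
import random


def next_update_delay(tp, tr, rng=random):
    """Timer interval drawn uniformly from [tp - tr, tp + tr]."""
    return rng.uniform(tp - tr, tp + tr)


rng = random.Random(653)      # seeded only to make the run repeatable
tp, tr = 120.0, 10.0          # assumed nominal period and random component

# With tr = 0 every router picks the same interval, so routers that become
# aligned stay aligned; with tr > 0 their next updates are spread out.
print([next_update_delay(tp, 0.0, rng) for _ in range(3)])
print([round(next_update_delay(tp, tr, rng), 1) for _ in range(3)])
```

How large Tr has to be relative to TC to break up synchronization is the subject of the slide above; this sketch only shows the mechanism, not that threshold.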
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport TCP Nice
What are background transfers
Data that humans are not waiting for Non-deadline-critical Unlimited demand
Examples Prefetched traffic on the Web File system backup Large-scale data distribution services Background software updates Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers Self-interference
bull applications hurt their own performance Cross-interference
bull applications hurt other applicationsrsquo performance
TCP Nice
Goal abstraction of free infinite bandwidth Applications say what they want
OS manages resources and scheduling
Self tuning transport layer Reduces risk of interference with foreground
traffic Significant utilization of spare capacity by
background traffic Simplifies application design
Why change TCP
TCP does network resource management Need flow prioritization
Alternative router prioritization + More responsive simple one bit priority Hard to deploy
Question Can end-to-end congestion control achieve non-
interference and utilization
TCP Nice
Proactively detects congestion
Uses increasing RTT as congestion signal Congestion incr queue lengths incr RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control Receiver and network unchanged TCP friendly
TCP Nice
Basic algorithm 1 Early Detection thresh queue length incr in RTT 2 Multiplicative decrease on early congestion 3 Allow cwnd lt 10 (despite no loss)
per-ack operation if(curRTT gt minRTT + threshold(maxRTT ndash minRTT)) numCong++
per-round operation if(numCong gt fW) W W2 else hellip AIMD congestion control
Nice the works
Non-interference getting out of the way in time Utilization maintaining a small queue
pkts
minRTT = τ13 maxRTT = τ+Βmicro13
B
tB Add Mul +
micro
Reno
Nice Add Add Add
Mul +
Mul +
Network Conditions
01
1
10
100
1e3
1 10 100 Fore
grou
nd D
ocum
ent L
aten
cy (s
ec)
Spare Capacity
Reno
Vegas
V0
Nice
Router Prio
Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity
Scalability
01
1
10
100
1e3
1 10 100
Doc
umen
t Lat
ency
(sec
)
Num BG flows
Vegas
V0
Nice
Router Prio
Reno
W lt 1 allows Nice to scale to any number of background flows
Utilization
0
2e4
4e4
6e4
8e4
1 10 100
BG
Thr
ough
put (
KB
)
Num BG flows
Router Prio
Vegas
V0
Reno
Nice
Nice utilizes 50-80 of spare capacity wo stealing any bandwidth from FG
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation
How to model the interaction between TCP and the network Recall PFTK like models assumed network
conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem How to allocate resources (eg bandwidth) to
optimize some objective function Maybe not possible to obtain exact optimality but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
c1 c2
Model Network Links l each of capacity cl Sources s (L(s) Us(xs))
L(s) - links used by source s Us(xs) - utility if source rate = xs
x1
x2 x3
121 cxx le+ 231 cxx le+
Us(xs)
xs
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose network knows vector ws chosen by users Network wants to maximize logarithmic utility function
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
network problem
Solution existence
There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that Ws solves user
problem Xs solves the
network problem Xs is the unique
solution to the system problem
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
0 wsubject to
w Umax
s
ss
ge
minus⎟⎟⎠
⎞⎜⎜⎝
⎛s
s
wp
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0
Proportional Fairness
Vector of rates xs proportionally fair if feasible and for any other feasible vector xs
0
leminus
sumisinSs s
ss
xxx
Result if wr=1 then Xs solves the network problem IFF it is proportionally fair
Similar result exists for the case that wr not equal 1
Max-min Fairness
Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
Minimum potential delay fairness
Rates xr are minimum potential delay fair if Ur (xr) = -wrxr
Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays
Max-min Fairness
rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
What is corresponding utility function
α
α
α minus=
minus
infinrarr 1lim)(
1r
rrxxU
Solving the network problem Results so far existence - solution exists
with given properties How to compute solution
Ideally distributed solution easily embodied in protocol
Should reveal insight into existing protocol
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
congestion ldquosignalrdquo function of aggregate rate at link l fed back to s
change in bandwidth
allocation at s
linear increase
multiplicative decrease
⎟⎟⎠
⎞⎜⎜⎝
⎛= sum
isin
)()()(txgtp
sLlsllwhere
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
Results converges to solution of relaxation of network
problem xs(t)Σpl(t) converges to ws
Interpretation TCP-like algorithm to iteratively solves optimal rate allocation
Source Algorithm
Source needs only its path price:
  $\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$
k_r(\cdot): nonnegative, nondecreasing function
The above algorithm converges to the unique solution for any initial condition
q_r is interpreted as the loss/marking probability
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
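The two formulas on this slide did not survive extraction. As a sketch only, assume the weighted logarithmic utility U_r(x_r) = w_r log x_r and the gain k_r(x_r) = \kappa_r x_r (both are assumptions, not recovered from the slide). Substituting into the source algorithm \dot{x}_r = k_r(x_r)(U_r'(x_r) - q_r) gives:

```latex
\begin{align*}
U_r(x_r) &= w_r \log x_r \quad\Rightarrow\quad U_r'(x_r) = \frac{w_r}{x_r},\\
\dot{x}_r &= \kappa_r x_r \left( \frac{w_r}{x_r} - q_r \right)
           = \kappa_r \left( w_r - x_r q_r \right).
\end{align*}
```

Under these assumptions the controller has the same form as the earlier rate update k (w_s - x_s \sum_l p_l), with q_r playing the role of the summed link prices.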
Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price q_r ($/bit), where q_r = \sum_l p_l over the links l on route r
User's strategy: spend w_r ($/sec) to maximize
Optimal User Strategy
equivalently
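The objective and the optimality condition on these two slides were lost in extraction. Assuming the objective has the same form as the user problem stated earlier (utility of the rate bought, minus the amount spent), the first-order condition works out as below. This is a reconstruction under that assumption, not the slide's own text.

```latex
\begin{align*}
&\max_{w_r \ge 0} \; U_r\!\left(\frac{w_r}{q_r}\right) - w_r
  &&\text{(rate bought: } x_r = w_r / q_r\text{)}\\
&\frac{1}{q_r}\, U_r'\!\left(\frac{w_r}{q_r}\right) - 1 = 0
  &&\text{(stationarity in } w_r\text{)}\\
&U_r'(x_r) = q_r
  &&\text{(the same condition, written in terms of the rate)}
\end{align*}
```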
Simplified TCP-Reno
Suppose
  $x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$
then (up to a flow-dependent constant)
  $U(x) = -\frac{1}{T x}$
Interpretation: minimize (weighted) delay
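To see where a utility that is inverse in the rate comes from, one can invert the approximate throughput formula for p and treat it as the marginal utility, U'(x) = p, which is the equilibrium condition of the source algorithm. This is a sketch under the small-p approximation; the constant it produces may differ from the normalization used on the slide.

```latex
\begin{align*}
x \approx \frac{\sqrt{2}}{T\sqrt{p}}
  \quad&\Rightarrow\quad p \approx \frac{2}{T^2 x^2},\\
U'(x) = p
  \quad&\Rightarrow\quad U(x) \approx -\frac{2}{T^2 x} + \text{const}.
\end{align*}
```

Maximizing a sum of such utilities is the same as minimizing a weighted sum of 1/x terms, which is the weighted-delay interpretation.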
Is AIMD special
Consider a window control as follows:
  cwnd += a * cwnd^n when no loss
  cwnd -= b * cwnd^m when loss
where n < m
Expected change in congestion window
Expected change in rate per unit time
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n
For stability
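The slide's expressions for the expected change and the equilibrium did not survive extraction. A minimal sketch of one natural reading, assuming each ACK independently signals loss with probability p (this per-ACK loss model and both function names are assumptions):

```python
import math


def expected_window_change(w, p, a, b, n, m):
    """Expected per-ACK change: +a*w^n with prob. (1 - p), -b*w^m with prob. p."""
    return (1 - p) * a * w ** n - p * b * w ** m


def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero; requires n < m."""
    if not n < m:
        raise ValueError("need n < m")
    return (a * (1 - p) / (b * p)) ** (1.0 / (m - n))


# AIMD as a special case: +1/w per ACK (a=1, n=-1), halve on loss (b=0.5, m=1).
p = 0.02
w_eq = equilibrium_window(p, a=1.0, b=0.5, n=-1, m=1)
print(w_eq, math.sqrt(2 * (1 - p) / p))
```

With the AIMD parameters the equilibrium window equals sqrt(2(1-p)/p), so dividing by the round-trip time T recovers the simplified TCP-Reno rate formula; the exponent 1/(m - n) is where α = m - n enters.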
Motivation
Congestion Control: maximize user utility
  Given routing R_{li}, how to adapt end rate x_i?
Traffic Engineering: minimize network congestion
  Given traffic x_i, how to perform routing R_{li}?
Congestion Control Model
$\max_x \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
  variable: source rate x_i
  objective: aggregate utility, the sum of the utilities U_i(x_i)
  constraints: capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
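$\min_R \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$
  variable: routing R
  link utilization: u_l
  objective: aggregate cost, the sum of the costs f(u_l)
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
[Figure: cost f(u_l) versus link utilization u_l, with u_l = 1 marked]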
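The utilization formula is easy to evaluate for a given routing matrix. In the sketch below the cost function u / (1 - u) is an assumed stand-in for f (the slide does not say which f is used), and the function names are illustrative.

```python
def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [sum(r_li * x_i for r_li, x_i in zip(row, x)) / c_l
            for row, c_l in zip(R, c)]


def assumed_cost(u):
    """Illustrative convex cost that grows without bound as u approaches 1."""
    return float("inf") if u >= 1 else u / (1 - u)


def aggregate_cost(R, x, c, f=assumed_cost):
    return sum(f(u_l) for u_l in link_utilization(R, x, c))


R = [[1, 1, 0],    # link 0 carries users 0 and 1
     [1, 0, 1]]    # link 1 carries users 0 and 2
x = [0.2, 0.4, 0.6]
c = [1.0, 2.0]
print(link_utilization(R, x, c), aggregate_cost(R, x, c))
```

Traffic engineering would search over R for the routing with the lowest aggregate cost while x stays fixed; the code only evaluates one candidate.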
Model of Internet Reality
Congestion control: $\max_x \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$ (sets x_i, given R_{li})
Traffic engineering: $\min_R \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$ (sets R_{li}, given x_i)
System Properties
Convergence
Does it achieve some objective?
Benchmark: $\max_{x, R} \sum_i U_i(x_i)$ s.t. $R x \le c$
Utility gap between the joint system and the benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Avoiding synchronization
Choose the random timer component Tr large (e.g., several multiples of TC)
[Figure: router state diagram with these labels:
  prepare own routing update (time TC)
  receive update from neighbor, process (time TC2)
  wait
  receive update from neighbor, process
  <ready>: send update (time Td to arrive), start_timer (uniform Tp +- Tr)]
Add enough randomization to avoid synchronization
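The start_timer step can be sketched as a uniform draw around the nominal period. The function name and the argument check are assumptions; Tp and Tr follow the slide.

```python
import random


def next_timer(tp, tr, rng=random):
    """Draw the next update interval uniformly from [tp - tr, tp + tr]."""
    if tr < 0 or tr > tp:
        raise ValueError("need 0 <= tr <= tp")
    return rng.uniform(tp - tr, tp + tr)


rng = random.Random(1)
# Nominal period 30 s with a +-15 s random component (illustrative values).
intervals = [next_timer(30.0, 15.0, rng) for _ in range(5)]
print(intervals)
```

A larger Tr spreads the routers' timers apart faster than the processing delays can pull them together, which is why the slide asks for Tr to be several multiples of TC.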
Randomization
Takeaway message: randomization makes a system simple and robust
Background transport: TCP Nice
What are background transfers?
Data that humans are not waiting for
Non-deadline-critical
Unlimited demand
Examples:
  Prefetched traffic on the Web
  File system backup
  Large-scale data distribution services
  Background software updates
  Media file sharing
Desired Properties
Utilization of spare network capacity
No interference with regular transfers
  Self-interference: applications hurt their own performance
  Cross-interference: applications hurt other applications' performance
TCP Nice
Goal: abstraction of free, infinite bandwidth
  Applications say what they want
  OS manages resources and scheduling
Self-tuning transport layer
  Reduces risk of interference with foreground traffic
  Significant utilization of spare capacity by background traffic
  Simplifies application design
Why change TCP?
TCP does network resource management; need flow prioritization
Alternative: router prioritization
  + More responsive, simple one-bit priority
  - Hard to deploy
Question: can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion
  Uses increasing RTT as the congestion signal
  Congestion -> increased queue lengths -> increased RTT
Aggressive responsiveness to congestion
Only modifies sender-side congestion control
  Receiver and network unchanged
  TCP friendly
TCP Nice
Basic algorithm:
  1. Early detection: threshold queue length -> increase in RTT
  2. Multiplicative decrease on early congestion
  3. Allow cwnd < 1.0 (despite no loss)
per-ack operation:
  if (curRTT > minRTT + threshold * (maxRTT - minRTT))
    numCong++
per-round operation:
  if (numCong > f * W)
    W = W / 2
  else
    ... AIMD congestion control
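The per-ack and per-round operations can be put together in a small sketch. This is not the TCP Nice implementation: the class name, the default threshold and fraction values, and the +1 additive increase standing in for the regular AIMD branch are assumptions, and loss handling is left out.

```python
class NiceWindow:
    """Sketch of the Nice window update; names and defaults are illustrative."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold   # position between minRTT and maxRTT
        self.fraction = fraction     # f: share of the window that must look congested
        self.cwnd = cwnd             # W, in packets; allowed to drop below 1.0
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack operation: count ACKs whose RTT signals a building queue."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round operation: halve on early congestion, else grow additively."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0
        else:
            self.cwnd += 1.0   # stand-in for the regular AIMD increase
        self.num_cong = 0


nice = NiceWindow(cwnd=4.0)
for rtt in [0.1] * 4:        # quiet round: RTT stays at its minimum
    nice.on_ack(rtt)
nice.on_round_end()
for rtt in [0.3] * 5:        # queue builds: every ACK exceeds the threshold
    nice.on_ack(rtt)
nice.on_round_end()
print(nice.cwnd)
```

Because the halving is not floored at one packet, repeated congested rounds push the window below 1.0, which corresponds to item 3 of the basic algorithm.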
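Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = τ, maxRTT = τ + B/μ
[Figure: queue length (pkts) versus time for Reno and Nice, with buffer size B and service rate μ, showing additive-increase (Add) and multiplicative-decrease (Mul) phases]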
Mul +
Network Conditions
01
1
10
100
1e3
1 10 100 Fore
grou
nd D
ocum
ent L
aten
cy (s
ec)
Spare Capacity
Reno
Vegas
V0
Nice
Router Prio
Nice causes low interference to foreground Web traffic even when there isnrsquot much spare capacity
Scalability
[Figure: document latency (sec, log scale from 0.1 to 1e3) versus number of background (BG) flows (log scale from 1 to 100) for Vegas, V0, Nice, Router Prio, and Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: BG throughput (KB, from 0 to 8e4) versus number of BG flows (log scale from 1 to 100) for Router Prio, Vegas, V0, Reno, and Nice]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from foreground (FG) traffic
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as an optimization problem
  How to allocate resources (e.g., bandwidth) to optimize some objective function
Maybe not possible to obtain exact optimality, but...
  optimization framework as a means to explicitly steer the network towards a desirable operating point
  practical congestion control as distributed asynchronous implementations of an optimization algorithm
  systematic approach towards protocol design
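Model
Network: links l, each of capacity c_l
Sources s: (L(s), U_s(x_s))
  L(s): links used by source s
  U_s(x_s): utility if source rate = x_s
[Figure: two links with capacities c_1 and c_2; source 1 (rate x_1) uses both links, source 2 (rate x_2) uses the first, source 3 (rate x_3) uses the second]
  $x_1 + x_2 \le c_1$, $x_1 + x_3 \le c_2$
[Figure: example utility function U_s(x_s) versus x_s for an elastic application]
Q: What are possible allocations with, say, unit-capacity links?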
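One way to explore the question is to test a few candidate rate vectors against the two capacity constraints. The candidates and their labels below are illustrative choices, not answers given on the slide.

```python
ROUTES = [(0, 1), (0,), (1,)]   # source 0 uses both links; sources 1 and 2 use one each
CAPACITY = [1.0, 1.0]           # unit-capacity links, as in the question


def link_loads(x):
    """Aggregate rate on each link for the rate vector x."""
    load = [0.0] * len(CAPACITY)
    for rate, links in zip(x, ROUTES):
        for l in links:
            load[l] += rate
    return load


def is_feasible(x, eps=1e-12):
    """True if rates are nonnegative and x_1 + x_2 <= c_1, x_1 + x_3 <= c_2 hold."""
    return all(r >= 0 for r in x) and all(
        load <= cap + eps for load, cap in zip(link_loads(x), CAPACITY))


candidates = {
    "starve the two-link source": [0.0, 1.0, 1.0],
    "equal rates": [0.5, 0.5, 0.5],
    "favor the one-link sources": [1 / 3, 2 / 3, 2 / 3],
    "overloaded": [0.6, 0.6, 0.3],
}
for name, x in candidates.items():
    print(name, is_feasible(x), round(sum(x), 3))
```

The first three vectors are all feasible but trade total throughput against the rate of the two-link source, which is why an explicit objective (a utility function) is needed to choose among them.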
Optimization Problem
"System" problem:
  $\max_{x_s \ge 0} \sum_{s \in S} U_s(x_s)$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
  must know all utility functions
  impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
The user view
User can choose amount to pay per unit time, w_s
Would like allocated bandwidth x_s in proportion to w_s:
  $x_s = \frac{w_s}{p_s}$
p_s could be viewed as the charge per unit flow for user s
User problem:
  $\max_{w_s} U_s\left(\frac{w_s}{p_s}\right) - w_s$ subject to $w_s \ge 0$
  first term: user's utility; second term: cost
The network view
Suppose the network knows the vector {w_s} chosen by users
Network wants to maximize a logarithmic utility function
Network problem:
  $\max_{x \ge 0} \sum_s w_s \log x_s$ subject to $\sum_{s \in S(l)} x_s \le c_l, \ \forall l \in L$
What is TCP optimizing
How does TCP allocate network resources
Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation
How to model the interaction between TCP and the network Recall PFTK like models assumed network
conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem How to allocate resources (eg bandwidth) to
optimize some objective function Maybe not possible to obtain exact optimality but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
c1 c2
Model Network Links l each of capacity cl Sources s (L(s) Us(xs))
L(s) - links used by source s Us(xs) - utility if source rate = xs
x1
x2 x3
121 cxx le+ 231 cxx le+
Us(xs)
xs
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose the network knows the vector $\{w_s\}$ chosen by the users. The network wants to maximize a logarithmic utility function:
\[ \max_{x_s \ge 0} \; \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \]
(network problem)
Solution existence
There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that:
$w_s$ solves the user problem,
\[ \max_{w_s \ge 0} \; U_s\!\left(\frac{w_s}{p_s}\right) - w_s \]
$x_s$ solves the network problem,
\[ \max_{x_s \ge 0} \; \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \]
$x_s$ is the unique solution to the system problem,
\[ \max_{x_s \ge 0} \; \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \quad \forall l \in L \]
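The decomposition can be checked numerically in a special case. The sketch below assumes a single link of capacity `c` shared by sources with utilities $U_s(x) = a_s \log x$ (an assumption made here for tractability, not a case worked in the lecture). For this case the system problem has the closed form $x_s = a_s c / \sum a$, with price $p = \sum a / c$; the user problem is solved by brute-force search over a grid of payments.

```python
import math

def system_rates(a, c):
    """Closed-form system optimum for U_s(x) = a_s*log(x) on one link."""
    total = sum(a)
    return [a_s * c / total for a_s in a]

def network_rates(w, c):
    """Closed-form network optimum: maximize sum w_s*log(x_s) on one link."""
    total = sum(w)
    return [w_s * c / total for w_s in w]

def user_best_payment(a_s, price, grid=20000, w_max=10.0):
    """Brute-force the user problem: maximize a_s*log(w/price) - w over a grid."""
    best_w, best_val = None, -math.inf
    for i in range(1, grid + 1):
        w = w_max * i / grid
        val = a_s * math.log(w / price) - w
        if val > best_val:
            best_w, best_val = w, val
    return best_w

a = [1.0, 2.0, 3.0]          # hypothetical utility weights
c = 12.0                     # hypothetical link capacity
price = sum(a) / c           # single-link price (same for every source here)

x_system = system_rates(a, c)
w = [user_best_payment(a_s, price) for a_s in a]
x_network = network_rates(w, c)
print(x_system, w, x_network)
```

In this special case each user's best payment equals its weight $a_s$, and feeding those payments to the network problem reproduces the system-optimal rates, with $w_s = p_s x_s$.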
Proportional Fairness
A vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s'\}$,
\[ \sum_{s \in S} \frac{x_s' - x_s}{x_s} \le 0 \]
Result: if $w_s = 1$ for all $s$, then $\{x_s\}$ solves the network problem if and only if it is proportionally fair.
A similar result exists for the case that $w_s \ne 1$.
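A quick numerical check of the definition, as a sketch: on the two-link, three-source example with unit capacities, the allocation $(1/3, 2/3, 2/3)$ maximizes $\sum_s \log x_s$, so the proportional-fairness sum should be non-positive against every feasible alternative. The random sampling scheme and function names below are illustrative assumptions.

```python
import random

def pf_sum(y, x):
    """Sum over sources of (y_s - x_s) / x_s, the proportional-fairness test."""
    return sum((ys - xs) / xs for ys, xs in zip(y, x))

def random_feasible(rng):
    """Sample rates with y1 + y2 <= 1 and y1 + y3 <= 1 (unit-capacity links)."""
    y1 = rng.uniform(0.0, 1.0)
    y2 = rng.uniform(0.0, 1.0 - y1)
    y3 = rng.uniform(0.0, 1.0 - y1)
    return (y1, y2, y3)

x_star = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)   # candidate proportionally fair rates
x_equal = (0.5, 0.5, 0.5)                    # equal-rate allocation

rng = random.Random(0)
worst = max(pf_sum(random_feasible(rng), x_star) for _ in range(10000))
print("largest sum against x_star:", worst)

# The equal-rate allocation fails the test: moving to x_star gives a positive sum.
print("x_star against x_equal:", pf_sum(x_star, x_equal))
```

The second printed value is positive, so the equal-rate allocation is feasible but not proportionally fair on this topology.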
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$ for some $s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$.
Minimum potential delay fairness
Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$.
Interpretation: if $w_r$ is the file size, then $w_r / x_r$ is the transfer time, and the optimization problem is to minimize the sum of transfer delays.
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$ for some $s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$.
What is the corresponding utility function?
\[ U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha} \]
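The family $x^{1-\alpha}/(1-\alpha)$ can be explored numerically on the two-link, three-source example with unit capacities. The sketch below assumes the single-link sources take all leftover capacity ($x_2 = x_3 = 1 - x_1$, which holds because the utilities are increasing) and finds $x_1$ by bisection on the first-order condition $x_1^{-\alpha} = 2 (1 - x_1)^{-\alpha}$. The function name and the chosen values of $\alpha$ are illustrative.

```python
def alpha_fair_x1(alpha, iters=200):
    """Rate of the two-link source under alpha-fairness, by bisection.

    Solves x1**(-alpha) - 2*(1 - x1)**(-alpha) = 0 on (0, 1); the left side
    is decreasing in x1. Intended for moderate alpha (very large alpha would
    overflow floating point).
    """
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g = mid ** (-alpha) - 2.0 * (1.0 - mid) ** (-alpha)
        if g > 0.0:
            lo = mid      # marginal utility of x1 still too high: increase x1
        else:
            hi = mid
    return 0.5 * (lo + hi)

for alpha in (1.0, 2.0, 10.0, 50.0):
    x1 = alpha_fair_x1(alpha)
    print(f"alpha={alpha:5.1f}: x1={x1:.4f}, x2=x3={1.0 - x1:.4f}")
```

With $\alpha = 1$ the result is the proportionally fair split, $\alpha = 2$ corresponds to minimum potential delay with unit weights, and as $\alpha$ grows $x_1$ approaches $1/2$, the equal-rate (max-min fair) allocation for this topology.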
Solving the network problem
Results so far: existence, i.e., a solution exists with the given properties.
How to compute the solution?
Ideally: a distributed solution easily embodied in a protocol.
It should reveal insight into existing protocols.
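Solving the network problem
\[ \frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right) \]
where
\[ p_l(t) = g_l\!\left( \sum_{s \in S(l)} x_s(t) \right) \]
The left-hand side is the change in bandwidth allocation at $s$.
The term $k\, w_s$ gives linear increase; the term $k\, x_s(t) \sum_{l \in L(s)} p_l(t)$ gives multiplicative decrease.
$p_l(t)$ is the congestion "signal": a function of the aggregate rate at link $l$, fed back to $s$.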
Solving the network problem
\[ \frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right) \]
Results: converges to the solution of a relaxation of the network problem; $x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$.
Interpretation: a TCP-like algorithm that iteratively solves for the optimal rate allocation.
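The dynamics above can be simulated with a simple forward-Euler discretization. The sketch below uses the two-link, three-source example with unit capacities and $w_s = 1$. The price function $g_l(y) = (y / c_l)^{\beta}$, the gain `k`, the step size, and the initial rates are all assumptions chosen for illustration; the lecture does not specify them.

```python
def simulate(w=(1.0, 1.0, 1.0), c=(1.0, 1.0), k=1.0, beta=10,
             dt=0.01, steps=20000):
    """Euler simulation of dx_s/dt = k * (w_s - x_s * sum of link prices)."""
    routes = ((0, 1), (0,), (1,))     # links used by each source: L(s)
    x = [0.1] * len(routes)           # arbitrary initial rates
    q = [0.0] * len(routes)
    for _ in range(steps):
        # Aggregate rate on each link.
        load = [0.0] * len(c)
        for s, links in enumerate(routes):
            for l in links:
                load[l] += x[s]
        # Assumed price function: grows steeply as load approaches capacity.
        p = [(load[l] / c[l]) ** beta for l in range(len(c))]
        # Path price seen by each source.
        q = [sum(p[l] for l in links) for links in routes]
        # Linear increase (k*w_s) and multiplicative decrease (k*x_s*q_s).
        x = [x[s] + dt * k * (w[s] - x[s] * q[s]) for s in range(len(routes))]
    return x, q

x, q = simulate()
for s in range(3):
    print(f"source {s + 1}: rate={x[s]:.4f}, rate*price={x[s] * q[s]:.4f}")
```

At the fixed point each source's rate times its path price equals $w_s$. Because the price here is a smooth penalty rather than a hard constraint, the link load at equilibrium sits slightly above capacity, which is one way to see why the result is a solution of a relaxation of the network problem.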
Source Algorithm
\[ \dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right) \]
The source needs only its path price $q_r$.
$k_r(\cdot)$ is a nonnegative, nondecreasing function.
The above algorithm converges to the unique solution for any initial condition.
$q_r$ is interpreted as the loss/marking probability.
Proportionally-Fair Controller
If the utility function is $U_r(x_r) = w_r \log x_r$,
then a controller that implements it is given by $\dot{x}_r = k_r \left( w_r - x_r q_r \right)$.
Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price $q_r$ (\$/bit), where $q_r = \sum_{l \in L(r)} p_l$.
User's strategy: spend $w_r$ (\$/sec) to maximize $U_r(w_r / q_r) - w_r$.
Optimal User Strategy
equivalently
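Simplified TCP-Reno
Suppose
\[ x = \frac{\sqrt{2(1-p)}}{T \sqrt{p}} \approx \frac{\sqrt{2}}{T \sqrt{p}} \]
Then, setting $U'(x) = p \approx 2 / (T^2 x^2)$,
\[ U(x) = -\frac{2}{T^2 x} \]
Interpretation: minimize (weighted) delay.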
Is AIMD special?
Consider a window control as follows: cwnd += a·cwnd^n when there is no loss; cwnd -= b·cwnd^m when there is a loss, where n < m.
Expected change in congestion window
Expected change in rate per unit time
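One way to reason about this family of controllers is sketched below. It assumes the update is applied per ACK and that each ACK independently signals a loss with probability `p`; under those assumptions (which are made here and are not taken from the slides) the expected per-ACK change is $(1-p)\, a\, W^n - p\, b\, W^m$, and setting it to zero gives $W^{m-n} = a(1-p)/(b\,p)$.

```python
def expected_change(cwnd, p, a, b, n, m):
    """Expected per-ACK window change under the assumed independent-loss model."""
    return (1.0 - p) * a * cwnd ** n - p * b * cwnd ** m

def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero (requires n < m, 0 < p < 1)."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))

p = 0.01
# AIMD expressed per ACK: cwnd += 1/cwnd on success, cwnd -= cwnd/2 on loss,
# i.e. a = 1, n = -1, b = 1/2, m = 1.
w_aimd = equilibrium_window(p, a=1.0, b=0.5, n=-1, m=1)
print(f"AIMD equilibrium window at p={p}: {w_aimd:.2f}")
print("expected change there:", expected_change(w_aimd, p, 1.0, 0.5, -1, 1))
```

For the AIMD parameters the equilibrium window equals $\sqrt{2(1-p)/p}$, which matches the window implied by the simplified TCP-Reno rate formula $x = \sqrt{2(1-p)} / (T \sqrt{p})$ after multiplying by $T$.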
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where $\alpha = m - n$. For stability
Motivation
Congestion control: maximize user utility. Given routing $R_{li}$, how to adapt the end rate $x_i$?
Traffic engineering: minimize network congestion. Given traffic $x_i$, how to perform routing $R_{li}$?
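Congestion Control Model
\[ \max_{x} \; \sum_i U_i(x_i) \quad \text{subject to} \quad \sum_i R_{li} x_i \le c_l \]
Variable: source rates $x_i$. The objective is the aggregate utility, with utility $U_i(x_i)$ per user; the constraints are the capacity constraints.
Users are indexed by $i$.
Congestion control provides fair rate allocation amongst users.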
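Traffic Engineering Model
\[ \min_{R} \; \sum_l f(u_l) \quad \text{subject to} \quad u_l = \frac{\sum_i R_{li} x_i}{c_l} \]
Variable: routing $R$. $u_l$ is the link utilization, $f(u_l)$ is the link cost, and the objective is the aggregate cost.
Links are indexed by $l$.
Traffic engineering avoids bottlenecks in the network.
[Figure: cost $f(u_l)$ versus link utilization $u_l$, with $u_l = 1$ marked.]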
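The quantities in the traffic engineering model can be computed directly for a small instance. In the sketch below the routing matrix, rates, and capacities are made-up numbers, and the cost $f(u) = 1/(1-u)$ is a hypothetical convex choice that blows up as utilization approaches 1; the lecture does not specify $f$.

```python
def link_utilization(R, x, c):
    """u_l = (sum_i R[l][i] * x[i]) / c[l] for every link l."""
    return [sum(R[l][i] * x[i] for i in range(len(x))) / c[l]
            for l in range(len(c))]

def cost(u):
    """Hypothetical link cost: convex, unbounded as u approaches 1."""
    return float("inf") if u >= 1.0 else 1.0 / (1.0 - u)

# R[l][i] = 1 if user i's traffic crosses link l (two links, three users).
R = [[1, 1, 0],
     [1, 0, 1]]
x = [0.2, 0.5, 0.3]   # source rates (illustrative)
c = [1.0, 1.0]        # link capacities (illustrative)

u = link_utilization(R, x, c)
aggregate_cost = sum(cost(ul) for ul in u)
print("utilization per link:", u)
print("aggregate cost:", aggregate_cost)
```

Congestion control would treat `R` as fixed and adjust `x`; traffic engineering treats `x` as fixed and searches over `R` to reduce the aggregate cost.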
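Model of Internet Reality
[Figure: congestion control and traffic engineering coupled in a loop through the rates $x_i$ and the routing $R_{li}$.]
Congestion control: $\max \sum_i U_i(x_i)$ subject to $\sum_i R_{li} x_i \le c_l$.
Traffic engineering: $\min \sum_l f(u_l)$ subject to $u_l = \sum_i R_{li} x_i / c_l$.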
System Properties
Convergence?
Does it achieve some objective?
Benchmark: the utility gap between the joint system and the benchmark problem
\[ \max_{x, R} \; \sum_i U_i(x_i) \quad \text{subject to} \quad R x \le c \]
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Desired Properties
Utilization of spare network capacity.
No interference with regular transfers:
Self-interference: applications hurt their own performance.
Cross-interference: applications hurt other applications' performance.
TCP Nice
Goal: abstraction of free, infinite bandwidth.
Applications say what they want; the OS manages resources and scheduling.
Self-tuning transport layer:
Reduces the risk of interference with foreground traffic.
Significant utilization of spare capacity by background traffic.
Simplifies application design.
Why change TCP?
TCP does network resource management, so flow prioritization is needed there.
Alternative: router prioritization.
+ More responsive; simple one-bit priority.
- Hard to deploy.
Question: can end-to-end congestion control achieve non-interference and utilization?
TCP Nice
Proactively detects congestion.
Uses increasing RTT as the congestion signal: congestion increases queue lengths, which increases RTT.
Aggressive responsiveness to congestion.
Only modifies sender-side congestion control: receiver and network are unchanged; TCP friendly.
TCP Nice
Basic algorithm:
1. Early detection: a threshold queue length, detected as an increase in RTT.
2. Multiplicative decrease on early congestion.
3. Allow cwnd < 1.0 (despite no loss).
per-ack operation: if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
per-round operation: if (numCong > f * W) W = W / 2, else ... AIMD congestion control
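The per-ack and per-round rules above can be sketched as a small state machine. This is a simplified illustration, not the actual Nice implementation: the class name, the default values `threshold=0.2` and `fraction=0.5`, and the one-packet additive increase standing in for the "else: AIMD congestion control" branch are assumptions, and loss handling is omitted.

```python
from dataclasses import dataclass

@dataclass
class NiceSender:
    threshold: float = 0.2          # assumed fraction of the RTT range
    fraction: float = 0.5           # assumed value of f
    cwnd: float = 2.0               # congestion window W, in packets
    min_rtt: float = float("inf")
    max_rtt: float = 0.0
    num_cong: int = 0               # early-congestion signals this round

    def on_ack(self, cur_rtt: float) -> None:
        """Per-ack operation: count ACKs whose RTT exceeds the threshold."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self) -> None:
        """Per-round operation: halve W on early congestion, else increase."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0        # may fall below 1.0, as Nice allows
        else:
            self.cwnd += 1.0        # simplified additive increase
        self.num_cong = 0

sender = NiceSender(cwnd=4.0)
for rtt in (0.1, 0.2, 0.2, 0.2):    # queue builds up: RTT doubles
    sender.on_ack(rtt)
sender.on_round_end()
window_after_congestion = sender.cwnd

for rtt in (0.1, 0.1):              # queue drains: RTT back at the minimum
    sender.on_ack(rtt)
sender.on_round_end()
window_after_recovery = sender.cwnd
print(window_after_congestion, window_after_recovery)
```

In the first round three of the four ACKs exceed the RTT limit, which is more than `fraction * cwnd`, so the window is halved before any loss occurs; in the second round no ACK exceeds the limit and the window grows again.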
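Nice: the works
Non-interference: getting out of the way in time.
Utilization: maintaining a small queue.
minRTT = $\tau$, maxRTT = $\tau + B/\mu$
[Figure: queue length (pkts) versus time $t$ for a buffer of size $B$, comparing the additive-increase (Add) and multiplicative-decrease (Mul) phases of Reno and Nice.]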
Network Conditions
[Figure: foreground document latency (sec, log scale) versus spare capacity, for Reno, Vegas, V0, Nice, and Router Prio.]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity.
Scalability
[Figure: document latency (sec, log scale) versus number of background (BG) flows, for Vegas, V0, Nice, Router Prio, and Reno.]
W < 1 allows Nice to scale to any number of background flows.
Utilization
[Figure: background (BG) throughput (KB) versus number of BG flows, for Router Prio, Vegas, V0, Reno, and Nice.]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from the foreground (FG).
Wide-area network experiments
What is TCP optimizing
How does TCP allocate network resources
Problem Given a network and some number of long-lived TCP connections between different source-destination routes can we model the resulting resource allocation
How to model the interaction between TCP and the network Recall PFTK like models assumed network
conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as optimization problem How to allocate resources (eg bandwidth) to
optimize some objective function Maybe not possible to obtain exact optimality but
optimization framework as means to explicitly steer network towards desirable operating point
practical congestion control as distributed asynchronous implementations of optimization algorithm
systematic approach towards protocol design
c1 c2
Model Network Links l each of capacity cl Sources s (L(s) Us(xs))
L(s) - links used by source s Us(xs) - utility if source rate = xs
x1
x2 x3
121 cxx le+ 231 cxx le+
Us(xs)
xs
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose network knows vector ws chosen by users Network wants to maximize logarithmic utility function
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
network problem
Solution existence
There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that Ws solves user
problem Xs solves the
network problem Xs is the unique
solution to the system problem
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
0 wsubject to
w Umax
s
ss
ge
minus⎟⎟⎠
⎞⎜⎜⎝
⎛s
s
wp
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0
Proportional Fairness
Vector of rates xs proportionally fair if feasible and for any other feasible vector xs
0
leminus
sumisinSs s
ss
xxx
Result if wr=1 then Xs solves the network problem IFF it is proportionally fair
Similar result exists for the case that wr not equal 1
Max-min Fairness
Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
Minimum potential delay fairness
Rates xr are minimum potential delay fair if Ur (xr) = -wrxr
Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists p such that $x_p \le x_s$ and $y_p < x_p$
What is the corresponding utility function?
$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$
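As a hedged worked example, the utility family $x_r^{1-\alpha}/(1-\alpha)$ can be maximized in closed form on a small network, which shows how $\alpha$ moves the allocation between the fairness notions above. The network (two unit-capacity links, source 1 on both links, sources 2 and 3 on one link each) and the assumption that both link constraints are tight at the optimum are illustrative choices, not taken from the slides.

```latex
% Assumed example: c_1 = c_2 = 1 and x_2 = x_3 = 1 - x_1 at the optimum.
\begin{align*}
\max_{0 < x_1 < 1} \;& \frac{x_1^{1-\alpha}}{1-\alpha} + 2\,\frac{(1-x_1)^{1-\alpha}}{1-\alpha} \\
\text{first-order condition:}\quad & x_1^{-\alpha} = 2\,(1-x_1)^{-\alpha}
  \;\Longrightarrow\; x_1 = \frac{1}{1 + 2^{1/\alpha}} \\
\alpha \to 1:\quad & x_1 = \tfrac{1}{3} \quad \text{(proportional fairness)} \\
\alpha = 2:\quad & x_1 = \tfrac{1}{1+\sqrt{2}} \approx 0.414 \quad \text{(minimum potential delay fairness)} \\
\alpha \to \infty:\quad & x_1 \to \tfrac{1}{2} \quad \text{(max-min fairness)}
\end{align*}
```

The long flow is penalized most under proportional fairness and receives an equal share only in the max-min limit.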
Solving the network problem
Results so far: existence - a solution exists with the given properties
How to compute the solution?
Ideally: a distributed solution, easily embodied in a protocol
Should reveal insight into existing protocols
Solving the network problem
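$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$
left-hand side: change in bandwidth allocation at s
$k\, w_s$ term: linear increase
$k\, x_s(t) \sum_{l \in L(s)} p_l(t)$ term: multiplicative decrease
where $p_l(t) = g_l\!\left( \sum_{s :\, l \in L(s)} x_s(t) \right)$
$p_l(t)$: congestion "signal", a function of the aggregate rate at link l, fed back to s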
Solving the network problem
$\frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right)$
Results:
converges to the solution of a relaxation of the network problem
$x_s(t) \sum_{l \in L(s)} p_l(t)$ converges to $w_s$
Interpretation: a TCP-like algorithm that iteratively solves for the optimal rate allocation
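The convergence result can be illustrated with a simple forward-Euler simulation of the rate update. This is a minimal sketch under stated assumptions: the price function $g_l(y) = (y / c_l)^\beta$, the gain, step size, starting rates, and the two-link topology are all hypothetical choices, since the slides leave $g_l$ unspecified.

```python
def simulate_primal(links_of_source, capacity, w, k=1.0, dt=0.01, steps=5000, beta=4):
    """Euler integration of dx_s/dt = k * (w_s - x_s * sum_{l in L(s)} p_l).

    links_of_source[s] lists the links L(s) used by source s.
    The link price p_l = (y_l / c_l) ** beta is an assumed penalty function g_l.
    Returns the final rates and the path prices seen in the final step.
    """
    x = [0.1] * len(links_of_source)
    q = [0.0] * len(links_of_source)
    for _ in range(steps):
        # Aggregate rate y_l on each link.
        load = [0.0] * len(capacity)
        for s, links in enumerate(links_of_source):
            for l in links:
                load[l] += x[s]
        price = [(load[l] / capacity[l]) ** beta for l in range(len(capacity))]
        # Path price fed back to each source.
        q = [sum(price[l] for l in links) for links in links_of_source]
        x = [xs + dt * k * (ws - xs * qs) for xs, ws, qs in zip(x, w, q)]
    return x, q


x, q = simulate_primal([[0, 1], [0], [1]], [1.0, 1.0], [1.0, 1.0, 1.0])
for s, (xs, qs) in enumerate(zip(x, q)):
    print(f"source {s}: rate {xs:.4f}, rate * path price {xs * qs:.6f}")
```

In this run the product of rate and path price settles at $w_s = 1$ for every source, and the two-link source ends with half the rate of the single-link sources. The link loads slightly exceed capacity because the penalty function enforces the constraint only softly, which is the sense in which the algorithm solves a relaxation of the network problem.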
Source Algorithm
Source needs only its path price:
$\dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right)$
$k_r(\cdot)$: nonnegative, nondecreasing function
Above algorithm converges to the unique solution for any initial condition
$q_r$ interpreted as loss/marking probability
Proportionally-Fair Controller
If utility function is $U_r(x_r) = w_r \log x_r$
then a controller that implements it is given by $\dot{x}_r = k_r \left( w_r - x_r q_r \right)$
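A short derivation connects the general source algorithm $\dot{x}_r = k_r(x_r)(U_r'(x_r) - q_r)$ to the proportionally fair case. It assumes the logarithmic utility $U_r(x_r) = w_r \log x_r$ and the gain $k_r(x_r) = \kappa_r x_r$; the specific gain is an illustrative choice that satisfies the nonnegative, nondecreasing requirement.

```latex
\begin{align*}
\dot{x}_r &= k_r(x_r)\,\bigl(U_r'(x_r) - q_r\bigr),
  \qquad U_r(x_r) = w_r \log x_r, \quad k_r(x_r) = \kappa_r x_r \\
          &= \kappa_r x_r \left( \frac{w_r}{x_r} - q_r \right) \\
          &= \kappa_r \bigl( w_r - x_r q_r \bigr)
\end{align*}
```

At equilibrium $x_r q_r = w_r$, so each source's rate times its path price equals its willingness to pay.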
Pricing interpretation
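Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price $q_r$ (\$/bit), where $q_r = \sum_{l \in L(r)} p_l$
User's strategy: spend $w_r$ (\$/sec) to maximize $U_r(w_r / q_r) - w_r$
Optimal User Strategy
$U_r'(w_r / q_r) = q_r$
equivalently, with $x_r = w_r / q_r$: $U_r'(x_r) = q_r$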
Simplified TCP-Reno
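suppose $x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$
then, since the equilibrium price satisfies $U'(x) = p \approx \frac{2}{T^2 x^2}$, $U(x) = -\frac{2}{T^2 x}$
interpretation: minimize (weighted) delay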
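The relation between the simplified Reno rate formula and a delay-like utility can be checked numerically. This is a minimal sketch: it assumes the equilibrium condition $U'(x) = p$ from the source algorithm and the small-loss approximation, under which the utility works out to $-2/(T^2 x)$; the function names and the sample values of $p$ and $T$ are hypothetical.

```python
import math


def reno_rate(p, T):
    """Simplified Reno rate x = sqrt(2(1-p)) / (T sqrt(p)), packets per unit time."""
    return math.sqrt(2.0 * (1.0 - p)) / (T * math.sqrt(p))


def reno_rate_approx(p, T):
    """Small-loss approximation x ~ sqrt(2) / (T sqrt(p))."""
    return math.sqrt(2.0) / (T * math.sqrt(p))


def implied_price(x, T):
    """Invert the approximation: the loss probability p = 2 / (T x)^2 giving rate x."""
    return 2.0 / (T * x) ** 2


def reno_utility(x, T):
    """Utility whose derivative equals implied_price: U(x) = -2 / (T^2 x)."""
    return -2.0 / (T * T * x)


p, T = 0.01, 0.1  # assumed loss probability and round-trip time (seconds)
x = reno_rate_approx(p, T)
h = 1e-3
slope = (reno_utility(x + h, T) - reno_utility(x - h, T)) / (2 * h)
print("approximate rate:", x)
print("exact rate:", reno_rate(p, T))
print("numerical U'(x):", slope, "loss probability:", p)
```

The numerical derivative of the utility at the equilibrium rate reproduces the loss probability, and the utility has the form $-w/x$ with weight $w = 2/T^2$, matching the weighted-delay interpretation.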
Is AIMD special?
Consider a window control as follows: cwnd += a*cwnd^n when no loss; cwnd -= b*cwnd^m when loss, where n < m
Expected change in congestion window
Expected change in rate per unit time
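The expected window drift of this generalized control can be sketched as follows. The sketch assumes the update is applied once per acknowledgment and that each packet is lost independently with probability p; both are modeling assumptions, and the function names are hypothetical.

```python
def expected_window_change(w, p, a, b, n, m):
    """Expected per-ACK change: +a*w^n with probability 1-p, -b*w^m with probability p."""
    return (1.0 - p) * a * w ** n - p * b * w ** m


def equilibrium_window(p, a, b, n, m):
    """Window at which the expected change is zero (requires n < m)."""
    return (a * (1.0 - p) / (b * p)) ** (1.0 / (m - n))


# Per-ACK AIMD: cwnd += 1/cwnd on success (n = -1), cwnd -= cwnd/2 on loss (m = 1).
w_star = equilibrium_window(p=0.01, a=1.0, b=0.5, n=-1, m=1)
print("equilibrium window:", w_star)
print("drift at equilibrium:", expected_window_change(w_star, 0.01, 1.0, 0.5, -1, 1))
```

For per-ACK AIMD the equilibrium window is $\sqrt{2(1-p)/p}$, so dividing by the round-trip time T recovers the simplified Reno rate formula; other (n, m) pairs with n < m give a different equilibrium through the same zero-drift condition.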
MIMD (n, m)
Consider the controller
where
Then at equilibrium
Where α = m-n For stability
Motivation
Congestion Control: maximize user utility
Given routing $R_{li}$, how to adapt end rate $x_i$?
Traffic Engineering: minimize network congestion
Given traffic $x_i$, how to perform routing $R_{li}$?
Congestion Control Model
$\max \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l, \qquad \text{var. } x$
objective: aggregate utility, with source rate $x_i$ and utility $U_i(x_i)$
constraints: capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
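$\min \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l, \qquad \text{var. } R$
objective: aggregate cost, with link utilization $u_l$ and cost $f(u_l)$
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
[Figure: cost $f(u_l)$ versus link utilization $u_l$, with $u_l = 1$ marked]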
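The traffic engineering objective can be evaluated directly for a given routing. This is a minimal sketch: the cost function $f(u) = 1/(1-u)$ is an assumed example of a cost that grows without bound as utilization approaches 1 (the slides do not specify f), and the two-parallel-link scenario with fractional routing is hypothetical.

```python
def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [
        sum(R_li * x_i for R_li, x_i in zip(row, x)) / c_l
        for row, c_l in zip(R, c)
    ]


def barrier_cost(u):
    """Assumed cost f(u) = 1 / (1 - u); infinite once the link is fully utilized."""
    if u >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - u)


def aggregate_cost(R, x, c, f):
    """Traffic engineering objective: sum_l f(u_l)."""
    return sum(f(u) for u in link_utilization(R, x, c))


# One user sending at rate 1 over two parallel unit-capacity links.
x = [1.0]
c = [1.0, 1.0]
for split in (0.5, 0.8, 1.0):
    R = [[split], [1.0 - split]]  # fraction of the user's traffic on each link
    print(split, link_utilization(R, x, c), aggregate_cost(R, x, c, barrier_cost))
```

Under this assumed cost, the even split minimizes the aggregate cost and placing all traffic on one link drives it to infinity, which illustrates how the objective steers routing away from bottlenecks.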
Model of Internet Reality
Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$ (outputs rates $x_i$)
Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$ (outputs routing $R_{li}$)
System Properties
Convergence
Does it achieve some objective?
Benchmark: $\max \sum_i U_i(x_i)$ s.t. $Rx \le c$, var. $x, R$
Utility gap between the joint system and the benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
TCP Nice
Basic algorithm
1. Early detection: threshold on queue length, detected as an increase in RTT
2. Multiplicative decrease on early congestion
3. Allow cwnd < 1.0 (despite no loss)
per-ack operation: if (curRTT > minRTT + threshold * (maxRTT - minRTT)) numCong++
per-round operation: if (numCong > f * W) W = W / 2; else ... AIMD congestion control
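The per-ack and per-round operations above can be sketched as a small state machine. This is an illustration rather than the Nice implementation: the class name, the default values of `threshold` and `fraction`, and the choice to stand in for the elided AIMD branch with a plain additive increase of one packet per round are all assumptions.

```python
class NiceWindow:
    """Sketch of Nice-style early congestion detection from RTT samples."""

    def __init__(self, threshold=0.2, fraction=0.5, cwnd=2.0):
        self.threshold = threshold  # assumed position between minRTT and maxRTT
        self.fraction = fraction    # assumed fraction f of the window
        self.cwnd = cwnd            # congestion window W, may drop below 1
        self.min_rtt = float("inf")
        self.max_rtt = 0.0
        self.num_cong = 0

    def on_ack(self, cur_rtt):
        """Per-ack operation: count ACKs whose RTT signals a building queue."""
        self.min_rtt = min(self.min_rtt, cur_rtt)
        self.max_rtt = max(self.max_rtt, cur_rtt)
        limit = self.min_rtt + self.threshold * (self.max_rtt - self.min_rtt)
        if cur_rtt > limit:
            self.num_cong += 1

    def on_round_end(self):
        """Per-round operation: halve W on early congestion, else grow additively."""
        if self.num_cong > self.fraction * self.cwnd:
            self.cwnd /= 2.0
        else:
            self.cwnd += 1.0  # placeholder for the regular AIMD behavior
        self.num_cong = 0


nice = NiceWindow(cwnd=4.0)
for rtt in (0.10, 0.20, 0.20, 0.20):
    nice.on_ack(rtt)
nice.on_round_end()
print("window after a congested round:", nice.cwnd)
```

Because the window is a float with no lower clamp at one packet, repeated congested rounds keep halving it below 1, which corresponds to step 3 of the basic algorithm.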
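Nice: the works
Non-interference: getting out of the way in time
Utilization: maintaining a small queue
minRTT = τ, maxRTT = τ + B/μ
[Figure: queue length (pkts) over time for Reno and Nice, marking buffer size B, threshold t·B, service rate μ, and the additive (Add) and multiplicative (Mul) phases]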
Network Conditions
[Figure: foreground document latency (sec, log scale from 0.1 to 1e3) versus spare capacity (1 to 100); curves: Reno, Vegas, V0, Nice, Router Prio]
Nice causes low interference to foreground Web traffic even when there isn't much spare capacity
Scalability
[Figure: document latency (sec, log scale from 0.1 to 1e3) versus number of background flows (1 to 100); curves: Vegas, V0, Nice, Router Prio, Reno]
W < 1 allows Nice to scale to any number of background flows
Utilization
[Figure: background throughput (KB, 0 to 8e4) versus number of background flows (1 to 100); curves: Router Prio, Vegas, V0, Reno, Nice]
Nice utilizes 50-80% of spare capacity without stealing any bandwidth from foreground (FG) traffic
Wide-area network experiments
What is TCP optimizing?
How does TCP allocate network resources?
Problem: given a network and some number of long-lived TCP connections between different source-destination routes, can we model the resulting resource allocation?
How to model the interaction between TCP and the network?
Recall: PFTK-like models assumed network conditions are not affected by (a single) TCP flow
Optimization-based approach towards congestion control
Resource allocation as an optimization problem
How to allocate resources (e.g., bandwidth) to optimize some objective function?
Maybe not possible to obtain exact optimality, but...
optimization framework as a means to explicitly steer the network towards a desirable operating point
practical congestion control as distributed asynchronous implementations of an optimization algorithm
systematic approach towards protocol design
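Model
Network: links l, each of capacity $c_l$
Sources s: $(L(s), U_s(x_s))$
$L(s)$ - links used by source s
$U_s(x_s)$ - utility if source rate = $x_s$
[Figure: two links with capacities $c_1$ and $c_2$; source 1 (rate $x_1$) crosses both links, source 2 (rate $x_2$) uses link 1, source 3 (rate $x_3$) uses link 2]
$x_1 + x_2 \le c_1, \qquad x_1 + x_3 \le c_2$
[Figure: utility $U_s(x_s)$ versus source rate $x_s$]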
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose network knows vector ws chosen by users Network wants to maximize logarithmic utility function
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
network problem
Solution existence
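There exist prices \( p_s \), source rates \( x_s \), and amounts to pay per unit time \( w_s = p_s x_s \) such that
\( \{w_s\} \) solves the user problem
\( \{x_s\} \) solves the network problem
\( \{x_s\} \) is the unique solution to the system problem
Network problem:
\[ \max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \]
User problem:
\[ \max_{w_s \ge 0} \; U_s\!\left( \frac{w_s}{p_s} \right) - w_s \]
System problem:
\[ \max_{x_s \ge 0} \sum_{s \in S} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l, \quad \forall l \in L \]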
Proportional Fairness
A vector of rates \( \{x_s\} \) is proportionally fair if it is feasible and, for any other feasible vector \( \{x_s^*\} \),
\[ \sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0 \]
Result: if \( w_r = 1 \), then \( \{x_s\} \) solves the network problem if and only if it is proportionally fair
A similar result exists for the case that \( w_r \) is not equal to 1
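The proportional fairness condition can be checked numerically. The sketch below assumes a hypothetical three-source network with two unit-capacity links (constraints x1 + x2 <= 1 and x1 + x3 <= 1) and tests the candidate allocation (1/3, 2/3, 2/3), which maximizes the sum of log rates on that network, against randomly drawn feasible vectors. The network, the sampling scheme, and the function names are illustrative assumptions.

```python
import random


def proportional_fairness_gap(x, y):
    """Aggregate proportional change when moving from allocation x to y."""
    return sum((y_s - x_s) / x_s for x_s, y_s in zip(x, y))


def random_feasible(rng):
    """Draw rates with y1 + y2 <= 1 and y1 + y3 <= 1 (assumed unit-capacity links)."""
    y1 = rng.uniform(0.0, 1.0)
    y2 = rng.uniform(0.0, 1.0 - y1)
    y3 = rng.uniform(0.0, 1.0 - y1)
    return (y1, y2, y3)


rng = random.Random(0)
x_pf = (1 / 3, 2 / 3, 2 / 3)   # candidate proportionally fair allocation
x_eq = (0.5, 0.5, 0.5)         # equal split, for comparison

worst = max(
    proportional_fairness_gap(x_pf, random_feasible(rng)) for _ in range(10_000)
)
print(f"largest gap against x_pf over the samples: {worst:.6f}")
print(f"gap when moving from x_eq to x_pf: {proportional_fairness_gap(x_eq, x_pf):.6f}")
```

No sampled vector yields a positive gap against x_pf, while moving from the equal split to x_pf does yield a positive gap, so the equal split is not proportionally fair on this network.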
Max-min Fairness
Rates \( \{x_r\} \) are max-min fair if, for any other feasible rates \( \{y_r\} \): if \( y_s > x_s \), then there exists p such that \( x_p \le x_s \) and \( y_p < x_p \)
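One common way to compute a max-min fair allocation is progressive filling: raise all rates together and freeze the sources that cross a link as soon as it saturates. The slides do not give this procedure; the sketch below is a minimal version, assuming every source uses at least one link, with a hypothetical function name and example topology.

```python
def max_min_fair(routes, capacity):
    """Progressive filling: grow all unfrozen rates equally until links saturate."""
    rates = {s: 0.0 for s in routes}
    remaining = dict(capacity)
    active = set(routes)
    while active:
        # Number of still-growing sources on each link.
        counts = {
            link: sum(1 for s in active if link in routes[s]) for link in remaining
        }
        # Largest equal increment that no link's remaining capacity forbids.
        step = min(remaining[link] / n for link, n in counts.items() if n > 0)
        for s in active:
            rates[s] += step
        saturated = set()
        for link, n in counts.items():
            remaining[link] -= step * n
            if n > 0 and remaining[link] <= 1e-9 * capacity[link]:
                saturated.add(link)
        # Freeze every source that crosses a saturated link.
        active = {
            s for s in active if not any(link in saturated for link in routes[s])
        }
    return rates


# Assumed example: source 1 crosses both links, sources 2 and 3 use one each.
routes = {1: ("l1", "l2"), 2: ("l1",), 3: ("l2",)}
print(max_min_fair(routes, {"l1": 1.0, "l2": 1.0}))
print(max_min_fair(routes, {"l1": 1.0, "l2": 2.0}))
```

With unequal capacities, source 3 keeps growing after link l1 saturates, which matches the definition: its rate can only be raised further by lowering a rate that is already smaller.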
Minimum potential delay fairness
Rates \( \{x_r\} \) are minimum potential delay fair if they solve the system problem with \( U_r(x_r) = -w_r / x_r \)
Interpretation: if \( w_r \) is the file size, then \( w_r / x_r \) is the transfer time; the optimization problem is to minimize the sum of transfer delays
Max-min Fairness
Rates \( \{x_r\} \) are max-min fair if, for any other feasible rates \( \{y_r\} \): if \( y_s > x_s \), then there exists p such that \( x_p \le x_s \) and \( y_p < x_p \)
What is the corresponding utility function?
\[ U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha} \]
Solving the network problem
Results so far: existence; a solution exists with the given properties
How to compute the solution?
Ideally: a distributed solution easily embodied in a protocol
Should reveal insight into existing protocols
Solving the network problem
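\[ \frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right) \]
where
\[ p_l(t) = g_l\!\left( \sum_{s:\, l \in L(s)} x_s(t) \right) \]
\( \frac{d}{dt} x_s(t) \): change in bandwidth allocation at s
\( k\, w_s \): linear increase
\( k\, x_s(t) \sum_{l \in L(s)} p_l(t) \): multiplicative decrease
\( p_l(t) \): congestion "signal", a function of the aggregate rate at link l, fed back to s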
Solving the network problem
\[ \frac{d}{dt} x_s(t) = k \left( w_s - x_s(t) \sum_{l \in L(s)} p_l(t) \right) \]
Results: converges to the solution of a relaxation of the network problem
\( x_s(t) \sum_{l \in L(s)} p_l(t) \) converges to \( w_s \)
Interpretation: TCP-like algorithm that iteratively solves for the optimal rate allocation
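The rate dynamics can be explored with a small simulation. The sketch below integrates dx_s/dt = k (w_s - x_s * sum of link prices) with forward Euler steps on a hypothetical three-source, two-link network. The price function g_l(y) = (y / c_l)^20, the step size, the gain, and the topology are all assumptions chosen for illustration; the slides do not specify g_l.

```python
# Hypothetical example network: source 1 crosses both links,
# sources 2 and 3 each use a single link.
ROUTES = {1: ("l1", "l2"), 2: ("l1",), 3: ("l2",)}
CAPACITY = {"l1": 1.0, "l2": 1.0}
WEIGHTS = {1: 1.0, 2: 1.0, 3: 1.0}  # w_s, amount paid per unit time


def link_price(load, capacity, steepness=20.0):
    """Assumed signal g_l: close to zero below capacity, steep above it."""
    return (load / capacity) ** steepness


def path_prices(rates):
    """Sum of link prices p_l along each source's path L(s)."""
    loads = {
        link: sum(rates[s] for s in ROUTES if link in ROUTES[s]) for link in CAPACITY
    }
    prices = {link: link_price(loads[link], CAPACITY[link]) for link in CAPACITY}
    return {s: sum(prices[link] for link in ROUTES[s]) for s in ROUTES}


def simulate(steps=5000, dt=0.01, gain=1.0):
    """Forward-Euler integration of dx_s/dt = k (w_s - x_s * path price)."""
    rates = {s: 0.1 for s in ROUTES}
    for _ in range(steps):
        q = path_prices(rates)
        rates = {
            s: max(rates[s] + dt * gain * (WEIGHTS[s] - rates[s] * q[s]), 1e-9)
            for s in ROUTES
        }
    return rates


rates = simulate()
q = path_prices(rates)
for s in sorted(rates):
    print(f"source {s}: x = {rates[s]:.4f}, x * path price = {rates[s] * q[s]:.4f}")
```

Because the price function only approximates a hard capacity constraint, the rates settle slightly above the proportionally fair values (1/3, 2/3, 2/3); this is the sense in which the dynamics solve a relaxation of the network problem.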
Source Algorithm
\[ \dot{x}_r = k_r(x_r) \left( U_r'(x_r) - q_r \right) \]
Source needs only its path price \( q_r \)
\( k_r(\cdot) \): nonnegative, nondecreasing function
The above algorithm converges to the unique solution for any initial condition
\( q_r \) is interpreted as the loss/marking probability
Proportionally-Fair Controller
If the utility function is \( U_r(x_r) = w_r \log x_r \)
then a controller that implements it is given by \( \dot{x}_r = k_r \left( w_r - x_r q_r \right) \)
Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price q_r ($/bit), where q_r is the sum of the link prices p_l along the path of r
User's strategy: spend w_r ($/sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
\[ x = \frac{1}{T} \sqrt{\frac{2(1-p)}{p}} \approx \frac{1}{T} \sqrt{\frac{2}{p}} \]
then
\[ U(x) = -\frac{1}{T x} \]
interpretation: minimize (weighted) delay
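The quality of the small-loss approximation in the simplified Reno model is easy to check numerically. The sketch below assumes the square-root form x = (1/T) sqrt(2(1-p)/p), with x in packets per second, T the round-trip time in seconds, and p the loss probability; the function name and the sample values are hypothetical.

```python
import math


def reno_rate(loss_prob, rtt):
    """Return (exact, approximate) steady-state rate under the simplified model."""
    if not 0.0 < loss_prob < 1.0 or rtt <= 0.0:
        raise ValueError("need 0 < loss_prob < 1 and rtt > 0")
    exact = math.sqrt(2.0 * (1.0 - loss_prob) / loss_prob) / rtt
    approx = math.sqrt(2.0 / loss_prob) / rtt  # drops the (1 - p) factor
    return exact, approx


for p in (0.1, 0.01, 0.001):
    exact, approx = reno_rate(p, rtt=0.1)
    rel_err = (approx - exact) / exact
    print(f"p = {p}: exact = {exact:.2f}, approx = {approx:.2f}, rel. error = {rel_err:.4f}")
```

The approximation always overestimates the rate, and the relative error shrinks as p decreases. In both forms the rate is inversely proportional to the round-trip time.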
Is AIMD special
Consider a window control as follows:
cwnd += a * cwnd^n when no loss
cwnd -= b * cwnd^m when loss
where n < m
Expected change in congestion window
Expected change in rate per unit time
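The generalized window rule can be examined with a short calculation. The sketch below assumes the update is applied once per packet and that each packet is lost independently with probability p, so the expected window change is (1 - p) * a * cwnd^n - p * b * cwnd^m. These modeling assumptions and the function names are illustrative and are not taken from the slides.

```python
import math


def expected_drift(cwnd, p, a, n, b, m):
    """Expected per-packet window change under the assumed loss model."""
    return (1.0 - p) * a * cwnd ** n - p * b * cwnd ** m


def equilibrium_window(p, a, n, b, m):
    """Window at which the expected drift is zero (requires n < m)."""
    if not n < m:
        raise ValueError("need n < m")
    return ((1.0 - p) * a / (p * b)) ** (1.0 / (m - n))


# AIMD as a special case: increase by 1/cwnd per packet, halve on loss.
aimd = dict(a=1.0, n=-1.0, b=0.5, m=1.0)
w_star = equilibrium_window(0.01, **aimd)
print(f"AIMD equilibrium window at p = 0.01: {w_star:.3f}")
print(f"drift below equilibrium: {expected_drift(0.5 * w_star, 0.01, **aimd):+.4f}")
print(f"drift above equilibrium: {expected_drift(2.0 * w_star, 0.01, **aimd):+.4f}")
```

For the AIMD parameters the equilibrium window equals sqrt(2(1 - p)/p), which agrees with the simplified Reno rate once divided by the round-trip time. The drift is positive below the equilibrium and negative above it, and n < m is what guarantees this sign change.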
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where α = m - n. For stability
Motivation
Congestion Control: maximize user utility. Given routing \( R_{li} \), how to adapt end rate \( x_i \)?
Traffic Engineering: minimize network congestion. Given traffic \( x_i \), how to perform routing \( R_{li} \)?
Congestion Control Model
\[ \max_{x} \sum_i U_i(x_i) \quad \text{s.t.} \quad \sum_i R_{li} x_i \le c_l \]
objective: aggregate utility; constraints: capacity constraints
[Figure: utility \( U_i(x_i) \) versus source rate \( x_i \)]
Users are indexed by i
Congestion control provides fair rate allocation amongst users
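Traffic Engineering Model
\[ \min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \frac{\sum_i R_{li} x_i}{c_l} \]
objective: aggregate cost
[Figure: cost \( f(u_l) \) versus link utilization \( u_l \), with \( u_l = 1 \) marked on the utilization axis]
Links are indexed by l
Traffic engineering avoids bottlenecks in the network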
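The traffic engineering objective can be evaluated directly for a given routing matrix and rate vector. In the sketch below, the cost function f(u) = 1 / (1 - u) is an assumption chosen only because it grows without bound as utilization approaches 1; the slides do not specify f. The routing matrix, rates, and function names are likewise hypothetical.

```python
def link_utilization(routing, rates, capacity):
    """u_l = (sum_i R_li * x_i) / c_l for every link l."""
    return [
        sum(r_li * x_i for r_li, x_i in zip(row, rates)) / c_l
        for row, c_l in zip(routing, capacity)
    ]


def congestion_cost(u):
    """Assumed cost f(u) = 1 / (1 - u); infinite at or above full utilization."""
    return float("inf") if u >= 1.0 else 1.0 / (1.0 - u)


# Rows are links, columns are users; R[l][i] = 1 if user i's traffic uses link l.
R = [[1, 1, 0],
     [1, 0, 1]]
x = [0.3, 0.5, 0.4]   # source rates x_i
c = [1.0, 1.0]        # link capacities c_l

u = link_utilization(R, x, c)
total_cost = sum(congestion_cost(u_l) for u_l in u)
print("utilization per link:", [round(u_l, 3) for u_l in u])
print(f"aggregate cost: {total_cost:.3f}")
```

Traffic engineering would search over R to reduce this aggregate cost while the rates x are held fixed, whereas congestion control adjusts x while R is held fixed.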
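Model of Internet Reality
[Figure: congestion control and traffic engineering coupled through the rates \( x_i \) and the routing \( R_{li} \)]
Congestion Control: \( \max_x \sum_i U_i(x_i) \) s.t. \( \sum_i R_{li} x_i \le c_l \)
Traffic Engineering: \( \min_R \sum_l f(u_l) \) s.t. \( u_l = \sum_i R_{li} x_i / c_l \)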
System Properties
Convergence
Does it achieve some objective?
Benchmark: \( \max_{x, R} \sum_i U_i(x_i) \) s.t. \( Rx \le c \)
Utility gap between the joint system and benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Wide-area network experiments
c1 c2
Model Network Links l each of capacity cl Sources s (L(s) Us(xs))
L(s) - links used by source s Us(xs) - utility if source rate = xs
x1
x2 x3
121 cxx le+ 231 cxx le+
Us(xs)
xs
example utility function for elastic application
Q What are possible allocations with say unit capacity links
Optimization Problem
maximize system utility (note all sources ldquoequalrdquo) constraint bandwidth used less than capacity centralized solution to optimization impractical
must know all utility functions impractical for large number of sources can we view congestion control as distributed
asynchronous algorithms to solve this problem
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0 ldquosystemrdquo problem
The user view
User can choose amount to pay per unit time ws
Would like allocated bandwidth xs in proportion to ws
euro
max Usw s
ps
⎛
⎝ ⎜
⎞
⎠ ⎟ minus ws
subject to ws ge 0
ps could be viewed as charge per unit flow for user s s
ss pwx =
userrsquos utility cost
user problem
The network view
Suppose network knows vector ws chosen by users Network wants to maximize logarithmic utility function
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
network problem
Solution existence
There exist prices ps source rates xs and amount-to-pay-per-unit-time ws = psxs such that Ws solves user
problem Xs solves the
network problem Xs is the unique
solution to the system problem
sum
sum
isin
ge
leS(l)s
ls
sss
x
cx
x w s
subject to
log max0
0 wsubject to
w Umax
s
ss
ge
minus⎟⎟⎠
⎞⎜⎜⎝
⎛s
s
wp
Llcx
xU
lSsls
sss
xs
isinforalllesum
sum
isin
ge
subject to
)( max
)(
0
Proportional Fairness
Vector of rates xs proportionally fair if feasible and for any other feasible vector xs
0
leminus
sumisinSs s
ss
xxx
Result if wr=1 then Xs solves the network problem IFF it is proportionally fair
Similar result exists for the case that wr not equal 1
Max-min Fairness
Rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
Minimum potential delay fairness
Rates xr are minimum potential delay fair if Ur (xr) = -wrxr
Interpretation if wr is file size then wrxr is transfer time optimization problem is to minimize sum of transfer delays
Max-min Fairness
rates xr max-min fair if for any other feasible rates yr if ys gt xs then exist p such that xp lexs and yp lt xp
What is corresponding utility function
α
α
α minus=
minus
infinrarr 1lim)(
1r
rrxxU
Solving the network problem Results so far existence - solution exists
with given properties How to compute solution
Ideally distributed solution easily embodied in protocol
Should reveal insight into existing protocol
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
congestion ldquosignalrdquo function of aggregate rate at link l fed back to s
change in bandwidth
allocation at s
linear increase
multiplicative decrease
⎟⎟⎠
⎞⎜⎜⎝
⎛= sum
isin
)()()(txgtp
sLlsllwhere
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
Results converges to solution of relaxation of network
problem xs(t)Σpl(t) converges to ws
Interpretation TCP-like algorithm to iteratively solves optimal rate allocation
Source Algorithm
Source needs only its path price
kr() nonnegative nondecreasing function Above algorithm converges to unique
solution for any initial condition qr interpreted as lossmarking probability euro
˙ x r = kr (xr )(Ur (xr ) minus qr)
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can network choose pricing scheme to achieve fair resource allocation
Suppose network charges price qr ($bit) where qr=sum pl
Userrsquos strategy spend wr ($sec) to maximize
Optimal User Strategy
equivalently
Simplified TCP-Reno
suppose
then
interpretation minimize (weighted) delay
pTpTp
x 2)1(2asymp
minus=
TxxU 1)( minus=
Is AIMD special
Consider a window control as follows cwnd += acwnd^n when no loss cwnd -= bcwnd^m when loss where nltm
Expected change in congestion window
Expected change in rate per unit time
MIMD (nm)
Consider the controller
where
Then at equilibrium
Where α = m-n For stability
Motivation
Congestion Control maximize user
utility
Traffic Engineering minimize network
congestion Given routing Rli how to adapt end rate xi
Given traffic xi how to perform routing Rli
Congestion Control Model
max sum i Ui(xi) st sumi Rlixi le cl var x
aggregate utility
Source rate xi
Utility Ui(xi)
capacity constraints
Users are indexed by i
Congestion control provides fair rate allocation amongst users
Traffic Engineering Model
$$\min_{R} \sum_l f(u_l) \quad \text{s.t.} \quad u_l = \sum_i R_{li} x_i / c_l$$
objective: aggregate cost
[Figure: cost $f(u_l)$ versus link utilization $u_l$, with $u_l = 1$ marked]
Links are indexed by l
Traffic engineering avoids bottlenecks in the network
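The traffic engineering objective can be evaluated directly for a candidate routing. In the sketch below the cost function $f(u) = u/(1-u)$ is an assumed convex cost that grows without bound as $u_l \to 1$; the slides do not specify $f$. The two routings and the traffic vector are likewise made up for illustration.

```python
def link_utilization(R, x, c):
    """u_l = sum_i R[l][i] * x[i] / c[l] for every link l."""
    return [sum(R[l][i] * x[i] for i in range(len(x))) / c[l]
            for l in range(len(c))]

def aggregate_cost(u, f=lambda ul: ul / (1.0 - ul)):
    """Sum of per-link costs; the default f is an assumed cost, valid for u_l < 1."""
    return sum(f(ul) for ul in u)

x = [0.6, 0.2]                      # fixed traffic from two users
c = [1.0, 1.0]                      # two parallel links
R_shared = [[1, 1], [0, 0]]         # both users on link 0
R_split = [[1, 0], [0, 1]]          # one user per link

cost_shared = aggregate_cost(link_utilization(R_shared, x, c))
cost_split = aggregate_cost(link_utilization(R_split, x, c))
print(cost_shared, cost_split)
```

Spreading the two users over separate links lowers the aggregate cost, which is the sense in which traffic engineering avoids bottlenecks.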
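Model of Internet Reality
Congestion Control: $\max \sum_i U_i(x_i)$ s.t. $\sum_i R_{li} x_i \le c_l$
Traffic Engineering: $\min \sum_l f(u_l)$ s.t. $u_l = \sum_i R_{li} x_i / c_l$
[Figure: loop between the two blocks, with arrows labeled $x_i$ and $R_{li}$]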
System Properties
Convergence
Does it achieve some objective?
Benchmark: $\max_{x, R} \sum_i U_i(x_i)$ s.t. $Rx \le c$
Utility gap between the joint system and benchmark
Multipath TCP
Joint routing and congestion control
Multipath TCP controller
Optimization Problem
maximize system utility (note: all sources "equal")
constraint: bandwidth used less than capacity
centralized solution to optimization impractical
must know all utility functions
impractical for large number of sources
can we view congestion control as distributed asynchronous algorithms to solve this problem?
"system" problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \quad \forall l \in L$$
The user view
User can choose amount to pay per unit time, $w_s$
Would like allocated bandwidth $x_s$ in proportion to $w_s$:
$$x_s = \frac{w_s}{p_s}$$
$p_s$ could be viewed as charge per unit flow for user s
user problem:
$$\max \; U_s\Big(\frac{w_s}{p_s}\Big) - w_s \quad \text{subject to} \quad w_s \ge 0$$
first term: user's utility; second term: cost
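A worked instance of the user problem may help. The utility $U_s(x) = 2\sqrt{x}$ below is a hypothetical choice made only for this example; any increasing, strictly concave utility would be handled the same way.

```latex
\begin{align*}
\text{Assume } & U_s(x) = 2\sqrt{x} \\
\text{Objective: } & 2\sqrt{w_s / p_s} - w_s \\
\text{First-order condition: } & \frac{1}{\sqrt{w_s p_s}} - 1 = 0
  \;\Rightarrow\; w_s = \frac{1}{p_s} \\
\text{Resulting rate: } & x_s = \frac{w_s}{p_s} = \frac{1}{p_s^2},
  \qquad U_s'(x_s) = \frac{1}{\sqrt{x_s}} = p_s
\end{align*}
```

The user ends up paying so that marginal utility equals the charge per unit flow, and a higher price $p_s$ lowers both the payment and the rate.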
The network view
Suppose network knows vector $\{w_s\}$ chosen by users
Network wants to maximize logarithmic utility function
network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
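Solution existence
There exist prices $p_s$, source rates $x_s$, and amounts to pay per unit time $w_s = p_s x_s$ such that
$\{w_s\}$ solves the user problem
$\{x_s\}$ solves the network problem
$\{x_s\}$ is the unique solution to the system problem
user problem:
$$\max \; U_s\Big(\frac{w_s}{p_s}\Big) - w_s \quad \text{subject to} \quad w_s \ge 0$$
network problem:
$$\max_{x_s \ge 0} \sum_{s} w_s \log x_s \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l$$
system problem:
$$\max_{x_s \ge 0} \sum_{s} U_s(x_s) \quad \text{subject to} \quad \sum_{s \in S(l)} x_s \le c_l \quad \forall l \in L$$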
Proportional Fairness
Vector of rates $\{x_s\}$ is proportionally fair if it is feasible and, for any other feasible vector $\{x_s^*\}$,
$$\sum_{s \in S} \frac{x_s^* - x_s}{x_s} \le 0$$
Result: if $w_r = 1$, then $\{x_s\}$ solves the network problem iff it is proportionally fair
A similar result exists for the case that $w_r \ne 1$
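The proportional-fairness inequality can be spot-checked numerically. The sketch below assumes a small example that is not in the slides: two unit-capacity links, one source crossing both and one source on each link, for which maximizing $\sum_s \log x_s$ gives rates (1/3, 2/3, 2/3). Random sampling only tests the inequality; it does not prove it.

```python
import random

def pf_score(x, y):
    """Sum over sources of (y_s - x_s) / x_s for a competing rate vector y."""
    return sum((ys - xs) / xs for xs, ys in zip(x, y))

def random_feasible(rng):
    # Feasible means y0 + y1 <= 1 on link 0 and y0 + y2 <= 1 on link 1.
    y0 = rng.uniform(0.0, 1.0)
    y1 = rng.uniform(0.0, 1.0 - y0)
    y2 = rng.uniform(0.0, 1.0 - y0)
    return [y0, y1, y2]

rng = random.Random(0)                       # fixed seed for repeatability
x_pf = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]     # maximizer of sum of log rates
worst = max(pf_score(x_pf, random_feasible(rng)) for _ in range(1000))

x_equal = [0.5, 0.5, 0.5]                    # feasible, but not proportionally fair
counterexample = pf_score(x_equal, x_pf)
print(worst, counterexample)
```

No sampled feasible vector pushes the score above zero for `x_pf`, while the equal-rate vector fails the test because moving to `x_pf` yields a positive score.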
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$
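One common way to compute a max-min fair allocation is progressive filling: repeatedly find the link that offers the smallest equal share to its unfrozen flows, freeze those flows at that share, and continue. This is a standard technique offered here as a sketch; the slides give only the definition. The topology is an assumption, and every flow is assumed to use at least one link.

```python
def max_min_fair(routes, capacity):
    """routes[s] = set of links used by flow s; returns the max-min fair rates."""
    remaining = list(capacity)
    active = set(range(len(routes)))
    rate = {}
    while active:
        # equal share each link could still give to its active flows
        share = {}
        for l in range(len(remaining)):
            n = sum(1 for s in active if l in routes[s])
            if n > 0:
                share[l] = remaining[l] / n
        bottleneck = min(share, key=share.get)
        r = share[bottleneck]
        # freeze every active flow crossing the bottleneck at rate r
        for s in [s for s in active if bottleneck in routes[s]]:
            rate[s] = r
            for l in routes[s]:
                remaining[l] -= r
            active.remove(s)
    return [rate[s] for s in range(len(routes))]

routes = [{0, 1}, {0}, {1}]
rates_mm = max_min_fair(routes, capacity=[1.0, 2.0])
print(rates_mm)
```

Here link 0 is the first bottleneck, so the two flows crossing it get 0.5 each and the remaining flow takes what is left of link 1.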
Minimum potential delay fairness
Rates $\{x_r\}$ are minimum potential delay fair if $U_r(x_r) = -w_r / x_r$
Interpretation: if $w_r$ is file size, then $w_r / x_r$ is transfer time; the optimization problem is to minimize the sum of transfer delays
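For a single shared link the minimum potential delay allocation has a closed form: minimizing $\sum_r w_r / x_r$ subject to $\sum_r x_r = c$ gives $w_r / x_r^2$ equal across flows, so $x_r \propto \sqrt{w_r}$. The sketch below assumes that single-link setting, which is a simplification not taken from the slides, and uses hypothetical function names.

```python
import math

def min_delay_rates(w, c):
    """Rates minimizing sum_r w[r] / x[r] subject to sum_r x[r] = c on one link."""
    roots = [math.sqrt(wr) for wr in w]
    total = sum(roots)
    return [c * r / total for r in roots]

def total_delay(w, x):
    """Sum of transfer times w[r] / x[r]."""
    return sum(wr / xr for wr, xr in zip(w, x))

w = [1.0, 4.0]                      # file sizes
x_md = min_delay_rates(w, c=3.0)    # -> rates proportional to sqrt(file size)
x_eq = [1.5, 1.5]                   # equal split, for comparison
print(x_md, total_delay(w, x_md), total_delay(w, x_eq))
```

The larger file receives more bandwidth, but only in proportion to the square root of its size, and the total transfer time is lower than under an equal split.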
Max-min Fairness
Rates $\{x_r\}$ are max-min fair if, for any other feasible rates $\{y_r\}$: if $y_s > x_s$, then there exists $p$ such that $x_p \le x_s$ and $y_p < x_p$
What is the corresponding utility function?
$$U_r(x_r) = \lim_{\alpha \to \infty} \frac{x_r^{1-\alpha}}{1-\alpha}$$
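The limit above belongs to the family $x^{1-\alpha}/(1-\alpha)$, which a short sketch can make tangible. The example network (two unit-capacity links, one flow crossing both, one flow on each link) and the function names are assumptions for illustration. For that network, equating marginal utilities with both links full gives the long flow a rate of $1/(1 + 2^{1/\alpha})$.

```python
import math

def alpha_utility(x, alpha):
    """x^(1-alpha)/(1-alpha); log(x) is used by convention at alpha = 1."""
    if alpha == 1:
        return math.log(x)
    return x ** (1.0 - alpha) / (1.0 - alpha)

def long_flow_rate(alpha):
    """Rate of the two-link flow in the assumed example.

    With both links full, the short flows each get 1 - x0, and the optimality
    condition x0^(-alpha) = 2 * (1 - x0)^(-alpha) gives the closed form below.
    """
    return 1.0 / (1.0 + 2.0 ** (1.0 / alpha))

for alpha in (1, 2, 10, 1e6):
    print(alpha, long_flow_rate(alpha))
```

At α = 1 the long flow gets 1/3 (the proportionally fair share), α = 2 corresponds to the utility −1/x of minimum potential delay fairness, and as α grows the rate approaches 1/2, the max-min fair share.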
Solving the network problem
Results so far: existence (a solution exists with the given properties)
How to compute the solution?
Ideally: a distributed solution, easily embodied in a protocol
Should reveal insight into existing protocols
Solving the network problem Results so far existence - solution exists
with given properties How to compute solution
Ideally distributed solution easily embodied in protocol
Should reveal insight into existing protocol
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
congestion ldquosignalrdquo function of aggregate rate at link l fed back to s
change in bandwidth
allocation at s
linear increase
multiplicative decrease
⎟⎟⎠
⎞⎜⎜⎝
⎛= sum
isin
)()()(txgtp
sLlsllwhere
Solving the network problem
⎟⎟⎠
⎞⎜⎜⎝
⎛minus= sum
isin
)()()()(tptxwktx
dtd
sLllsss
Results converges to solution of relaxation of network
problem xs(t)Σpl(t) converges to ws
Interpretation TCP-like algorithm to iteratively solves optimal rate allocation
Source Algorithm
Source needs only its path price
kr() nonnegative nondecreasing function Above algorithm converges to unique
solution for any initial condition qr interpreted as lossmarking probability euro
˙ x r = kr (xr )(Ur (xr ) minus qr)
Proportionally-Fair Controller
If utility function is
then a controller that implements it is given by
Pricing interpretation
Can the network choose a pricing scheme to achieve fair resource allocation?
Suppose the network charges price $q_r$ (dollars per bit), where $q_r = \sum_l p_l$ over the links $l$ on route $r$
User's strategy: spend $w_r$ (dollars per second) to maximize
Optimal User Strategy
equivalently
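The expressions for this slide are missing from the transcript. As a sketch under the standard pricing formulation (an assumption here): a user who spends $w_r$ dollars per second at a price of $q_r$ dollars per bit receives rate $x_r = w_r / q_r$ and chooses $w_r$ to maximize utility minus payment. The first-order condition then reads:

```latex
\begin{align*}
&\max_{w_r \ge 0} \; U_r\!\left( \frac{w_r}{q_r} \right) - w_r \\
&\frac{1}{q_r}\, U_r'\!\left( \frac{w_r}{q_r} \right) - 1 = 0
  \quad\Longleftrightarrow\quad U_r'(x_r) = q_r,
  \qquad x_r = \frac{w_r}{q_r} \\
&\text{equivalently} \quad w_r = x_r\, U_r'(x_r)
\end{align*}
```

Under this reading the optimal user strategy coincides with the equilibrium of the source algorithm, where marginal utility equals the path price.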
Simplified TCP-Reno
suppose $x = \frac{\sqrt{2(1-p)}}{T\sqrt{p}} \approx \frac{\sqrt{2}}{T\sqrt{p}}$
then $U(x) = -\frac{1}{Tx}$
interpretation: minimize (weighted) delay
Is AIMD special?
Consider a window control as follows:
cwnd += a * cwnd^n when no loss
cwnd -= b * cwnd^m when loss
where n < m
Expected change in congestion window
Expected change in rate per unit time
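The expected change in window for this family of controls can be written down and solved for its fixed point. The sketch below assumes each update sees a loss independently with probability p (an assumption; the slide's own expressions are not in the transcript), and the function names are hypothetical.

```python
import math

def expected_drift(w, p, a, b, n, m):
    """Expected change in cwnd per update.

    Grow by a*w**n with probability 1-p, shrink by b*w**m with probability p.
    """
    return (1 - p) * a * w**n - p * b * w**m

def equilibrium_window(p, a, b, n, m):
    """Window at which the expected drift is zero; requires n < m."""
    if not n < m:
        raise ValueError("need n < m for a finite equilibrium window")
    return (a * (1 - p) / (b * p)) ** (1.0 / (m - n))

# Reno-like setting: cwnd += 1/cwnd per ACK (n = -1), cwnd -= cwnd/2 on loss (m = 1).
w_reno = equilibrium_window(p=0.01, a=1.0, b=0.5, n=-1, m=1)
```

For the Reno-like parameters the fixed point is $\sqrt{2(1-p)/p}$, and dividing by the round-trip time $T$ gives the throughput expression of the simplified TCP-Reno slide.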
MIMD (n, m)
Consider the controller
where
Then at equilibrium
where $\alpha = m - n$. For stability