Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)


Page 1: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Congestion Responsiveness of Internet Traffic

(a fresh look at an old problem)

Ravi Prasad & Constantine Dovrolis

Networking and Telecommunications Group, College of Computing, Georgia Tech

Page 2: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

TCP and Internet stability

Stable network: the offered load stays below the capacity (ρ<1)
    Otherwise, persistent packet losses
    Congestion collapse: fully utilized links, but almost zero per-flow goodput
Conventional wisdom #1: the Internet manages to be stable due to TCP congestion control
    TCP: more than 90% of Internet traffic
    TCP reduces offered load (send window) upon signs of congestion
    Negative-feedback loop, stabilizing queueing system
Conventional wisdom #2: stability can be maintained without admission control or resource reservations

Page 3: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

TCP-centric congestion control

If all flows use TCP, or TCP-friendly congestion control, then the Internet will be stable
    TCP congestion control -> no congestion collapse
    “Promoting the use of end-to-end congestion control in the Internet”, Floyd & Fall, ToN’99
    “Congestion control principles”, Floyd, RFC2914, 2000
Key modeling unit: persistent flows (they last forever!)
    “Rate control in communication networks: shadow prices, proportional fairness and stability”, Kelly et al., JORS’98
    “Congestion control for high performance, stability, and fairness in general networks”, Paganini et al., ToN’05
    Number of active flows does not change with time
    Infinitely long flows can be effectively controlled

Page 4: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Flows are generated by users/applications, not by the transport layer!

Examples: user clicks web page, p2p movie download, machine-generated periodic FS synchronization

Session: Set of finite (i.e., non-persistent) flows, generated by single user action

Key issue: session arrival process

Does the session arrival rate reduce during congestion?

[Figure: sender and receiver, each with application and transport layers, exchanging a request and a response across the network]

Page 5: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Two fundamental flow arrival models

Closed-loop model
    Fixed number of users, each user can generate one session at a time
    New session arrival: depends on completion of previous session
    E.g., ingress traffic in campus network (student downloads)
Open-loop model
    Sessions arrive in network independently of congestion
    Theoretically, infinite population of users
    E.g., egress traffic at popular Web server
Very different models in terms of congestion responsiveness & stability

[Figure: users 1, 2, 3, ..., N]

Page 6: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Related work

Open-loop traffic model
    “Statistical bandwidth sharing: a study of congestion at flow level”, Fredj et al., Sigcomm’01
    “Stability and performance analysis of networks supporting elastic services”, de Veciana et al., ToN’01
Closed-loop traffic model
    “A new method for the analysis of feedback-based protocols with applications to engineering web traffic over the Internet”, Heyman et al., Sigmetrics’99
    “Dimensioning bandwidth for elastic traffic in high-speed data networks”, Berger & Kogan, ToN’00
Main open issues:
    1. What do the previous two models imply for the congestion responsiveness of aggregate Internet traffic?
    2. Which of the previous two models is closer to real Internet traffic?

Page 7: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Our contributions

Introduce two new metrics for congestion responsiveness of aggregate Internet traffic
    Elasticity and instability coefficient
Examine congestion responsiveness of several traffic models, including open-loop, closed-loop, and mixed traffic
    Open-loop TCP traffic is less congestion responsive than even UDP traffic!
    Closed-loop traffic is more congestion responsive than persistent flows
Design experimental methodology to measure Closed-loop Traffic Ratio (CTR)
Measure CTR in several Internet packet traces
    70-90% of Internet traffic appears to be closed-loop
Several implications for networking research & practice

Page 8: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Outline

Congestion responsiveness metrics
    Elasticity
    Instability coefficient
Results for ideal Processor Sharing (PS) server
    Closed-loop flow arrival model
    Open-loop flow arrival model
Congestion responsiveness of four traffic models
    Persistent TCP flows
    UDP constant-rate streams
    Open-loop TCP flows
    Closed-loop TCP flows
Congestion responsiveness of real network traffic
    Methodology and measurements
Summary and implications

Page 9: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Elasticity metric

Quantifies the extent to which a traffic aggregate backs off upon a congestion event
    U and U': average throughput of aggregate traffic prior to and during stimulus, respectively
    Defined as fractional change in throughput

\[ f = \frac{U - U'}{U} \]

Depends on congestion event cause
    Canonical congestion event: a persistent TCP transfer (stimulus) that is not limited by the receiver’s window
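The definition above can be sketched in a few lines of Python. This is only an illustration; the function name and the sample throughput values are hypothetical and do not come from the slides.

```python
def elasticity(u_before: float, u_during: float) -> float:
    """Fractional change in aggregate throughput, f = (U - U') / U."""
    if u_before <= 0:
        raise ValueError("throughput before the stimulus must be positive")
    return (u_before - u_during) / u_before


# Hypothetical numbers: the aggregate drops from 90 to 60 Mbps during the stimulus.
print(elasticity(90.0, 60.0))
```

A result above zero means the aggregate backed off; a negative result means it sent more during the stimulus than before it.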

Page 10: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Elasticity

f=1: completely responsive
f=0: completely unresponsive

[Figure: cross-traffic and stimulus]

Page 11: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Elasticity

Positive elasticity
Negative elasticity
    When cross traffic increases its rate upon congestion

[Figure: cross-traffic and stimulus]

Page 12: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Instability Coefficient

Instability coefficient quantifies whether (and how fast) a traffic aggregate can lead to congestion collapse upon congestion at time t
    Defined as dN(t)/dt
    N(t): number of active sessions at time t
Instability coefficient ≤ 0
    Fixed or decreasing number of active sessions
    Stable network
Instability coefficient > 0
    Increasing number of active sessions
    Has the potential to cause congestion collapse
    Larger coefficient: faster move towards congestion collapse
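Given sampled counts of active sessions, dN(t)/dt could be estimated as the slope of a least-squares line through the samples. The sketch below assumes that estimator; the slides do not say how the derivative is obtained, and the function name is hypothetical.

```python
def instability_coefficient(times, active_sessions):
    """Least-squares slope of N(t) versus t, used here as an estimate of dN(t)/dt."""
    n = len(times)
    if n < 2 or n != len(active_sessions):
        raise ValueError("need two or more paired samples")
    t_mean = sum(times) / n
    n_mean = sum(active_sessions) / n
    num = sum((t - t_mean) * (a - n_mean) for t, a in zip(times, active_sessions))
    den = sum((t - t_mean) ** 2 for t in times)
    if den == 0:
        raise ValueError("sample times must not all be equal")
    return num / den


# Hypothetical samples: two additional active sessions per second.
print(instability_coefficient([0, 1, 2, 3], [10, 12, 14, 16]))
```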

Page 13: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Instability Coefficient

Simulation of a stable network: instability coefficient = 0
    Open-loop model: session arrival rate 200/sec

Page 14: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Instability Coefficient

Simulation of an unstable network: instability coefficient > 0
    Open-loop model: session arrival rate 400/sec

Page 16: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Closed-loop model – PS server

N users: cycles of transfer and idle periods
    S: average session size
    T_T: average transfer duration
    T_I: average idle time
    T_T increases during congestion
N_a: number of active sessions

\[ \rho = \frac{N S}{C\,T_I} \]

\[ E[N_a] \approx \begin{cases} \dfrac{\rho}{1-\rho}, & \rho < 1 \\[1ex] N - \dfrac{C\,T_I}{S}, & \rho > 1 \end{cases} \]

\[ R_{offered} = \frac{N S}{T_I + T_T} \]

Elasticity f = 1/(N_a+1)
Instability coefficient cannot be positive indefinitely (N_a < N)
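As a rough numerical check of the closed-loop PS expressions, the sketch below computes the load and the expected number of active sessions in the two regimes, ρ/(1-ρ) for ρ<1 and N - C·T_I/S for ρ>1. That two-regime reading of the slide's formulas is an assumption, and the function name and example values are hypothetical.

```python
def closed_loop_ps(n_users, session_size, idle_time, capacity):
    """Return (rho, expected active sessions) for a closed-loop PS server.

    rho = N*S / (C*T_I). The expected number of active sessions uses
    rho/(1-rho) for rho < 1 and N - C*T_I/S for rho > 1; rho == 1 is
    not covered by either expression, so None is returned there.
    """
    rho = n_users * session_size / (capacity * idle_time)
    if rho < 1:
        expected_active = rho / (1 - rho)
    elif rho > 1:
        expected_active = n_users - capacity * idle_time / session_size
    else:
        expected_active = None
    return rho, expected_active


# Hypothetical example: 100 users, 1 Mbit sessions, 10 s idle time, 5 Mbps link.
rho, n_active = closed_loop_ps(100, 1e6, 10.0, 5e6)
print(rho, n_active, 1 / (n_active + 1))  # load, E[N_a], elasticity 1/(N_a+1)
```

Even in the overloaded case the number of active sessions stays below N, which is why the instability coefficient cannot remain positive.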

Page 17: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Open-loop model – PS server

Poisson session arrivals
    S: average session size
    λ: session arrival rate
    Offered load ρ = λS/C
Stable only if ρ < 1
    Expected throughput for new transfer: C(1-ρ), the available bw
    Elasticity f = 0
Instability coefficient: positive if ρ > 1

\[ \rho = \frac{\lambda S}{C} \]

\[ E[T] = \frac{S}{C(1-\rho)}, \quad \rho < 1 \]

\[ R_{offered} = \lambda S \]
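The open-loop conditions can be checked numerically. The sketch below assumes that T in the formula is the transfer duration of a session of size S; the session size and capacity are hypothetical values chosen so that 200 and 400 arrivals per second fall on either side of ρ = 1.

```python
import math


def open_loop_ps(arrival_rate, session_size, capacity):
    """Return (rho, stable, expected transfer time) for an open-loop PS server."""
    rho = arrival_rate * session_size / capacity
    stable = rho < 1
    # E[T] = S / (C * (1 - rho)) holds only for rho < 1; otherwise it is unbounded.
    expected_time = session_size / (capacity * (1 - rho)) if stable else math.inf
    return rho, stable, expected_time


# Hypothetical link: 100 kbit sessions on a 25 Mbps link.
for rate in (200, 400):
    print(rate, open_loop_ps(rate, 1e5, 2.5e7))
```

Nothing in this model lowers the arrival rate when ρ exceeds 1, so the backlog of active sessions keeps growing.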

Page 18: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Mixed traffic

Internet traffic: mix of open-loop and closed-loop traffic
Mixed traffic can be characterized by Closed-loop Traffic Ratio (CTR)

\[ \mathrm{CTR} = \frac{\text{Traffic load from closed-loop model}}{\text{Total traffic load}} \]

f_mix = CTR * f_closed
Instability coefficient of the mix > 0 when ρ_open > 1
    Not when ρ_open + ρ_closed > 1
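A small sketch of the mixed-traffic relations, with hypothetical function names and load values:

```python
def closed_loop_traffic_ratio(closed_load, open_load):
    """CTR = closed-loop traffic load / total traffic load."""
    total = closed_load + open_load
    if total <= 0:
        raise ValueError("total traffic load must be positive")
    return closed_load / total


def mixed_elasticity(ctr, f_closed):
    """f_mix = CTR * f_closed; the open-loop share contributes zero elasticity."""
    return ctr * f_closed


def mix_can_grow_unbounded(rho_open):
    """Only the open-loop load decides whether the instability coefficient can stay positive."""
    return rho_open > 1


ctr = closed_loop_traffic_ratio(closed_load=80.0, open_load=20.0)
print(ctr, mixed_elasticity(ctr, f_closed=0.5), mix_can_grow_unbounded(rho_open=0.3))
```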

Page 20: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Persistent TCP transfers

N homogeneous transfers
Stimulus increases RTT and loss rate from (T,p) to (T',p')
UMass model to estimate TCP average throughput

\[ f = 1 - \frac{N M / \left(T' \sqrt{2bp'/3}\right)}{N M / \left(T \sqrt{2bp/3}\right)} = \frac{1}{N+1} \]

Number of transfers remains constant, i.e., instability coefficient = 0
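The throughput-ratio form of this elasticity can be evaluated directly. The sketch below uses the square-root throughput expression M / (T·sqrt(2bp/3)) per flow. It reads M as the segment size and b as the packets-per-ACK parameter, which is an interpretation of the slide's symbols, and all numeric values are hypothetical.

```python
import math


def tcp_throughput(mss, rtt, loss, b=1):
    """Square-root throughput estimate for one flow: M / (T * sqrt(2*b*p/3))."""
    return mss / (rtt * math.sqrt(2 * b * loss / 3))


def persistent_tcp_elasticity(n, mss, rtt, loss, rtt_after, loss_after, b=1):
    """f = (U - U') / U for N homogeneous persistent flows."""
    u_before = n * tcp_throughput(mss, rtt, loss, b)
    u_during = n * tcp_throughput(mss, rtt_after, loss_after, b)
    return (u_before - u_during) / u_before


# Hypothetical stimulus: RTT unchanged, loss rate rises from 1% to 4%.
print(persistent_tcp_elasticity(10, 1460, 0.1, 0.01, 0.1, 0.04))
```

N, M, and b cancel in the ratio, so under this model f depends only on T/T' and p/p'.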

Page 21: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Constant-rate UDP transfers

Fixed number of constant-rate flows
UDP flows do not react to congestion, and they do not retransmit lost packets
    Throughput after stimulus: U' = (1-p)U
    Elasticity f = p > 0
    Truly congestion responsive traffic should have larger elasticity than loss rate
Instability coefficient is zero
    Number of flows does not change during congestion
    Cannot cause congestion collapse

Page 22: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Open-loop TCP transfers

Poisson stream of TCP flows
    Size uniformly distributed between 16 and 20 pkts
    Arrival rate chosen to vary offered load
Ideally, f=0 when ρ<1
But negative elasticity is possible with TCP
    Redundant retransmissions
    Increased offered load after stimulus
Instability coefficient is positive when ρ>1
    Possible congestion collapse
Open-loop traffic is the net’s worst enemy

Page 23: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Closed-loop TCP transfers

When loss rate ~ 0 (i.e., small number of sessions)
    Stimulus increases RTT from T to T'
    Transfer latency increases from kT to kT'

\[ f = \frac{k(T' - T)}{kT' + T_I} \]

With small number of active sessions: elasticity about constant
With large number of active sessions: elasticity > 1/(N_a+1)
Closed-loop TCP traffic: more elastic than persistent flows
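One way to arrive at the closed-loop elasticity expression, as a sketch: assume each user's average throughput is the session size S divided by the cycle time, i.e. the transfer latency plus the idle time T_I. That throughput model is an assumption made here for illustration; the slide gives only the final expression.

```latex
\begin{align*}
U  &= \frac{S}{kT + T_I}, \qquad U' = \frac{S}{kT' + T_I} \\
f  &= \frac{U - U'}{U}
    = 1 - \frac{kT + T_I}{kT' + T_I}
    = \frac{k(T' - T)}{kT' + T_I}
\end{align*}
```

The idle time T_I appears only in the denominator, so a longer idle time gives a smaller elasticity for the same RTT increase.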

Page 24: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Summary

Traffic class     Elasticity                                    Stability
Persistent TCP    elastic, f = 1/(N+1) (N homogeneous flows)    stable
UDP const-rate    inelastic, f = p (p: loss rate)               stable
Open-loop TCP     inelastic, f ≤ 0                              unstable if ρ > 1
Closed-loop TCP   elastic, f > 1/(N_a+1)                        stable

Page 26: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

What to measure?

Direct elasticity measurements require packet traces at bottleneck during stimulus
    We have access to only a couple of such links
Direct measurements of instability coefficient require packet traces during congestion events
    We have access to only a couple of congested links
Alternative: measure CTR (closed-loop traffic ratio)
    Indirect metric for congestion responsiveness
    High CTR (close to one): mostly closed-loop traffic
    Low CTR (close to zero): mostly open-loop traffic

Page 27: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

CTR estimation (overview)

Start with packet trace from Internet link
    Per-packet: arrival time, src/dst address & ports, size
    Focus only on TCP traffic: HTTP and well-known ports
Identify users:
    Downloads: user is associated with unique DST address
    Uploads: user is associated with unique SRC address
    Multi-user hosts and NATs are a problem (see paper for details)
For each user, identify sessions:
    Session: one or more connections (“jobs”) associated with same user action
    E.g., Web page download: multiple HTTP connections
Classify sessions as open-loop or closed-loop:
    Successive sessions from same user: closed-loop
    Session from a new user, or session arriving from known user after a long idle period: open-loop

Page 28: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

From Connections to Jobs to Sessions

An HTTP 1.1 connection can stay alive across multiple sessions
Job: segment of TCP connection that belongs to a single session
Intra-job packet interarrivals: TCP and network-dependent (short)
Inter-job packet interarrivals: caused by user actions (long)
Classify interarrivals based on Silence Threshold (STH)

1105126179.423931 163.157.239.61 127.207.1.255 80 2290 1420 T 1380

1105126179.478309 163.157.239.61 127.207.1.255 80 2290 1420 T 1380

1105126179.478438 163.157.239.61 127.207.1.255 80 2290 1420 T 1380

1105126179.478554 163.157.239.61 127.207.1.255 80 2289 1420 T 1380

1105126179.488433 163.157.239.61 127.207.1.255 80 2290 1420 T 1380

1105126179.488666 163.157.239.61 127.207.1.255 80 2289 1420 T 1380

1105126179.488918 163.157.239.61 127.207.1.255 80 2289 1420 T 1380

1105126179.539748 163.157.239.61 127.207.1.255 80 2289 1420 T 1380

1105126179.539870 163.157.239.61 127.207.1.255 80 2290 1420 T 1380

1105126179.539993 163.157.239.61 127.207.1.255 80 2290 1420 T 1380

1105126179.549085 163.157.239.61 127.207.1.255 80 2290 154 T 114

1105126179.549399 163.157.239.61 127.207.1.255 80 2289 1420 T 1380

1105126179.611572 163.157.239.61 127.207.1.255 80 2290 1420 T 1380

1105126179.611702 163.157.239.61 127.207.1.255 80 2289 1420 T 1380

1105126179.612235 163.157.239.61 127.207.1.255 80 2289 1420 T 1380

1105126179.612507 163.157.239.61 127.207.1.255 80 2289 1420 T 1380

1105126179.612752 163.157.239.61 127.207.1.255 80 2290 1420 T 1380

1105126179.613121 163.157.239.61 127.207.1.255 80 2290 1420 T 1380

1105126179.672432 163.157.239.61 127.207.1.255 80 2290 1420 T 1380

[Figure annotation: the packet trace above marks inter-job gaps and intra-job gaps]
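The job-splitting step might look like the sketch below: packets of one TCP connection are cut into jobs wherever the packet interarrival exceeds the Silence Threshold. The function name, the input format, and the 1-second STH are hypothetical; the slides do not give an STH value here.

```python
def split_into_jobs(timestamps, sth):
    """Split the sorted packet timestamps of one connection into jobs.

    A gap larger than sth (an inter-job gap) starts a new job; shorter gaps
    (intra-job gaps) keep the packet in the current job.
    """
    jobs = []
    for ts in timestamps:
        if jobs and ts - jobs[-1][-1] <= sth:
            jobs[-1].append(ts)
        else:
            jobs.append([ts])
    return jobs


# Hypothetical timestamps in seconds, with an assumed STH of 1 second.
print(split_into_jobs([0.0, 0.05, 0.06, 3.0, 3.01, 10.0], sth=1.0))
```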

Page 29: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Silence Threshold (STH) estimation

[Figure: inter-job gaps vs. intra-job gaps]

Page 30: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Group jobs from same user in sessions

Intuition: jobs from same session will have short interarrivals (machine-generated)
Minimum Session Interarrival (MSI) threshold
    MSI aims to distinguish machine-generated from user-initiated events
    MSI = 1-5 seconds

[Figure: the sample packet trace from Page 28, marking inter-job and intra-job gaps. Job interarrivals below MSI (<MSI) stay in the same session; an interarrival above MSI (>MSI) starts a new one, giving session 1, session 2, and session 3.]
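Grouping one user's jobs into sessions could be sketched as follows. It assumes that the interarrival is measured between consecutive job start times and that each job is a (start, end) pair; both choices are assumptions made for illustration.

```python
def group_into_sessions(jobs, msi=1.0):
    """Group one user's jobs, given as (start, end) tuples, into sessions.

    A job starting less than msi seconds after the previous job's start is
    treated as machine-generated and joins the current session; otherwise it
    starts a new session.
    """
    sessions = []
    prev_start = None
    for job in sorted(jobs):
        if sessions and job[0] - prev_start < msi:
            sessions[-1].append(job)
        else:
            sessions.append([job])
        prev_start = job[0]
    return sessions


# Hypothetical jobs for one user (times in seconds), with MSI = 1 second.
jobs = [(0.0, 0.5), (0.2, 0.9), (5.0, 5.4), (5.3, 6.0), (20.0, 21.0)]
print([len(s) for s in group_into_sessions(jobs, msi=1.0)])
```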

Page 31: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Classify sessions as open/closed-loop

First session from a user is always open-loop
Session from a returning user is also open-loop, if it starts more than MTT seconds since completion of last session
    MTT: Maximum Think Time
    Typically, MTT would be several minutes

[Figure: the sample packet trace from Page 28, marking inter-job and intra-job gaps and the <MSI / >MSI job interarrivals. Session 1: open; session 2: open (idle gap > MTT); session 3: closed (idle gap < MTT).]
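The classification rule and a byte-weighted CTR can be sketched as below. Sessions are assumed to be (start, end, bytes) tuples for a single user, and the default MTT of 900 seconds corresponds to 15 minutes. The data layout and function names are hypothetical.

```python
def classify_sessions(sessions, mtt=900.0):
    """Label one user's sessions, given as (start, end, nbytes), as open or closed.

    The first session is open-loop. A later session is open-loop if it starts
    more than mtt seconds after the last completion, otherwise closed-loop.
    """
    labels = []
    last_end = None
    for start, end, nbytes in sorted(sessions):
        is_closed = last_end is not None and start - last_end <= mtt
        labels.append(("closed" if is_closed else "open", nbytes))
        last_end = end if last_end is None else max(last_end, end)
    return labels


def closed_loop_traffic_ratio(labels):
    """CTR = bytes in closed-loop sessions / bytes in all sessions."""
    total = sum(nbytes for _, nbytes in labels)
    if total <= 0:
        raise ValueError("no traffic to classify")
    return sum(nbytes for kind, nbytes in labels if kind == "closed") / total


# Hypothetical sessions for one user: (start s, end s, bytes).
labels = classify_sessions([(0, 10, 1000), (70, 80, 3000), (5000, 5010, 1000)])
print(labels, closed_loop_traffic_ratio(labels))
```

For a whole trace, the labels of all users would be pooled before taking the ratio.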

Page 32: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Robustness to MSI & MTT thresholds

Examined CTR variation in the following ranges:
    MSI: 0.1 sec - 2 sec
    MTT: 10 min - 25 min
CTR variation < 0.05
Linear regression:
    ΔCTR/ΔMSI = -0.0044/sec
    ΔCTR/ΔMTT = 0.0037/min
We use: MSI = 1 sec, MTT = 15 min

Page 33: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Sample CTR measurements

                                                   TCP         HTTP download     Well-known ports
Link location           Year  Direction  Duration  GB (%)      Bytes (%)  CTR    Bytes (%)  CTR
Georgia Tech            05    In         2 hr      129 (97)    44.7       0.90   18.8       0.60
Georgia Tech            05    Out        2 hr      208 (99)    37.3       0.63   10.6       0.70
Los Nettos              04    Core       1 hr      59 (95)     36.2       0.93   29.3       0.83
UNC, Chapel Hill        03    In         1 hr      41 (87)     22.9       0.95   3.6        0.69
UNC, Chapel Hill        03    Out        1 hr      153 (97)    19.0       0.76   16.8       0.91
Abilene, Indianapolis   02    Core       1 hr      172 (96)    8.0        0.78   33.9       0.91
Abilene, Indianapolis   02    Core       1 hr      178 (85)    11.5       0.82   35.8       0.89
Univ. of Auckland, NZ   01    In         6 hr      0.6 (95)    42.4       0.92   30.6       0.24
Univ. of Auckland, NZ   01    Out        6 hr      1.4 (98)    70.4       0.79   7.6        0.72

Page 35: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Summary

Persistent transfers have very different congestion responsiveness than finite-size transfers
    Focus on open-loop and closed-loop flow arrivals
TCP or TCP-like protocols are not sufficient to avoid congestion collapse
Negative feedback at session/application layer holds the key to network stability
Measurements show high CTR values for most Internet links we examined
    Possibly why the Internet is mostly stable

Page 36: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Is AQM an effective controller?

Active Queue Management (AQM)
    Most AQM models assume persistent TCP flows
    Provides congestion signal to flows
    Stabilizes buffer occupancy
    Controls link utilization
However, AQM is an ineffective controller in the presence of open-loop TCP traffic
    Flow arrival process does not react to AQM drops
    Congestion collapse still possible with AQM

Page 37: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Is admission control necessary?

Admission control is an effective way to control the offered load with open-loop traffic
    Avoids flow aborts and reattempts
    See proposals by J. Roberts and others
However, admission control is not required with closed-loop traffic
    Closed-loop traffic is self-regulating
    As long as the maximum possible number of active sessions does not exceed a certain threshold

Page 38: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

What about TCP-friendliness?

“TCP friendliness” has been proposed for all non-TCP traffic as a way to avoid congestion collapse
    However, like TCP, open-loop TCP-friendly sessions can still cause congestion collapse
TCP friendliness is more important for fairness reasons (share bw almost equally with TCP)

Page 39: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Traffic models for simulations-analysis

Time to drop the persistent flows assumption!
    It is not realistic
    It has very different congestion responsiveness than real Internet traffic
More realistic aggregate traffic models:
    Mix of both open-loop and closed-loop finite-size sessions
    We need more CTR measurements to characterize the mix
    We need mathematical models for closed-loop traffic behavior, considering user behavior under congestion

Page 40: Congestion Responsiveness of Internet Traffic (a fresh look at an old problem)

Session/application congestion control

Several existing applications generate sessions independent of network congestion (bad!)
    Example-1: NNTP servers transfer news periodically
    Example-2: CDN servers exchange content as needed or periodically
Client-side control mechanism:
    Do not start new session before current session completes
Server-side control mechanism:
    Use admission control when number of active sessions exceeds threshold
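A server-side mechanism of this kind could be as simple as a counter with a cap. The class below is a hypothetical sketch rather than a mechanism taken from the slides; the threshold value is chosen arbitrarily.

```python
class SessionAdmissionController:
    """Reject new sessions once the number of active sessions reaches a threshold."""

    def __init__(self, max_active):
        if max_active < 1:
            raise ValueError("max_active must be at least 1")
        self.max_active = max_active
        self.active = 0

    def try_admit(self):
        """Admit a new session if there is room; return whether it was admitted."""
        if self.active >= self.max_active:
            return False
        self.active += 1
        return True

    def release(self):
        """Record that an admitted session has completed."""
        if self.active == 0:
            raise RuntimeError("no active session to release")
        self.active -= 1


# Hypothetical threshold of two concurrent sessions.
controller = SessionAdmissionController(max_active=2)
print([controller.try_admit() for _ in range(3)])
```

The client-side rule is the special case of one client with a threshold of one: a new session starts only after the current one is released.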