Near Optimal Streaming algorithms for Graph Spanners Surender Baswana IIT Kanpur.

Near Optimal Streaming algorithms for Graph Spanners

Surender Baswana

IIT Kanpur

Graph spanner :

a sparse subgraph that approximately preserves all-pairs distances.

t-spanner

G=(V,E) : an undirected graph, |V|=n, |E|=m, t > 1

δ(u,v) : distance between u and v in G.

A t-spanner is a subgraph G_S = (V,E_S), where E_S is a subset of E, such that

for all u,v ∈ V,

δ(u,v) ≤ δ_S(u,v) ≤ t δ(u,v)

where δ_S(u,v) is the distance between u and v in G_S.

t : stretch of the spanner.
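The definition can be checked directly on small unweighted graphs. The following is a minimal sketch, not part of the talk: the function names `bfs`, `adjacency` and `is_t_spanner` are illustrative, vertices are assumed to be numbered 0..n-1, and edges are assumed to be given as pairs.

```python
from collections import deque

def bfs(adj, s):
    """Unweighted distances from s; unreachable vertices stay at infinity."""
    dist = [float("inf")] * len(adj)
    dist[s] = 0
    q = deque([s])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if dist[y] == float("inf"):
                dist[y] = dist[x] + 1
                q.append(y)
    return dist

def adjacency(n, edges):
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj

def is_t_spanner(n, edges, spanner_edges, t):
    """True if spanner_edges is a subset of edges and delta_S <= t * delta for all pairs."""
    if not {frozenset(e) for e in spanner_edges} <= {frozenset(e) for e in edges}:
        return False
    g, gs = adjacency(n, edges), adjacency(n, spanner_edges)
    for u in range(n):
        d, ds = bfs(g, u), bfs(gs, u)
        if any(ds[v] > t * d[v] for v in range(n)):
            return False
    return True
```

For example, dropping one edge of a 4-cycle leaves a path: the dropped edge is replaced by a detour of length 3, so the path is a 3-spanner of the cycle but not a 2-spanner.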

Sparseness versus stretch

• Consider a graph modeling some network

• Edges correspond to possible links.

• Each edge has certain cost.

Aim : to select as few edges as possible without increasing the pairwise distances too much.

t-spanner

• Computing a t-spanner of the smallest possible size is NP-hard.

• For a graph on n vertices, how large can a t-spanner be?

A 2-spanner may require Ω(n^2) edges.
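One standard witness for the Ω(n^2) bound is the complete bipartite graph; this choice is an assumption here, since the slide's figure is not recoverable. The sketch below (helper names are illustrative) removes a single edge (u,v) from a complete bipartite graph with a vertices on each side and measures the resulting detour.

```python
from collections import deque

def bfs_dist(adj, s, t):
    """Unweighted distance from s to t, or infinity if t is unreachable."""
    dist = {s: 0}
    q = deque([s])
    while q:
        x = q.popleft()
        if x == t:
            return dist[x]
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                q.append(y)
    return float("inf")

def complete_bipartite(a):
    """Left side is 0..a-1, right side is a..2a-1."""
    adj = {x: set() for x in range(2 * a)}
    for u in range(a):
        for v in range(a, 2 * a):
            adj[u].add(v)
            adj[v].add(u)
    return adj

def detour_after_removal(a, u, v):
    """Distance between u and v once the edge (u,v) has been removed."""
    adj = complete_bipartite(a)
    adj[u].discard(v)
    adj[v].discard(u)
    return bfs_dist(adj, u, v)
```

For a ≥ 2 the detour has length 3, which exceeds stretch 2. So no edge can be dropped, and the only 2-spanner of this graph is the graph itself, with a^2 = n^2/4 edges.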

t-spanner

• [Erdős 1963; Bollobás; Bondy & Simonovits]

“There are graphs on n vertices for which every 2k-spanner or (2k-1)-spanner has Ω(n^{1+1/k}) edges.”

G = (V,E) → ALGORITHM → G_S = (V,E_S), where |E_S| = O(n^{1+1/k}) and G_S is a (2k-1)-spanner

Algorithms for t-spanner (RAM model)

Reference               Stretch   Size               Running time       Type
Das et al., 1991        2k-1      O(n^{1+1/k})       O(m n^{1+1/k})     Deterministic
Roditty et al., 2004    2k-1      O(n^{1+1/k})       O(n^{2+1/k})       Deterministic
B & Sen, 2003           2k-1      O(k n^{1+1/k})     O(km)              Randomized
Roditty et al., 2005    2k-1      O(k n^{1+1/k})     O(km)              Deterministic

• avoids distance computation altogether.

• near optimal algorithms in parallel, external-memory, and distributed environments

Computing a t-spanner in streaming environment

Input : n, m, k, and a stream of edges of an unweighted graph

Aim : to compute a (2k-1)-spanner

Algo 1 : Streaming model

Efficiency measures :

1. number of passes : 1

2. space (memory) required : O(k n^{1+1/k})

3. time to process the entire stream : O(m)

Computing a t-spanner in streaming environment

Input : n, m, k, and a stream of edges of an unweighted graph

Aim : to compute a (2k-1)-spanner

[Feigenbaum et al., SODA 2005]

Efficiency measures :

1. number of passes : 1

2. space (memory) required : O(k n^{1+1/k}) for (2k+1)-spanner

3. time to process the entire stream : O(m n^{1/k})

Computing a t-spanner in streaming environment

Input : n, m, k, and a stream of edges of a weighted graph

Aim : to compute a (2k-1)-spanner

Algo 2 : StreamSort model

Efficiency measures :

1. number of passes : O(k)

2. working memory required : O(log n) bits

3. time spent in one stream pass : O(m)

Relation to previous results

[Diagram relating B. & Sen, 2003 and Feigenbaum et al., 2005 to Algo 1 and Algo 2]

• slightly different hierarchy

• simple buffering technique

Algorithm 1

Intuition

[Figure: a vertex u and its spanner edges]

Cluster

[Figure: a cluster, with vertices labeled u, v and o]

C(x) : center of cluster containing x

Radius : maximum distance from center to a vertex in the cluster

Clustering : a set of disjoint clusters

Preprocessing : Clustering for the initial (empty) graph

Sampling probability = n^{-1/k}

[Figure: a hierarchy of levels 0, 1, 2, ..., k-1, k. The number of clusters shown beside the levels is n at level 0, n^{1-1/k} at level 1, n^{1-2/k} at level 2, ..., n^{1/k} at level k-1, and 0 at level k.]
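The preprocessing step can be sketched as repeated sampling. This is a minimal illustration under the assumption that, for the empty graph, every vertex is its own cluster and that the clusters of level j+1 are drawn from those of level j independently with probability n^{-1/k}; the function name `sample_hierarchy` and the `seed` parameter are not from the talk.

```python
import random

def sample_hierarchy(n, k, seed=0):
    """Return lists S_0, S_1, ..., S_k of cluster centers, one list per level."""
    rng = random.Random(seed)
    p = n ** (-1.0 / k)                  # sampling probability n^{-1/k}
    levels = [list(range(n))]            # S_0: every vertex is a singleton cluster
    for _ in range(1, k):
        # each cluster of the previous level survives with probability p
        levels.append([v for v in levels[-1] if rng.random() < p])
    levels.append([])                    # S_k is empty: nothing is sampled for level k
    return levels
```

Under this assumption the expected size of level j is n · n^{-j/k} = n^{1-j/k}, which matches the sizes n, n^{1-1/k}, n^{1-2/k}, ..., n^{1/k} listed for levels 0 to k-1.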

Processing the stream of edges

• Each vertex u at level i < k-1 wishes to move to higher levels.

Condition for upward movement :

“there is an edge (u,v) such that C_i(v) is a sampled cluster”

[Figure: animation over levels 0, 1, 2, ..., k-1, k, with vertices labeled u, v, x and y]

From perspective of a vertex u …

[Figure: vertex u at level i, with vertices x and y, and level i+1]

Processing an edge (u,v)

If C_i(v) is a sampled cluster : C_{i+1}(u) ← C_{i+1}(v);
    add (u,v) to spanner;
    u moves to level i+1 (or even higher)
Else if C_i(v) was not adjacent to u earlier :
    add edge (u,v) to spanner;
Else Discard (u,v)
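The three cases above can be sketched as a single pass over the stream. This is an illustrative reading of the slide, not the author's implementation, and it makes several assumptions: the graph is simple, vertices are numbered 0..n-1, each edge is processed from the endpoint at the lower level, and a hash set `seen[u]` records which clusters are already adjacent to u at its current level. The names `stream_spanner`, `top`, `centers` and `seen` are introduced here.

```python
import random

def stream_spanner(n, k, edges, seed=0):
    """One pass over `edges` (pairs of vertices in range(n)); returns spanner edges."""
    rng = random.Random(seed)
    p = n ** (-1.0 / k)                      # sampling probability n^{-1/k}
    # top[c]: highest level at which c is the center of a cluster.
    # Nothing is sampled for level k, so top[c] <= k-1.
    top = []
    for _ in range(n):
        t = 0
        while t < k - 1 and rng.random() < p:
            t += 1
        top.append(t)
    # centers[u][j] plays the role of C_j(u); the level of u is len(centers[u]) - 1.
    centers = [[u] * (top[u] + 1) for u in range(n)]
    # seen[u]: centers of clusters already joined to u by a spanner edge
    # while u has been at its current level.
    seen = [set() for _ in range(n)]
    spanner = []
    for a, b in edges:
        # Process the edge from the endpoint at the lower level.
        u, v = (a, b) if len(centers[a]) <= len(centers[b]) else (b, a)
        i = len(centers[u]) - 1
        c = centers[v][i]                    # center of C_i(v)
        if top[c] > i:
            # C_i(v) is a sampled cluster: u joins it and climbs to level top[c].
            centers[u].extend([c] * (top[c] - i))
            seen[u] = set()                  # adjacency information is per level
            spanner.append((u, v))
        elif c not in seen[u]:
            # First edge from u to this cluster at level i: keep it.
            seen[u].add(c)
            spanner.append((u, v))
        # Otherwise (u,v) is discarded.
    return spanner
```

In this sketch a discarded edge (u,v) always has a replacement path: u already has a spanner edge into the cluster of v at level i, and every vertex of a level-i cluster is within distance i of its center, so the detour has length at most 2i+1.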

Size and stretch of spanner

• Expected number of spanner edges contributed by a vertex = O(k n^{1/k}).

• Radius of a cluster at level i is at most i.

For each edge discarded at level i, there is a path in the spanner of length at most 2i+1.

A single pass streaming algorithm

A (2k-1)-spanner of expected size O(k n^{1+1/k})
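The size bound can be derived in a few lines. This is a sketch under the assumption that each cluster is sampled independently with probability p = n^{-1/k} and that the order of the stream does not depend on the sampling; the symbol X_i, the number of spanner edges a vertex adds while at level i, is introduced here for illustration.

```latex
% A vertex at level i < k-1 adds its j-th edge only if the first j-1
% clusters it met at that level were all unsampled:
\[
  \mathbb{E}[X_i] \;\le\; \sum_{j \ge 1} (1-p)^{j-1} \;=\; \frac{1}{p} \;=\; n^{1/k},
  \qquad 0 \le i < k-1 .
\]
% At level k-1 there is no further sampling, but the expected number of
% clusters is n \cdot p^{k-1} = n^{1/k}, so \mathbb{E}[X_{k-1}] \le n^{1/k}.
\[
  \mathbb{E}\Big[\sum_{i=0}^{k-1} X_i\Big] \;\le\; k\, n^{1/k}
  \quad\Longrightarrow\quad
  \mathbb{E}\big[|E_S|\big] \;\le\; n \cdot k\, n^{1/k} \;=\; k\, n^{1+1/k}.
\]
% Stretch: a discarded edge at level i \le k-1 has a detour of length at most
% 2i+1 \le 2(k-1)+1 = 2k-1.
```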

Running time of the algorithm

If C_i(v) is a sampled cluster : C_{i+1}(u) ← C_{i+1}(v);
    add (u,v) to spanner;
    u moves to level i+1 (or even higher)
Else if C_i(v) was not adjacent to u earlier :    (this test takes Θ(n^{1/k}) time)
    add edge (u,v) to spanner;
Else Discard (u,v)

Slight modification

• Each vertex u keeps two buffers for storing edges incident from clusters at its present level:

1. Temp(u)

2. E_S(u)

• Whenever u moves to a higher level, move all the edges of Temp(u) and E_S(u) to the spanner.

Modified algorithm

If C_i(v) is a sampled cluster : C_{i+1}(u) ← C_{i+1}(v);
    add (u,v) to spanner;
    u moves to level i+1 (or even higher)
Else add (u,v) to Temp(u), and Prune(u) if |Temp(u)| ≥ |E_S(u)|

[Figures: adding edges to Temp(u); Prune(u)]

Time complexity analysis

• Prune(u) can be executed in O(|Temp(u)| + |E_S(u)|) time using an auxiliary O(n) space.

• Prune(u) is executed only when |Temp(u)| ≥ |E_S(u)|

• We can charge O(1) cost to each edge in Temp(u).

• An edge is processed in Temp(u) at most once.

Total time spent in processing the stream = O(m)
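The slides do not spell out Prune(u). One plausible reading, given here as an assumption, is that it merges Temp(u) into E_S(u) while keeping a single edge per neighbouring cluster, using a shared boolean array of size n as the auxiliary space. In the sketch below the buffers hold the other endpoint v of each edge (u,v), and `cluster_of[v]` is a hypothetical lookup giving the center of the cluster of v at the current level of u; the names `prune` and `buffer_edge` are illustrative.

```python
def prune(temp, es, cluster_of, mark):
    """Merge temp into es, keeping one edge per neighbouring cluster.

    Runs in O(len(temp) + len(es)) time; `mark` is a reusable boolean
    array of size n that is all False on entry and on exit.
    """
    for v in es:
        mark[cluster_of[v]] = True       # clusters already covered by es
    for v in temp:
        c = cluster_of[v]
        if not mark[c]:                  # first edge into this cluster
            mark[c] = True
            es.append(v)
    for v in es:
        mark[cluster_of[v]] = False      # reset so the array can be shared
    temp.clear()

def buffer_edge(v, temp, es, cluster_of, mark):
    """Buffer the edge to v and prune once temp has caught up with es."""
    temp.append(v)
    if len(temp) >= len(es):
        prune(temp, es, cluster_of, mark)
```

Because pruning runs only when |Temp(u)| ≥ |E_S(u)|, its cost is at most proportional to 2|Temp(u)|, and every buffered edge is emptied out of Temp(u) by that same call. This is the O(1) charge per edge used in the analysis above.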

Size of (2k-1)-spanner

• Expected size of E_S(u) = O(n^{1/k})

• |Temp(u)| never exceeds |E_S(u)| + 1.

Expected size of (2k-1)-spanner is O(k n^{1+1/k})

Conclusion

THEOREM 1 :

Given any k ∈ N, a (2k-1)-spanner of expected size O(k n^{1+1/k}) for any unweighted graph can be computed in one Stream pass with O(m) time to process the entire stream of edges.

THEOREM 2 :

Given any k ∈ N, a (2k-1)-spanner of expected size O(k n^{1+1/k}) for any weighted graph can be computed in O(k) StreamSort passes with O(log n) bits of working memory.