State-Slice: New Paradigm of Multi-query Optimization of Window-based Stream Queries
State-Slice: New Paradigm of Multi-query Optimization of Window-based Stream Queries
Song Wang, Elke Rundensteiner
Database Systems Research Group, Worcester Polytechnic Institute, Worcester, MA, USA
Samrat Ganguly, Sudeept Bhatnagar
NEC Laboratories America Inc., Princeton, NJ, USA
Computation Sharing for Stream Processing
Register continuous queries; streaming data in, streaming results out.
New Challenges:
- In-memory processing of stateful operators
- Stateful operators with various window constraints
[Figure: an SPJA query network of Select, Project, Join, and Agg operators with windows w1, w2, w3]
Window Constraints for Stateful Operators
Time-based sliding window constraints:
- Each tuple has a timestamp
- Only tuples within the window timeframe W can form an output
Observations:
- States in the operator dominate memory usage
- State size is proportional to the input rate and window length
- Join CPU cost is proportional to the state size
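The observations above can be seen in a minimal sketch (my own illustration, not the paper's implementation) of one probe direction of a time-based sliding-window join: each arriving B tuple admits newly arrived A tuples, purges A tuples that fell out of the window W, and probes what remains. The retained state is exactly what grows with input rate and window length.

```python
from collections import deque

# Minimal sketch of one direction of a time-based sliding-window join.
# Tuples are (timestamp, key) pairs; both streams are timestamp-ordered.

def window_join(a_stream, b_stream, W):
    a_state = deque()                      # A tuples inside the window
    results = []
    a_iter = iter(a_stream)
    pending = next(a_iter, None)
    for b_ts, b_key in b_stream:
        while pending is not None and pending[0] <= b_ts:
            a_state.append(pending)        # admit A tuples up to b's time
            pending = next(a_iter, None)
        while a_state and b_ts - a_state[0][0] > W:
            a_state.popleft()              # purge expired A tuples
        results.extend((a, (b_ts, b_key))  # probe the surviving state
                       for a in a_state if a[1] == b_key)
    return results
```

Shrinking W here directly shrinks both the retained state and the number of tuples probed per input, which is the proportionality the slide points out.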
A Motivation Example
Q1: SELECT A.* FROM Temperature A, Humidity B
    WHERE A.LocationId = B.LocationId WINDOW w1 min
Q2: SELECT A.* FROM Temperature A, Humidity B
    WHERE A.LocationId = B.LocationId AND A.Value > Threshold WINDOW w2 min
Let w1 < w2. Observations:
- State A[w1] overlaps with state A[w2]
- State B[w1] overlaps with state B[w2]
- Joined results of Q1 and Q2 overlap
Sharing with Selection Pull-up [CDF02, HFA+03]
- Pull the selection up above the shared join
- Run the shared join with the larger window (w2)
- A router dispatches each result tuple to Q1 and/or Q2 by the timestamp difference |Ta - Tb|
Sharing with Selection Pull-up [CDF02, HFA+03]
Pros:
- Single join operator
Cons:
- Wasted computation without early filtering
- Wasted state memory without early filtering
- Per output-tuple routing cost
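The routing cost in the cons list can be made concrete with a small sketch of the post-join router: since the shared join runs under the larger window w2, every output pair must be checked against Q1's smaller window w1, and against Q2's selection (applied only after the join). The (timestamp, value) tuple layout and the threshold constant are my illustration choices, not from the paper.

```python
# Sketch of the per output-tuple router used by selection pull-up.
# Each joined pair (a, b) holds (timestamp, value) tuples; w1 is the
# smaller window and threshold is Q2's predicate constant.

def route(joined, w1, threshold):
    q1_out, q2_out = [], []
    for a, b in joined:
        if abs(a[0] - b[0]) <= w1:   # pair also fits Q1's smaller window
            q1_out.append((a, b))
        if a[1] > threshold:         # Q2's selection, applied after the join
            q2_out.append((a, b))
    return q1_out, q2_out
```

Every joined pair pays this routing check, and pairs that end up in neither output were pure wasted join work, which is exactly what early filtering would have avoided.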
Stream Partition with Selection Pushdown [KFH04]
- Split stream A by A.Value at the Threshold (partition A1 feeds one join, the rest feeds the other)
- Route the shared join results to the corresponding queries
Stream Partition with Selection Pushdown [KFH04]
Pros:
- Selection pushdown: no wasted join computation
Cons:
- Multiple join operators
- Duplicated state memory in multiple join operators
- Per output-tuple routing cost
State-Slice: New Sharing Paradigm
Key Ideas:
- State-slice concept for the sliding window join
- Pipelined chain of join slices
Prospective Benefits:
- Fine-grained selection push-down
- Pipelined join operators
- No per-tuple routing cost
One-way State-Sliced Window Join
Sliced sliding window with lower and upper bounds: [w1, w2]
A B tuple only probes those A tuples that are older than itself by at least w1 but at most w2.
The Chain of One-way State-Sliced Joins
- Split the state memory into a chain of joins
- No overlap of state memory among the joins in the chain
- Joined-Result = the union of the per-slice results, e.g. A[0, w2] ⋈ B = (A[0, w1] ⋈ B) ∪ (A[w1, w2] ⋈ B)
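The union claim above can be checked with a small batch sketch (mine, not the paper's operator code): slicing the window by tuple age partitions the A state, so probing every slice and unioning the outputs reproduces the full-window join. The half-open boundary convention is an illustration choice.

```python
# Sketch: a chain of one-way state slices partitions the window [0, w)
# by tuple age, so the union of the slice outputs equals the ordinary
# full-window join. Tuples are (timestamp, key) pairs.

def slice_probe(a_tuples, b_tuple, lo, hi):
    """One slice: join b with A tuples whose age is in [lo, hi)."""
    b_ts, b_key = b_tuple
    return [((a_ts, a_key), b_tuple) for a_ts, a_key in a_tuples
            if a_key == b_key and lo <= b_ts - a_ts < hi]

def chain_probe(a_tuples, b_tuple, boundaries):
    """Probe each slice [w_i, w_{i+1}) in turn and union the results."""
    out = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        out.extend(slice_probe(a_tuples, b_tuple, lo, hi))
    return out
```

With boundaries [0, w1, w2], `chain_probe` yields exactly the same pairs as a single probe over [0, w2), and no A tuple is held in more than one slice's state.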
From One-way to Two-way Binary Join
- Intuitively a combination of two one-way joins
- Two references for each A or B tuple:
  - Male tuples are used to probe the states
  - Female tuples are inserted into, and cross-purged from, the respective states
[Figure: J1 holds state of stream A: [0, w1] and state of stream B: [0, w1]; J2 holds state of stream A: [w1, w2] and state of stream B: [w1, w2]; queues connect J1 to J2, and a Union operator produces the Joined-Result]
State-Sliced Join Chain: The Example
- States of sliced joins in a chain are disjoint from each other → minimized state memory usage
- Selection can be pushed down into the middle of the join chain → no unnecessary resource waste
- No routing step is needed → per output-tuple routing cost avoided completely
Summary: State-Sliced Join Chain
Pros:
- Minimized memory usage
- Reduced routing cost
- No need for operator synchronization in the chain
Cons:
- Stream traffic between the pipelined joins
- Purge cost
Sharing via Chains: Memory-Optimal Chain
- No selection: [figure]
- With selection: [figure]
Mem-Optimal Chain = CPU-Optimal Chain?
Overheads:
- Too many operators may increase system context-switch cost
- Too many sliced states increase purging cost
Merging Sliced Joins
Tradeoff:
Gain from merging:
- Reduce the number of join operators
- Reduce the extra purging cost
Loss from merging:
- Introduce routing cost
- Increase memory usage due to selection pull-up
A cost model for CPU usage guides the choice.
CPU-Opt. Chain: Search Space & Solution
[Figure: vertices v0, v1, ..., v5 placed at window boundaries w0, w1, ..., w5]
Legend:
- vi: a window start/end time
- vi to vj: one slice window
The search is formulated as a shortest-path problem.
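The shortest-path formulation can be sketched as follows: each vertex is a window boundary, each edge (vi, vj) stands for one merged slice covering [wi, wj], and the cheapest path from v0 to vk is the CPU-optimal chain. The `cost(i, j)` function below is a hypothetical stand-in for the paper's cost model, and the Dijkstra-based search is my illustration of the idea, not the paper's code.

```python
import heapq

# Sketch: choosing which slices to merge as a shortest path. Vertices
# are the distinct window boundaries w0 < w1 < ... < wk; an edge (i, j)
# stands for one merged slice covering [w_i, w_j], weighted by cost(i, j).

def cpu_optimal_chain(boundaries, cost):
    k = len(boundaries) - 1
    dist = [float('inf')] * (k + 1)
    prev = [None] * (k + 1)
    dist[0] = 0.0
    pq = [(0.0, 0)]
    while pq:
        d, i = heapq.heappop(pq)
        if d > dist[i]:
            continue                     # stale heap entry
        for j in range(i + 1, k + 1):    # any later boundary ends a slice
            nd = d + cost(i, j)
            if nd < dist[j]:
                dist[j], prev[j] = nd, i
                heapq.heappush(pq, (nd, j))
    chain, j = [], k                     # walk back from v_k to recover
    while prev[j] is not None:           # the chosen slice windows
        chain.append((boundaries[prev[j]], boundaries[j]))
        j = prev[j]
    return dist[k], chain[::-1]
```

A cost model dominated by per-operator overhead drives the search toward one merged slice, while a model dominated by per-slice state size drives it toward fine slicing, matching the tradeoff on the previous slide.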
Summary: Mem-Opt. vs. CPU-Opt. Join Chain
Mem-Optimal:
- Minimized memory usage
- Higher system overhead
- Higher purging cost
CPU-Optimal:
- Minimized CPU usage
- More memory usage if selection is pulled up to merge slices
Experimental WPI Stream Engine: CAPE
CAPE Query Engine
[Figure: engine architecture with stream/query registration, a GUI, and multiple registered queries]
Experiment Study 1: Memory Consumption
Experiment Study 2: Total Service Rate
Experiment Study 3: Mem-Opt. vs. CPU-Opt.
Window distributions used for 12 queries.
[Charts: Small-Large window distribution, with 12 queries and with 24 queries]
Conclusion
- Pipelined state-sliced join chain
- Mem-Optimal chain construction
- CPU-Optimal chain construction
- Implemented in CAPE
- Performance evaluation
Thank You! Visit the CAPE Homepage
CRI grant CNS 05-51584
Speaker notes:
The title of this work is the one shown on the slide. The authors are from WPI and from NEC. This work is presented by
Computation sharing is a well-known technique for scalable multiple-query optimization. In a DSMS, multiple continuous queries may share the input streams and query results. (click)
Computation sharing brings many challenges for query processing. Memory must be considered, since sharing in-memory stateful operators may require a large amount of memory. The stateful operators, e.g. joins, may have various window constraints. Our proposed approach aims to tackle this computation-sharing problem in streaming multiple-query optimization. This is a typical stateful operator: a sliding window join. (click) We have the following important observations: (1) memory usage is dominated by the states in the operator; (2) state size is proportional to the input rate and window length; (3) join CPU cost is proportional to the state size.
Let's see an example first. Q1 and Q2 on the left-hand side join the same streams. There are two differences between them: first, they have different window constraints, w1 for Q1 and w2 for Q2; second, Q2 has a predicate on stream A. On the right-hand side are the query plans of Q1 and Q2, respectively. (click) We can see that the states of the two plans overlap, and so do their joined results. So, from this example we observe that sharing of the state buffers and the join results is possible. There are many solutions for tackling the computation-sharing problem; let's first review some of the state-of-the-art sharing paradigms. The approach on this slide is called selection pull-up. After pulling up the selections of Q2, Q1 and Q2 can be shared. (click, click — two clicks here!) The maximum window size of Q1 and Q2 is picked. (click) A router is added; the router dispatches result tuples according to the timestamp difference Ta - Tb. (click) Also, the selection is applied last for Q2. (click)
(Five clicks in total.) Let's look at the advantages and disadvantages of the selection pull-up approach.
The good side is a single join operator. However, without early filtering by the predicates, it wastes computation and state memory, and it incurs a per output-tuple routing cost (the sentence above covers points 1, 2, and 3). Another approach to sharing the computation between Q1 and Q2 is to partition the input streams and push the selections down. (click) As the figure on the right-hand side shows, stream A is partitioned by the threshold and fed to the two join operators. (click) The shared join results need to be routed to the corresponding queries and unioned together. (click) Stream partitioning with selection pushdown has the following properties. The pro is that selection is pushed down, so no join computation is wasted. The cons are multiple join operators, duplicated state memory across them, and per output-tuple routing cost.
Now we propose a new paradigm for computation sharing. The key ideas are the state-slice concept for the sliding window join and the pipelined chain of join slices. We expect fine-grained selection push-down, pipelined join operators, and no per-tuple routing cost. We will show the details in the next couple of slides. We first use the one-way join to show the state-slice concept. For a state-sliced join operator, the window constraint has two parameters: a start window (e.g. w1) and an end window (e.g. w2). The state-sliced window differs from other sliding window joins in that it introduces a lower bound on the window constraint. That is, the state of this join operator will only hold A tuples that are older than the probing B tuple by at least w1, but at most w2. Adding the lower bound of the window does not require extra timestamp checking. This holds because all the state-sliced joins will be used in a chain: the end window of the previous join is exactly the start window of the next join. Each B tuple will first purge and probe the A tuples in the window [0, w1], then purge and probe the A tuples in the window [w1, w2]. The correctness of using this chain of state-sliced joins is proved in the paper. This figure shows the binary state-sliced window join. We need two references for each input tuple. One is called the male tuple; male tuples are used to purge and probe the states. The other is called the female tuple, which is inserted into the join state and stays there.
Back to the example queries Q1 and Q2. Using the state-slice concept, the shared query plan is shown on the right-hand side. Two state-sliced joins are used, and the selection is pushed down between them. We can see that the sliced states are disjoint, the selection filters tuples in the middle of the chain, and no routing step is needed. The state-sliced join chain has the following properties. The benefits are minimized memory usage, reduced routing cost, and no need for operator synchronization in the chain. The costs are the stream traffic between the pipelined joins and the purge cost. For a centralized system, the doubled traffic may not be a problem, since all tuple accesses are in-memory. However, if the chain is distributed over multiple machines, the traffic is a tradeoff to be considered. Given a set of queries, Q1 to QN, this slide shows how to construct a sharing chain by slicing the states as