QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor...

46
1/36 On Tarski’s Relation Algebra QUERYING TREES AND CHAINS and THE SEMI-JOIN ALGEBRA Jelle Hellings

Transcript of QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor...

Page 1: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

1/36

On Tarski’s Relation Algebra

QUERYING TREES AND CHAINS

and

THE SEMI-JOIN ALGEBRA

Jelle Hellings

Page 2: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

2/36

Part I

On Tarski’s Relation Algebra

INTRODUCTION

Page 3: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

3/36

Relation algebra: graphs and queries

Alice

Bob

Carol

ParentOfParentOf

Dan

ParentOf

Faythe

ParentOf

Grace

ParentOf

PeggyFriendOf FriendOf

Victor

FriendOf

WorksWith

WendyFriendOf

FriendsOfGgp = π1[ParentOf ◦ ParentOf ◦ ParentOf ] ◦ FriendOf

Page 4: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

3/36

Relation algebra: graphs and queries

Alice

Bob

Carol

ParentOfParentOf

Dan

ParentOf

Faythe

ParentOf

Grace

ParentOf

PeggyFriendOf FriendOf

Victor

FriendOf

WorksWith

WendyFriendOf

FriendsOfGgp = π1[ParentOf ◦ ParentOf ◦ ParentOf ] ◦ FriendOf

Page 5: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

3/36

Relation algebra: graphs and queries

Alice

Bob

Carol

ParentOfParentOf

Dan

ParentOf

Faythe

ParentOf

Grace

ParentOf

PeggyFriendOf FriendOf

Victor

FriendOf

WorksWith

WendyFriendOf

FriendsOfGgp = π1[ParentOf ◦ ParentOf ◦ ParentOf ] ◦ FriendOf

Page 6: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

3/36

Relation algebra: graphs and queries

Alice

Bob

Carol

ParentOfParentOf

Dan

ParentOf

Faythe

ParentOf

Grace

ParentOf

PeggyFriendOf FriendOf

Victor

FriendOf

WorksWith

WendyFriendOf

FriendsOfGgp = π1[ParentOf ◦ ParentOf ◦ ParentOf ] ◦ FriendOf

Page 7: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

3/36

Relation algebra: graphs and queries

Alice

Bob

Carol

ParentOfParentOf

Dan

ParentOf

Faythe

ParentOf

Grace

ParentOf

PeggyFriendOf FriendOf

Victor

FriendOf

WorksWith

WendyFriendOf

FriendsOfGgp = π1[ParentOf ◦ ParentOf ◦ ParentOf ] ◦ FriendOf

Page 8: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

4/36

Relation algebra: other operators

ChildOf = ParentOf a

AcquaintanceOf = FriendOf ∪WorksWith

WorkFriendOf = FriendOf ∩WorksWith

NonWorkFriend = FriendOf −WorksWith

Parent = π1[ParentOf ]

NonParent = π 1[ParentOf ]

GrandParentOf = ParentOf ◦ ParentOf

AncestorOf = [ParentOf ]+

Also: constants ∅, id, di.

Page 9: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

5/36

Graph query languages

∗ ∅ id ∪ ◦ a π π ∩ − di

Tarski’s Relation Algebra, FO[3]

RPQs

2RPQs

Nested RPQs

Navigational XPath, GXPath

Page 10: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

6/36

�estions about the relation algebra

�estions

I Are all these operators necessary?I What does each operator add?I When can we replace complex operators by simpler ones?

What is the expressive power of fragments of the relation algebra?

Page 11: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

6/36

�estions about the relation algebra

�estions

I Are all these operators necessary?I What does each operator add?I When can we replace complex operators by simpler ones?

What is the expressive power of fragments of the relation algebra?

Page 12: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

7/36

Related work and our results

Previous work by Fletcher et al.Relative expressive power of relation algebra fragments on graphs.

This workImprove understanding of the relation algebraI when querying trees or chains;I in comparison with the semi-join algebra.

Page 13: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

8/36

Background: when are queries equivalent?

Two types of queries and equivalences

Path-queries. The exact query result is important:

FriendOf ∩WorksWith ≡path FriendOf − (FriendOf −WorksWith).

Boolean-queries. The existence of a query result is important:

FriendOf a ◦ ParentOf ≡bool π1[FriendOf ] ∩ π1[ParentOf ].

DefinitionLanguage L1 is z-subsumed by L2 if every query in L1 isz-equivalent to a query in L2 (denoted by L1 �z L2).

Page 14: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

9/36

Background: query fragments

Let F ⊆ {di, a,π ,π ,∩,−, ∗}.I We write N(F): only allows ∅, id, `, ◦, ∪, and all operators in F.I We write N(F) to represent all operators expressible in N(F)

using the following basic rewrite rules:

π1[e] = π j[π 1[e]] = (e ◦ [e]−1) ∩ id = (e ◦ (id ∪ di)) ∩ id;

π2[e] = π j[π 2[e]] = ([e]−1 ◦ e) ∩ id = ((id ∪ di) ◦ e) ∩ id;

π i[e] = id − πi[e];e1 ∩ e2 = e1 − (e1 − e2).

Examples

I N(π ) = N(π ,π ).I N(a,−) = N(a,π ,π ,∩,−).

Page 15: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

10/36

Part II

On Tarski’s Relation Algebra

QUERYING TREES AND CHAINS

Page 16: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

11/36

Why studying trees

DefinitionA tree is an acyclic graph in which exactly one node, the root, has noincoming edges, and all other nodes have exactly one incoming edge.

AdviserOf AdviserOf

AdviserOf AdviserOf

I Hierarchical relations: taxonomies, corporate structures.I XML and JSON data.I Nested relational data.

Page 17: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

12/36

Initial classification

1-subtreereducible

2-subtreereducible

3-subtreereducible

downward non-downward

local

non-local,∗-free

non-∗-free

monotone

non-monotone

monotone

non-monotone

monotone

non-monotone

N(∩,−)

N(π ,∩)

N(π ,π ,∩,−)

N(π ,∩, ∗)

N(π ,π ,∩,−, ∗)

N(a,π ,π ,∩)

N(a,π ,∩)

N(a,π ,π ,∩, ∗)

N(a,π ,∩, ∗)

N(di, a,π )

N(di, a,π ,π )

N(di, a,π , ∗)

N(di, a,π ,π , ∗)

N(a,π ,π ,∩,−)

N(di, a,π ,∩)

N(di, a,π ,π ,∩,−)

N(di, a,π ,∩, ∗)

N

Page 18: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

13/36

Recognizing branches: labeled trees

FriendOf ParentOf FriendOf ParentOf

Only select the three on the right

I Not possible in N(∗).I Using projection: π1[FriendOf ] ◦ π1[ParentOf ].I Using converse: FriendOf a ◦ ParentOf .

Page 19: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

14/36

Recognizing branches: unlabeled trees

DefinitionA k-subtree-reduction step on tree T consists of removing a subtreerooted at a node n with parent m, given that parent m has at least kother children isomorphic to n.

Theorem

1. N(di,∩) and N(a,−) are closed under 3-subtree-reductions.

2. N(di, a,π ,π , ∗) is closed under 2-subtree-reductions.

3. N(a,π ,π ,∩, ∗) is closed under 1-subtree-reductions.

Page 20: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

15/36

Monotonicity

I Without negation you cannot ‘forbid’ structures.I π always provides negation: NonParent = π 1[ParentOf ].I Without negation, only a few Boolean queries are possible.

Theorem

1. On unlabeled chains, we have N(di, a,π ,∩, ∗) �bool N().

2. On unlabeled trees, we have N(a,π ,∩, ∗) �bool N().

TheoremAlready on unlabeled chains, we have N(π ) �bool N(di, a,π ,∩, ∗).

Page 21: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

16/36

The Kleene-star

I Is not expressible in first-order logic.I Path queries: always adds expressive power.I Labeled structures: always adds expressive power.

Theorem

1. On unlabeled chains, we have N(di, a,π ,∩, ∗) �bool N().

2. On unlabeled trees, we have N(a,π ,∩, ∗) �bool N().

TheoremLet F ⊆ {di, a,π }. On unlabeled trees, we haveN(F ∪ {∗}) �bool N(F).

Page 22: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

17/36

Downward queries: path semantics

N()

N(π )

N(π )

N(∗)

N(π , ∗)

N(π , ∗)

N(∩,−)

N(π ,∩)

N(π ,π ,∩,−)

N(∩,−, ∗)

N(π ,∩, ∗)

N(π ,π ,∩,−, ∗)

I N(π ,π ,∩,−, ∗) is downward.I a and di are not downward.I Downward queries can be expressed

by condition automata.

TheoremLet F ⊆ {π ,π ,∩,−, ∗}. On labeled trees,we have N(F) �path N(F − {∩,−}).

Page 23: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

18/36

Downward queries: Boolean semantics

Unlabeled Trees Labeled Trees

N()

N(π )

N(π , ∗)

N(∩,−, ∗)

N(π ,∩, ∗)

N(π ,π ,∩,−)

N(π ,π ,∩,−, ∗)

N()

N(π )

N(π )

N(∗)

N(π , ∗)

N(π , ∗)

N(∩,−)

N(π ,∩)

N(π ,π ,∩,−)

N(∩,−, ∗)

N(π ,∩, ∗)

N(π ,π ,∩,−, ∗)

Page 24: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

19/36

Local queries: path semantics

N()

N(π )

N(π )

N(a)

N(a,π )

N(a,π )

N(a,−)

N(∩,−)

N(π ,∩)

N(π ,π ,∩,−)

N(a,π ,∩)

N(a,π ,π ,∩)

N(a,π ,π ,∩,−)

I N(a,π ,π ,∩,−) is local.I ∗ and di are not local.I Local queries can be expressed by

unions of condition tree queries.

TheoremLet {a,π } ⊆ F ⊆ {a,π ,π ,∩}. On labeledtrees, we have N(F) �path N(F − {∩}).

Page 25: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

20/36

Local queries: Boolean semantics

Unlabeled Trees Labeled Trees

N()

N(π )

N(a,−)

N(∩,−)

N(a,π ,∩)

N(π ,π ,∩,−)N(a,π ,π ,∩)

N(a,π ,π ,∩,−)

N()

N(π ) = N(a)

N(π )

N(a,−)

N(∩,−)

N(π ,∩)N(a,π ,∩)

N(π ,π ,∩,−)N(a,π ,π ,∩)

N(a,π ,π ,∩,−)

Page 26: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

21/36

Conclusion

I We fully established relationships between:I downward fragments;I local fragments.

I On trees: solved most cases not involving ∗.I On chains: solved most cases not involving di or ∗.

Main open problemLet {di,π ,∩} ⊆ F ⊆ {di, a,π ,π ,∩, ∗} and let z ∈ {bool, path}. Withrespect to either labeled trees, unlabeled trees, labeled chains, orunlabeled chains, do we have a collapse N(F ∪ {−}) �z N(F) or not?

Page 27: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

22/36

Part III

On Tarski’s Relation Algebra

THE SEMI-JOIN ALGEBRA

Page 28: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

23/36

Naive query evaluation: an ine�icient example

Return pairs of (great-grandparent, friend)π1[ParentOf ◦ ParentOf ◦ ParentOf ] ◦ FriendOf .

1. Compute (grandparent, grandchild):X = ParentOf ◦ ParentOf

2. Compute (great-grandparent, great-grandchild):Y = ParentOf ◦ X

3. Throw away the great-grandchildren:Z = π1[Y ]

4. Compute (great-grandparent, friend):Result = Z ◦ FriendOf

Page 29: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

24/36

Introducing the semi-join algebra

mi

...

m2

m1

z

A

A

A

njB

...

n2

B

n1

B

Composition: querying for pathsConsider A ◦ B:I yields (begin, end)-nodes connected by AB.I yields i · j node pairs in total.

Checking for existence of pathsConsider the semi-join A n B:I yields the edges in A that connect to B.I yields i node pairs in total.

Page 30: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

24/36

Introducing the semi-join algebra

mi

...

m2

m1

z

A

A

A

njB

...

n2

B

n1

B

Composition: querying for pathsConsider A ◦ B:I yields (begin, end)-nodes connected by AB.I yields i · j node pairs in total.

Checking for existence of pathsConsider the semi-join A n B:I yields the edges in A that connect to B.I yields i node pairs in total.

Page 31: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

25/36

Optimize query evaluation: using semi-joins

Return pairs of (great-grandparent, friend)π1[ParentOf ◦ ParentOf ◦ ParentOf ] ◦ FriendOf .

1. Compute (grandparent, ???):X = ParentOf n ParentOf

2. Compute (great-grandparent, ???):Y = ParentOf n (X )

3. Throw away ???:Z = π1[Y ]

4. Compute (great-grandparent, friend):Result = Z o FriendOf

π1[ParentOf n (ParentOf n ParentOf )] o FriendOf .

Page 32: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

26/36

Projection-equivalence

Requiring the rewrite to guarantee path-equivalence is too strong!

Le�-projection-equivalence. We only need the first projection.

ParentOf ◦ ParentOf ≡π1 ParentOf n ParentOf

π1[ParentOf ◦ ParentOf ] ≡path π1[ParentOf n ParentOf ]

Right-projection-equivalence. We only need the second projection.

ParentOf ◦ ParentOf ≡π2 ParentOf o ParentOf

π2[ParentOf ◦ ParentOf ] ≡path π2[ParentOf o ParentOf ]

Page 33: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

27/36

The main result

∗ ∅ id ∪ ◦ a π π di ∩ −Relation algebra

fp ∅ id ∪no

a π π di ∩ −Semi-join algebra

FO[2]

FO[3]

�path�π2�π1

Intersection and di�erence

I Consider (E ◦ E) ∩ E:

G3,3: G4:

I We can allow ∩ and − on basic expressions (edges, ∅, id, di).

Page 34: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

27/36

The main result

∗ ∅ id ∪ ◦ a π π di ∩ −Relation algebra

fp ∅ id ∪no

a π π di ∩ −Semi-join algebra

FO[2]

FO[3]

�path�π2�π1

Intersection and di�erence

I Consider (E ◦ E) ∩ E:

G3,3: G4:

I We can allow ∩ and − on basic expressions (edges, ∅, id, di).

Page 35: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

28/36

Partial rewriting to the semi-join algebra

Rewrite functions τ (e) ≡path e, τπ1(e) ≡π1 e and τπ2(e) ≡π2 e

Helper functions τ◦1(e; ε) ≡π1 e n ε and τ◦2(e; ε) ≡π2 ε o e

Examplee = π1[((WorksOn ◦WorksOna) ∩ FriendOf ) ◦ EditorOf ] ◦ StudentOf

Rewriting e using τ (e):

τ (e) = τπ2(π1[((W ◦Wa) ∩ F ) ◦ E]) o τ (S)

= π1[τπ1(((W ◦Wa) ∩ F ) ◦ E)] o S

= π1[τ◦1((W ◦Wa) ∩ F ; E)] o S

= π1[(τ (W ◦Wa) ∩ τ (F )) n E o S]

= π1[((τ (W ) ◦ τ (Wa)) ∩ F ) n E o S]

= π1[((W ◦Wa) ∩ F ) n E] o S.

Page 36: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

29/36

�ery rewriting and optimization

I Cost of each operator 7.I Input size of each operator 7.I Number of evaluation steps 7.

Cost of each operator

Relation algebra

Semi-join algebra

FO[2]

FO[3]

∗ ∅ id ∪ ◦ a π π di ∩ −

fp ∅ id ∪no

a π π di ∩ −

Page 37: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

29/36

�ery rewriting and optimization

I Cost of each operator 3.I Input size of each operator 7.I Number of evaluation steps 7.

Cost of each operator

Relation algebra

Semi-join algebra

FO[2]

FO[3]

∗ ∅ id ∪ ◦ a π π di ∩ −

fp ∅ id ∪no

a π π di ∩ −

Page 38: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

30/36

�ery rewriting and optimization

I Cost of each operator 3.I Input size of each operator 7.I Number of evaluation steps 7.

Input size of each operator

m n

ziA B

...

z2

A B

z1

A B

A ◦ B ≡π1 A n B

I A ◦ B yields {(m, n)}.I A n B yields {(m, z1), (m, z2), . . . , (m, zi)}.

I Fix: add projection-steps into algorithms.

Page 39: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

30/36

�ery rewriting and optimization

I Cost of each operator 3.I Input size of each operator 3.I Number of evaluation steps 7.

Input size of each operator

m n

ziA B

...

z2

A B

z1

A B

A ◦ B ≡π1 A n B

I A ◦ B yields {(m, n)}.I A n B yields {(m, z1), (m, z2), . . . , (m, zi)}.I Fix: add projection-steps into algorithms.

Page 40: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

31/36

�ery rewriting and optimization

I Cost of each operator 3.I Input size of each operator 3.I Number of evaluation steps ∼.

Number of evaluation steps

I Number of evaluation steps can increase.I Complexity of each individual step can significantly decrease.

ExampleConsider e = π1[(` ◦ `) ◦ (` ◦ `)] and τ (e) = π1[` n (` n (` n `)))].We can evaluate e in three steps:

1. compute X = ` ◦ `;

2. compute Y = X ◦ X ;

3. compute π1[Y ].

Page 41: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

32/36

Other expensive constructs: id, di, and π

Idea: replace common use cases

I Replace id by equality-selections

(FriendOf ◦WorksWith) ∩ id ≡path σ=(FriendOf ◦WorksWith).

I Replace di by inequality-selections

(FriendOf ◦ FriendOf ) ∩ di ≡path σ,(FriendOf ◦ FriendOf ).

I Replace π by anti-semi-joins

FriendOf ◦ π 1[ParentOf ] ≡path FriendOf n̄ ParentOf .

Page 42: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

33/36

Further optimization: filter steps in practical queries

Find friend suggestions for Alice:

1. Compute all friends-of-friends (excluding friends and oneself):

FriendSuggestions = (FriendOf ◦ FriendOf ) − (FriendOf ∪ id).

2. Filter the first column on Alice.

3. Keep the second column.

Incorporate filter-step in the relation algebra

π2[(〈Alice〉 o FriendOf ) o (FriendOf n̄

π2[(〈Alice〉 o FriendOf ) ∪ 〈Alice〉])]

Page 43: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

33/36

Further optimization: filter steps in practical queries

Find friend suggestions for Alice:

1. Compute all friends-of-friends (excluding friends and oneself):

FriendSuggestions = (FriendOf ◦ FriendOf ) − (FriendOf ∪ id).

2. Filter the first column on Alice.

3. Keep the second column.

Incorporate filter-step in the relation algebra

π2[(〈Alice〉 o FriendOf ) o (FriendOf n̄

π2[(〈Alice〉 o FriendOf ) ∪ 〈Alice〉])]

Page 44: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

34/36

Conclusion

�eries in the relation algebra can be partially rewri�en intosemi-join algebra queries that are easier to evaluate.

Future work

I Implementation in real systems.I Interactions with other optimization techniques.I Elimination of intersection and di�erence.

Page 45: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

35/36

Part IV

On Tarski’s Relation Algebra

CONCLUSION

Page 46: QUERYING TREES AND CHAINSjhellings.nl/files/phd_defense_slides.pdfPeggy FriendOf FriendOf Victor FriendOf WorksWith Wendy FriendOf FriendsOfGgp = π 1»ParentOf ParentOf ParentOf…

36/36

Conclusion

�estions

I Are all these operators necessary?I What does each operator add?I When can we replace complex operators by simpler ones?

This workImprove understanding of the relation algebraI when querying trees or chains;I in comparison with the semi-join algebra.