
Queries on Trees

Jérôme Champavère, Emmanuel Filiot, Olivier Gauwin, Édouard Gilbert, Sławek Staworko

INRIA Lille, Mostrare

2008

PhDs+Sławek (Mostrare) Queries on Trees 2008 1 / 128


Framework

n-ary queries on unranked labeled finite ordered trees


Trees

[Figure: example tree t with root labeled a; the node labels c, c, d are visible in the extracted drawing]

finite alphabet: Σ = {a, b, c, d, e}

t is the structure (D, ch∗, ns∗, label) with:

D = {ε, 1, 2, 3}: prefix-closed finite subset of ℕ∗

ch∗ = reflexive-transitive closure of ch, defined by:

ch(π1, π2) ⇔ π2 = π1·i for some i ∈ ℕ

ns∗ = reflexive-transitive closure of ns, defined by:

ns(π1, π2) ⇔ π1 = π·i and π2 = π·(i + 1) for some (π, i) ∈ ℕ∗ × ℕ

label : D → Σ. Can also be seen as a partition (label_a)_{a∈Σ} of D.

label(1·3) = d, i.e. label_d(1·3)


Queries

n-ary queries

q(t) ⊆ (D_t)^n

n = 0: Boolean queries

q(t) = ∅ or q(t) = {()}

q defines L_q = {t | q(t) = {()}}

Questions

expressivity

complexity of:
◮ model-checking: x ∈ q(t)
◮ satisfiability: ∃t, q(t) ≠ ∅


Existing material

Surveys

Logics over unranked trees: an overview [Lib06]

Automata, logic, and XML [Nev02b, Nev02a]

Automata for XML – a survey [Sch07]

Effective Characterizations of Tree Logics [Boj08a]

Tree-walking automata [Boj08b]

Books

Finite Model Theory [EF99, Lib04]

Foundations of Databases [AHV95]


Outline

1 Classical logics (FO, MSO)

2 Queries by Tree Automata
   Tree-walking automata
   Schema Languages & Tree Automata

3 Conjunctive Queries over Trees
   Definition, results and acyclic fragment
   Twigs and Tree Patterns

4 Monadic Datalog

5 µ-calculus

6 XPath

7 Temporal Logics


Part I

Classical Logics, Automata


Outline

1 Classical logics (FO, MSO)

2 Queries by Tree Automata
   Tree-walking automata
   Schema Languages & Tree Automata


FO

Well-formed formulas based on:

predicates from the structure: ch∗, ns∗, (label_a)_{a∈Σ}

Boolean connectives: ∧, ¬

FO variables: x, y, ...

quantifiers on FO variables: ∃x


Querying using FO

We use free variables:

q(x) = ∃y. ∃z. (ch∗(x, y) ∧ ch∗(y, z) ∧ label_a(z))

This way we can define queries of any arity.
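As a rough illustration, a query given by an FO formula with free variables can be evaluated by brute force over the nodes of a tree. The sketch below assumes the tuple encoding of nodes (the empty tuple is the root) and a hypothetical example tree; it only shows the semantics of the example query and is not an efficient algorithm.

```python
from itertools import product

# Hypothetical example tree (an assumption, not taken from the slides):
# node addresses are tuples of child indices, () is the root.
tree = {(): "b", (1,): "c", (2,): "b", (1, 1): "a", (2, 1): "c"}


def ch_star(p1, p2):
    """Descendant-or-self: p1 is a prefix of p2."""
    return p2[:len(p1)] == p1


def q(t):
    """Unary query q(x) = ∃y. ∃z. (ch*(x, y) ∧ ch*(y, z) ∧ label_a(z)).

    Returns the answer set as a set of 1-tuples of nodes.
    """
    return {
        (x,)
        for x in t
        if any(ch_star(x, y) and ch_star(y, z) and t[z] == "a"
               for y, z in product(t, repeat=2))
    }
```

On the example tree, the answers are exactly the nodes that have an a-labeled descendant-or-self. A query of arity n would return a set of n-tuples in the same way, with one loop per free variable.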


FO: Available predicates

Why ch∗ and ns∗?

because ch and ns are definable from ch∗ and ns∗ in FO...

... but the converse is false

So in the following, we suppose that ch and ns are also available.

Also definable in FO:

unary predicates: root, leaf, lc (last child)

binary predicates: fc (first child)


FO: Complexity

Model-checking

PSPACE-complete (combined complexity). [Sto74, Var82]

Remark: PSPACE-hardness already holds for quantified propositional logic [GO99].

Satisfiability

non-elementary on trees


FO: Restrictions on the number of variables

FO^k = FO formulas using only k variables

Variables might be reused

q(x) = ∃y. ∃z. (ch∗(x, y) ∧ ch∗(y, z) ∧ label_a(z)) ∉ FO^2

but is equivalent to q′(x) = ∃y. (ch∗(x, y) ∧ ∃x. (ch∗(y, x) ∧ label_a(x))) ∈ FO^2

Theorem ([Imm82, Var95, GO99])

The model-checking problem for FO^k (with k ≥ 2) is P-complete on any structure.
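The variable-reuse trick can be checked on small examples. The following sketch evaluates the three-variable formula q and its two-variable rewriting q′ by brute force and compares the answers. The tree, the tuple encoding of nodes and the function names are assumptions made for illustration.

```python
from itertools import product

# Hypothetical example tree; node addresses are tuples, () is the root.
tree = {(): "b", (1,): "c", (2,): "b", (1, 1): "a", (2, 1): "c"}


def ch_star(p1, p2):
    """Descendant-or-self: p1 is a prefix of p2."""
    return p2[:len(p1)] == p1


def q_three_vars(t):
    """q(x) = ∃y. ∃z. (ch*(x, y) ∧ ch*(y, z) ∧ label_a(z))."""
    return {(x,) for x in t
            if any(ch_star(x, y) and ch_star(y, z) and t[z] == "a"
                   for y, z in product(t, repeat=2))}


def q_two_vars(t):
    """q'(x) = ∃y. (ch*(x, y) ∧ ∃x. (ch*(y, x) ∧ label_a(x))).

    The assignment has only two slots, "x" and "y"; the inner quantifier
    overwrites slot "x", which is restored when its scope ends.
    """
    answers = set()
    for free_x in t:
        env = {"x": free_x, "y": None}
        holds = False
        for y in t:                      # ∃y
            env["y"] = y
            if ch_star(env["x"], env["y"]):
                saved_x = env["x"]
                for x in t:              # ∃x, reusing the slot of x
                    env["x"] = x
                    if ch_star(env["y"], env["x"]) and t[env["x"]] == "a":
                        holds = True
                        break
                env["x"] = saved_x       # leave the scope of the inner ∃x
            if holds:
                break
        if holds:
            answers.add((free_x,))
    return answers
```

Both functions return the same answer set on the example tree, which is what the equivalence of q and q′ predicts.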


FO: Restrictions on the number of variables

FO^2 = FO formulas using only 2 variables

In FO^2, one cannot define ch and ns from ch∗ and ns∗ anymore. So ch and ns are added to the signature.

Complexity

Model-checking in FO^2 can be done in O(|t|^2 · |q|) [Imm82].

Expressivity

FO is strictly more expressive than FO^2.

example of Boolean query: trees where the leaf language is (ab)∗.

Links between FO^2 and XPath will be shown in Part 3.


Expressivity

[Figure: inclusion diagram with FO^2 ⊊ FO. Arrow legend: A ⊊ B, A ⊆ B, A ⊈ B]


FO: Restrictions on the number of variables

Data values

predicate ∼: x ∼ y if x and y are two attribute nodes that have the same value

in XPath semantics: add tests of the form /bib//book/@type = //collection/@style

Decidability

FO^2[∼, ch, ns] is decidable [BDM+06].

FO^2[∼, ch, ns, ch∗, ns∗]: open question

FO^3[∼, ch, ns] is undecidable (even on strings) [BMS+06].


FO: Restrictions on the number of variables

FO^k_n = FO formulas using (k bound variables) + (n free variables)

We assume here that the n free variables are never quantified.

Some results on trees

FO_2 = FO^3_2 [Mar05a] (his result is stronger)

FO_3 = FO^3_3 [Mar05b]

FO_n = FO^3_n
◮ translate into an FO_0 formula on alphabet Σ × 𝔹^n,
◮ FO_0 = FO^3_0 (consequence of [Mar05b], Th. 3)
◮ backward translation: label_{(f, b)}(x) becomes label_f(x) ∧ ⋀_{b_i = 1} x = x_i


MSO

MSO = FO + quantification over monadic predicates

“monadic predicates” can also be seen as “sets”: X(x) means x ∈ X

φ_odd(x, y) = “y is a descendant of x and the path between them is of odd length”
= ∃X. ∃Y. (∀z. ¬(X(z) ∧ Y(z))) ∧
  (∀z. (X(z) ∨ Y(z) ⇒ ch∗(x, z) ∧ ch∗(z, y))) ∧
  (X(x) ∧ Y(y)) ∧
  (∀z. ∀v. (ch∗(x, z) ∧ ch(z, v) ∧ ch∗(v, y) ⇒ ((X(z) ⇒ Y(v)) ∧ (Y(z) ⇒ X(v)))))

Expressivity

MSO is strictly more expressive than FO (see φ_odd).


Expressivity

[Figure: inclusion diagram with FO^2 ⊊ FO ⊊ MSO. Arrow legend: A ⊊ B, A ⊆ B, A ⊈ B]


MSO: Complexity

Model-checking

combined complexity: PSPACE-complete [Sto74, Var82]

data complexity: linear (by translation to automaton)

Satisfiability

non-elementary on trees


Deciding membership to FO

Theorem ([BS05])

Given a regular tree language L, one can decide if L is definable in FO[ch, (label_a)_{a∈Σ}].

Open decision problem

Given a regular tree language L, is it possible to decide if L is definable in FO?

In other words, FO-definability is known to be decidable for unordered trees, but unknown for ordered trees.

Automata for FO

For a definition of automata recognizing exactly FO-definable languages, see [Boj04, Chapter 2].


Tree Automata for Queries

Branching & Stepwise Tree automata

Query automata

Tree-walking automata (TWA)

Schema languages


Branching & Stepwise Tree Automata I

Automata over Σ × {0, 1}^n
◮ Canonical languages
◮ Same expressive power as MSO

Automata with selecting states
◮ Boolean values into the states
◮ Existential run-based queries [NPTT05]
◮ Selecting tree automata [FGK03]

Stepwise tree automata [CNT04]


Branching & Stepwise Tree Automata II

Decision problems

Membership PTIME

Non-emptiness PTIME

From MSO to tree automata: non-elementary size
◮ Upper bound [TW68]
◮ Lower bound [FG02]


Query Automata [NS99, NS02]

Two-way deterministic tree automata [Mor94] over (un)ranked trees, extended with a selection function

Equivalent to MSO

Decision problems

Non-emptiness EXPTIME

Containment EXPTIME

Equivalence EXPTIME


Context

Most work is done on ranked trees

Some definitions exist for the unranked case, but few results

Thus we will work on trees of rank 2

Structure: label, ch1, ch2


Tree-walking automata on ranked trees

A tree-walking automaton (TWA) [AU71]:
◮ the automaton is located at some node of the tree (initially the root) and is in a given state
◮ depending on tests on the current node (its label, whether it is a leaf, whether it is the left child of its parent), it decides on an action
◮ actions: accept, reject, move to the parent with state q, move to the left child with state q′, ...
◮ a tree is accepted whenever the “accept” action is used
◮ a tree can also be rejected by looping, so the “reject” action is not necessary

Expressiveness:
◮ any tree-walking automaton can be represented as a branching automaton, but with an exponential blowup
◮ the converse is false: TWA are not as expressive as MSO [BC05]
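The behaviour described above can be sketched as a small simulator for deterministic TWA on binary trees. Everything here is an assumption made for illustration: the tuple encoding of nodes (1 for the left child, 2 for the right child), the shape of the transition function, and the example automaton, which performs a depth-first traversal and accepts as soon as it sees an a-labeled node.

```python
def run_twa(tree, delta, start_state):
    """Simulate a deterministic TWA; a repeated configuration means looping."""
    node, state = (), start_state
    seen = set()
    while (node, state) not in seen:
        seen.add((node, state))
        is_leaf = node + (1,) not in tree
        kind = "root" if not node else ("left" if node[-1] == 1 else "right")
        action = delta(state, tree[node], is_leaf, kind)
        if action[0] == "accept":
            return True
        if action[0] == "reject":
            return False
        move, state = action
        if move == "up":
            node = node[:-1]
        elif move == "left":
            node = node + (1,)
        else:  # "right"
            node = node + (2,)
        if node not in tree:
            return False  # the automaton tried to leave the tree
    return False  # rejected by looping


def dfs_find_a(state, label, is_leaf, kind):
    """Hypothetical automaton: depth-first search for a node labeled a."""
    def subtree_done():
        if kind == "root":
            return ("reject",)
        return ("up", "from_left" if kind == "left" else "from_right")

    if state == "down":
        if label == "a":
            return ("accept",)
        return subtree_done() if is_leaf else ("left", "down")
    if state == "from_left":
        return ("right", "down")
    return subtree_done()  # state == "from_right"
```

The set of visited configurations is used only by the simulator to detect loops; the automaton itself has no such memory, which is one intuition for why TWA are weaker than branching automata.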


Expressiveness (ranked case)

[Figure: inclusion diagram over FO, TWA and MSO, with an edge annotated [BC04]. Arrow legend: A ⊊ B, A ⊆ B, A ⊈ B]


Deterministic TWA and their expressiveness

Tree languages recognised by deterministic TWA are closed under complementation

Tree languages recognised by non-deterministic TWA are not ⇒ deterministic TWA are strictly less expressive than non-deterministic ones [MSS06]

FO ⊆ TWA ⊊ MSO [BC04]

FO ⊈ DTWA and DTWA ⊈ FO [BC05]


Expressiveness (ranked case)

[Figure: inclusion diagram over FO, detTWA, TWA and MSO, with edges annotated [MSS06], [BC05], [BC04], [BC05]. Arrow legend: A ⊊ B, A ⊆ B, A ⊈ B]


Pebble tree-walking automata and stack discipline [EH99]

Add a finite number of pebbles, numbered 1, ..., n, to the automaton

New tests: is there a pebble on the current node?

New actions: add a pebble to the current node, remove a pebble from the current (or any) node

Stack discipline: if pebbles 1 to i are already placed, the automaton can only add pebble i + 1 or remove pebble i
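The stack discipline can be pictured as a bounded stack of node positions. The class below is a minimal sketch under stated assumptions: the class and method names are hypothetical, nodes are arbitrary hashable values, and a pebble may be lifted from anywhere in the tree (the variant where the automaton must stand on the pebble to lift it is not modelled).

```python
class PebbleStack:
    """At most n pebbles; pebble i + 1 can be dropped only after pebble i."""

    def __init__(self, n):
        self.n = n
        self.positions = []  # positions[i - 1] is the node holding pebble i

    def drop(self, node):
        """Place the next pebble (number i + 1) on the given node."""
        if len(self.positions) == self.n:
            raise ValueError("all pebbles are already placed")
        self.positions.append(node)
        return len(self.positions)  # number of the pebble just placed

    def lift(self):
        """Remove the most recently placed pebble (number i)."""
        if not self.positions:
            raise ValueError("no pebble to remove")
        return self.positions.pop()

    def pebbles_at(self, node):
        """Test used by the automaton: which pebbles lie on this node?"""
        return [i + 1 for i, p in enumerate(self.positions) if p == node]
```

Only the last placed pebble can be removed, so the sequence of placed pebbles always has the form 1, ..., i, which is exactly the invariant stated on the slide.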


Expressiveness of pebble TWA

Expressiveness increases with the number of pebbles [BSSS06]

∀n ∈ ℕ, PTWA_n ⊊ PTWA_{n+1}

detPTWA ⊆ PTWA [EH99]

it is not known whether detPTWA = PTWA, but there is no c such that PTWA_k ⊆ detPTWA_{ck}

Expressiveness without stack discipline

MSO ⊈ TWA_{no stack}

Emptiness of TWA_{no stack} is undecidable


Expressiveness (ranked case)

[Figure: inclusion diagram over FO, detTWA, TWA, detTWApebble, TWApebble and MSO, with edges annotated [BC04], [BC05], [BC04], [BSSS06], [BC05]. Arrow legend: A ⊊ B, A ⊆ B, A ⊈ B]


Unbounded pebble TWA

We now allow an unbounded number of pebbles (with stack discipline)

We can consider invisible pebbles: only the presence of the topmost pebble can be tested [EHS07]

Expressiveness

Emptiness of unbounded pebble TWA is undecidable

Invisible pebble TWA = MSO


Alternating tree-walking automata

Two players ∀,∃

Each state belongs to a player: Q = Q_∀ ⊎ Q_∃

If q ∈ Q_∀, then ∀ chooses the next move among the applicable rules; otherwise ∃ does

A tree is accepted if ∃ wins, rejected if ∀ does

∃ wins if an accept rule is played by either player or if ∀ has no possible move; otherwise ∀ wins

Expressiveness

alternating TWA = MSO


Caterpillar expressions [BKW00]

Caterpillar expressions describe runs of tree-walking automata

Caterpillar alphabet on Σ
◮ Command letters: goleft, goright and goparent
◮ Test letters: leaf, isleft, isright and labels a ∈ Σ

Caterpillar words describe paths in TWA: isleft a goleft b describes paths going from a left child labeled a to its left child labeled b

Caterpillar expressions: regular expressions on caterpillar alphabet

Cutting caterpillar expressions

New letters:
◮ test 〈c〉 (nest), where c is a caterpillar expression: true if c applied to the current node selects at least one path
◮ command cut: transforms the whole tree into the subtree of the current node (a local transform, which does not apply outside nests)

Expressiveness: if nesting is forbidden under the scope of a negation, posCAT = PTWA

Expressiveness: if nesting is allowed under the scope of a negation, as expressive as nested TWA (not defined here)

Expressiveness (ranked case)

[Figure: inclusion diagram relating FO, detTWA, TWA, detTWApebble, TWApebble, nested TWA and MSO. Legend: the three arrow styles denote A ⊊ B, A ⊆ B and A ⊈ B. Edges are annotated with [BC04], [BC05] and [BSSS06].]

FO: Extensions

Notation: z̄ = (z1, . . . , zn)

Adding Transitive Closure: TC^n

TC^n[ϕ(x̄, ȳ)](ū, v̄)
iff
∃k, ∃(w̄_i)_{i∈[1..k]}, ϕ(ū, w̄_1) ∧ ϕ(w̄_1, w̄_2) ∧ . . . ∧ ϕ(w̄_k, v̄)

By TC^n, we mean “parameter-free” transitive closure, i.e., x̄ and ȳ are exactly the free variables of ϕ.
We write TC^n_p for the non-parameter-free transitive closure (i.e., ϕ can have extra free variables).

FO: Extensions

FO + TC^1_p = nested TWA [tCS08]

FO + TC^1 is often written FO∗, and FO + TC^1_p is written FO(MTC).

FO + TC^1 ⊆ FO + TC^1_p: it is unknown whether the inclusion is strict.

FO + TC^1 ⊆ MSO
because TC^1[ϕ(x, y)](u, v) ⇔ ∀X. ((u ∈ X ∧ ∀x, y. ((x ∈ X ∧ ϕ(x, y)) ⇒ y ∈ X)) ⇒ v ∈ X)

FO ⊊ FO + TC^1 ⊊ MSO
Transitive closure is not expressible in FO [Fag75].
Adding TC^1 to FO is not enough to reach MSO [tCS08].

For properties of FO + TC^1 see [Kep06].
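The set-based characterisation used for FO + TC^1 ⊆ MSO can be checked by brute force on a small finite structure. The sketch below is only an illustration: the domain and the step relation `succ` are made up, and the quantifier ∀X is simulated by enumerating all subsets. As written, the set-based formula also relates every u to itself, so the comparison is against reachability in zero or more steps.

```python
from itertools import combinations

def reachable(domain, phi, u):
    """Nodes reachable from u by zero or more phi-steps (iterative closure)."""
    seen, todo = {u}, [u]
    while todo:
        x = todo.pop()
        for y in domain:
            if phi(x, y) and y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def tc_by_closed_sets(domain, phi, u, v):
    """True iff v belongs to every set X that contains u and is closed under phi."""
    nodes = list(domain)
    for size in range(len(nodes) + 1):
        for X in map(set, combinations(nodes, size)):
            closed = all(y in X for x in X for y in nodes if phi(x, y))
            if u in X and closed and v not in X:
                return False
    return True

# Hypothetical structure: steps 0 -> 1 -> 2 and 3 -> 4 (no step from 2 to 3).
domain = range(5)
succ = lambda x, y: y == x + 1 and y != 3

print(sorted(reachable(domain, succ, 0)))
print(tc_by_closed_sets(domain, succ, 0, 2), tc_by_closed_sets(domain, succ, 0, 4))
```

The subset enumeration is exponential and only meant to mirror the quantifier structure of the formula, not to be an evaluation strategy.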

Expressiveness (ranked case)

[Figure: inclusion diagram relating FO, detTWA, TWA, detTWApebble, TWApebble, FO + TC^1, FO + TC^∗ and MSO. Legend: the three arrow styles denote A ⊊ B, A ⊆ B and A ⊈ B. Edges are annotated with [BC04], [BC05], [BSSS06], [tCS08] and [TK06].]

FO: Extensions

FO + TC^2

FO + TC^2 ⊈ MSO
(cf. next slide)

MSO ⊆ FO + TC^2?
This is an open question. It could be the case that MSO ⊈ FO + TC^k, for all k.

FO: Extensions

FO + TC^2 ⊈ MSO

For instance L = {f(X, X) | X ∈ TΣ} is defined by:

ϕ = label_f(ε) ∧ ∃u1.∃v1. fc(ε, u1) ∧ ns(u1, v1) ∧ samelabel(u1, v1) ∧ ¬(∃w. ns(v1, w)) ∧ ∀u2. ch∗(u1, u2) ⇒ ∃v2. TC^2[ψ(x̄, ȳ)](u1, u2, v1, v2)

where ψ encodes a step isomorphism:
ψ(x̄, ȳ) = samelabel(x2, y2) ∧ ((fc(x1, x2) ∧ fc(y1, y2)) ∨ (ns(x1, x2) ∧ ns(y1, y2)))

with:
samelabel(x, y) = ⋁_{a∈Σ} (label_a(x) ∧ label_a(y))

Expressiveness (ranked case)

[Figure: inclusion diagram relating FO, detTWA, TWA, detTWApebble, TWApebble, FO + TC^1, FO + TC^2, FO + TC^∗ and MSO. Legend: the three arrow styles denote A ⊊ B, A ⊆ B and A ⊈ B. Edges are annotated with [BC04], [BC05], [BSSS06], [tCS08] and [TK06]; one edge is labelled "tree isomorphism".]

FO: Extensions

FO + detTC^1 = detTWApebble [EH06]

Deterministic transitive closure of ϕ = TC on the functional part of ϕ

FO + detTC^1 ⊆ FO + TC^1
because detTC^1[ϕ(x, y)](u, v) ⇔ TC^1[ϕ(x, y) ∧ ∀z. (ϕ(x, z) ⇒ z = y)](u, v)

FO + detTC^1 ⊊ FO + TC^1?
Open question (see [Kep06]).

For some properties of FO + detTC^1 (linear order, even...) see [Kep06, EI95].

FO: Extensions

FO + posTC^1 = TWApebble [EH06]
formulas of FO + TC^1 with TC^1 operators under an even number of negations

FO + detTC^1 ⊆ FO + posTC^1 ⊆ FO + TC^1
inclusions due to TWA characterisations
whether these 2 inclusions are strict is still open

FO + TC^1 ⊊ MSO
separation language based on the branching structure [tCS08]

Expressiveness (ranked case)

[Figure: inclusion diagram relating FO, detTWA, TWA, FO + detTC^1 = detTWApebble [EH06], FO + posTC^1 = TWApebble [EH06], FO + TC^1 = nested TWA [tCS08], FO + TC^2, FO + TC^∗ and MSO. Legend: the three arrow styles denote A ⊊ B, A ⊆ B and A ⊈ B. Edges are annotated with [TK06], [BSSS06], [tCS08], [BC04], [BC05] and [Kep06, BSSS06]; one edge is labelled "tree isomorphism".]

Outline

1 Classical logics (FO, MSO)

2 Queries by Tree Automata
Tree-walking automata
Schema Languages & Tree Automata

XML Schema Languages

Describe a set of XML documents

Theoretical framework: no data, only structure

Closer to tree grammars [MLM01] than to tree automata

Tree automata: reference model for the expressiveness


Document Type Definitions (DTDs)
“Standard” DTDs

Local tree languages

L = { a(b(c, d)), a′(b(d, c)) }

a(qb) → qa
a′(qb) → qa′
b(qc qd) → qb
b(qd qc) → qb
c(ε) → qc
d(ε) → qd

⇒ a(b(d, c)) ∈ L!

◮ Restriction: no competing states

Deterministic content models
◮ One-unambiguous regular expressions [BKW98]
◮ ab + ac: which a to match depends on the next symbol

Polynomial complexity for other usual decision problems (membership, emptiness, containment), except intersection [MNS04]

Lack of expressivity
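The locality effect in the example above can be replayed with a toy validator. This is a sketch under simplifying assumptions: trees are `(label, [children])` pairs, and each content model is given as a finite set of child-label sequences rather than a regular expression. The names `DTD`, `valid` and `leaf` are illustrative only.

```python
# Toy DTD: each label maps to the set of allowed child-label sequences.
DTD = {
    "a":  {("b",)},
    "a'": {("b",)},
    "b":  {("c", "d"), ("d", "c")},
    "c":  {()},
    "d":  {()},
}

def valid(tree, dtd=DTD):
    """A tree is valid if every node's child word is allowed for its label."""
    label, children = tree
    word = tuple(child[0] for child in children)
    return word in dtd.get(label, set()) and all(valid(c, dtd) for c in children)

leaf = lambda l: (l, [])
t1 = ("a",  [("b", [leaf("c"), leaf("d")])])
t2 = ("a'", [("b", [leaf("d"), leaf("c")])])
t3 = ("a",  [("b", [leaf("d"), leaf("c")])])   # not intended, but accepted

print(valid(t1), valid(t2), valid(t3))
```

Because the rule for b cannot see whether its parent is a or a′, any DTD that accepts t1 and t2 accepts t3 as well.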

Extended DTDs (EDTDs) [MNSB06, Sch07] I

Alphabet extended with types (each type is associated to a unique symbol)

L = { a(b1(c, d)), a′(b2(d, c)) }

a(qb1) → qa
a′(qb2) → qa′
b1(qc qd) → qb1
b2(qd qc) → qb2
c(ε) → qc
d(ε) → qd

Typing problem:
◮ Valid assignment of types to the elements w.r.t. EDTD
◮ (Consistent) combination of unary queries

As expressive as (parallel) unranked tree automata of [BKWM01], thus equivalent to regular tree languages
◮ Examples of such schema languages: Relax NG [CM01], XDuce [HP03]
◮ Restricted EDTDs: single-type, restrained-competition

Extended DTDs (EDTDs) [MNSB06, Sch07] II

Single-type EDTDs

a(qb1 + qb2) → qa
b1(ε) → qb1
b2(ε) → qb2

a1(qb1) → qa1
a2(qb2) → qa2
b1(ε) → qb1
b2(ε) → qb2

◮ Element Declaration Consistent constraint (W3C XML Schemas)
◮ Unique top-down typing
◮ Validation with deterministic tree-walking automata

Restrained-competition EDTDs

a(qb1 · qb2) → qa
b1(ε) → qb1
b2(ε) → qb2

◮ Unique top-down left-to-right typing
◮ Validation with deterministic top-down tree automata

Expressiveness of Schemas

[Figure: inclusion diagram of schema classes. Node labels: EDTD = MSO = UTA, (homogeneous) regular tree languages, path-closed tree languages, local tree languages, single-type EDTD, DTD, restrained-competition EDTD, DTWA, top-down DetTA.]

Part II

Conjunctive Queries, Monadic Datalog


Outline

3 Conjunctive Queries over Trees
Definition, results and acyclic fragment
Twigs and Tree Patterns

4 Monadic Datalog

Conjunctive Queries

... seen as FO formulas
∃x̄. φ(x̄, ȳ) where φ is a conjunction of atomic predicates.
For instance:
∃x ∃y ∃w R1(x) ∧ R2(x, y) ∧ R3(x, w, z)

... seen as rules
answer(z) ← R1(x), R2(x, y), R3(x, w, z)

... seen as terms of the Projection/Join algebra
π_Z ( R1(X) ⋈ R2(X, Y) ⋈ R3(X, W, Z) )

These 3 formalisms are equivalent (see [AHV95]).
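As a small illustration of the equivalence, the example query can be evaluated as a join on x followed by a projection on z. The relation contents below are invented for the example, and the function name `answer` simply mirrors the rule head.

```python
def answer(R1, R2, R3):
    """answer(z) <- R1(x), R2(x, y), R3(x, w, z): join on x, project on z."""
    return {z
            for (x,) in R1
            for (x2, y) in R2 if x2 == x
            for (x3, w, z) in R3 if x3 == x}

# Hypothetical database instance.
R1 = {(1,), (2,)}
R2 = {(1, "p"), (3, "q")}
R3 = {(1, "u", "z1"), (2, "u", "z2"), (1, "v", "z3")}

print(sorted(answer(R1, R2, R3)))
```

Only x = 1 occurs in all three relations, so only the z-values paired with 1 in R3 survive the projection.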

Conjunctive Queries over Trees

XPath axes X: ch, ch∗, ch+, ns, ns∗, ns+, following and their inverses

following = (ch∗)^−1 ◦ ns+ ◦ ch∗

Example

∃x ∃y ch+(x, y) ∧ ch+(x, z) ∧ following(y, z)

[Figure: query graph with ch+ edges from x to y and from x to z, and a following edge from y to z.]
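The definition of following as a composition of axes can be computed directly on a small tree. In this sketch, nodes are positions (tuples of child indices, as in the tree definition at the start of the deck); the example tree and the helper names `compose`, `plus` and `axes` are assumptions made for illustration.

```python
def compose(r, s):
    """Relational composition: (a, c) such that r(a, b) and s(b, c) for some b."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def plus(r):
    """Transitive closure of a finite binary relation."""
    closure = set(r)
    while True:
        bigger = closure | compose(closure, r)
        if bigger == closure:
            return closure
        closure = bigger

def axes(dom):
    ch = {(p, q) for p in dom for q in dom
          if len(q) == len(p) + 1 and q[:-1] == p}
    ns = {(p, q) for p in dom for q in dom
          if p and q and p[:-1] == q[:-1] and q[-1] == p[-1] + 1}
    ch_star = plus(ch) | {(p, p) for p in dom}
    ch_star_inv = {(b, a) for (a, b) in ch_star}
    # following = (ch*)^-1 o ns+ o ch*: go up, then right, then down.
    following = compose(compose(ch_star_inv, plus(ns)), ch_star)
    return ch, ns, following

# Hypothetical tree with two subtrees of two leaves each.
dom = {(), (1,), (1, 1), (1, 2), (2,), (2, 1), (2, 2)}
ch, ns, following = axes(dom)
print(sorted(q for (p, q) in following if p == (1, 1)))
```

Note that a node never follows one of its ancestors, which is why a descendant of x cannot also be in following(x).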

Boolean Queries

Theorem ([GKS04])
Evaluation of Boolean CQ over X is NP-complete, even on a fixed tree.

Tractable fragments
X̲ property
Acyclic conjunctive queries
Twigs

X̲ property

R: a binary relation on the domain Dt of a tree t
<: a total order on Dt

Definition
The relation R satisfies the X̲ property wrt < if, for all n1, n2, n3, n4 such that n1 < n2 and n3 < n4:

R(n1, n4) ∧ R(n2, n3) ⇒ R(n1, n3)

[Figure: the two crossing R-edges (n1, n4) and (n2, n3), and the implied R-edge (n1, n3).]

A set of relations R1, . . . , Rn satisfies X̲ wrt < if every Ri does.

X̲ property: Example

{ch+, ch∗} for the preorder <pre (ch+(x, y) ⇒ x <pre y)
{ch, ns, ns+, ns∗} for <bflr
but not following for <pre

[Figure: the tree 1(2(3, 4), 5(6, 7)) with following edges among the nodes 2, 3, 4 and 6, showing a violation of the property for <pre.]
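The examples can be checked mechanically on a small tree. The sketch below assumes the reading of the property in which crossing edges R(n1, n4) and R(n2, n3), with n1 < n2 and n3 < n4, force R(n1, n3); this reading, the tree, and the names `parent` and `x_violations` are assumptions for illustration. Nodes are numbered in preorder, so the integer order plays the role of <pre.

```python
from itertools import product

# Hypothetical tree 1(2(3, 4), 5(6, 7)), numbered in preorder.
parent = {2: 1, 3: 2, 4: 2, 5: 1, 6: 5, 7: 5}
nodes = range(1, 8)

def ancestors(n):
    out = []
    while n in parent:
        n = parent[n]
        out.append(n)
    return out

ch_plus = {(a, n) for n in nodes for a in ancestors(n)}
# With preorder numbering, m follows n iff m > n and m is not a descendant of n.
following = {(n, m) for n in nodes for m in nodes
             if m > n and (n, m) not in ch_plus}

def x_violations(rel, order=nodes):
    """Quadruples n1 < n2, n3 < n4 with R(n1, n4), R(n2, n3) but not R(n1, n3)."""
    return [(n1, n2, n3, n4)
            for n1, n2, n3, n4 in product(order, repeat=4)
            if n1 < n2 and n3 < n4
            and (n1, n4) in rel and (n2, n3) in rel and (n1, n3) not in rel]

print(x_violations(ch_plus))
print(x_violations(following)[:3])
```

On this tree, ch+ produces no violation, while following does, for instance with the nodes 2, 3, 4 and 6.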

Dichotomy Result

Theorem (Gottlob, Koch, Schulz, 2004)
For all F ⊆ X, CQ[F] Boolean queries can be evaluated in PTIME iff there is a total order < such that F satisfies the X̲ property wrt <.

Question: generalization to n-ary queries? Which complexity measure?
→ polynomial in the number of answers.

Acyclic Conjunctive Queries (ACQ)

Acyclic: the query graph is acyclic

∃x ∃y ∃z ns(x, y) ∧ ch∗(y, z)

[Figure: query graph with an ns edge between x and y and a ch∗ edge between y and z.]

Expressiveness

[GKS04]
CQ[X] ⊊ ⋃ ACQ[X] ⊆ FO[X]
(exponential)

[Figure: a cyclic query over x, y, z (edges ch∗, ch∗, following) and an acyclic query over x, u, y, v, z (edges ch∗, ch∗, ch+, ns+).]

[Mar05b], over unranked trees,
ACQ[FO2] = FOnary

ACQ Evaluation

Yannakakis algorithm: O(|q|·|db|·|q(db)|)
∃x R(x, y) ∧ R′(x, z)

on trees t with predicates X: O(|q|·|t|²·|q(t)|)
∃x ch∗(x, y) ∧ ns∗(x, z)


Twigs: Testing containment [MS02]

Tree pattern
(unordered and unranked) tree labeled with elements from Σ ∪ {∗}
child and descendant edges
n distinguished querying nodes (n-ary query)
unary tree patterns (n = 1) equivalent to XPath(∗, [], //, /)

[Figure: a tree pattern p over a, b, ∗ and c with querying nodes x1 (labeled ∗) and x2 (labeled c), and a tree t over a, b, c, d.]

Ans(p, t) = {(1·1, 2), (1·1, 2·1)}

Containment (problem statement)
p1 ⊆ p2 if and only if Ans(p1, t) ⊆ Ans(p2, t) for every t ∈ TΣ

Booleanize your twigs

Boolean tree patterns
Tree patterns p with no querying nodes (n = 0)
Mod(p) = {t ∈ TΣ | t satisfies p}
Then, p1 ⊆ p2 if and only if Mod(p1) ⊆ Mod(p2)

[Figure: the pattern p with querying nodes x1 and x2, and its Boolean version p^B, in which the querying nodes are marked using the fresh labels z1 and z2.]

Proposition
For any two n-ary tree patterns p1 and p2: p1 ⊆ p2 ⇔ p1^B ⊆ p2^B


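Canonical models of Boolean twigs

[Figure: a Boolean pattern p over a, b, c and its canonical models mod(p): trees over a, b, c in which chains of 0, 1, 2, 3, . . . nodes labeled z are inserted.]

Proposition
For any Boolean tree patterns p1 and p2:
p1 ⊆ p2 ⇔ mod(p1) ⊆ Mod(p2).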


Testing containment of Boolean twigs: Outline

[Figure: an unranked pattern p, the ranked tree language L_p, and the mapping U_p from L_p to the unranked canonical models mod(p).]

Main idea
p1 ⊆ p2 ⇔ mod(p1) ⊆ Mod(p2)
⇔ U_{p1}(L_{p1}) ⊆ Mod(p2)
⇔ L_{p1} ⊆ U_{p1}^−1(Mod(p2))
⇔ A_{p1} ⊆ A_{p2},

where:
A_{p1}: DFTA defining L_{p1}
A_{p2}: AFTA defining U_{p1}^−1(Mod(p2))

complexity: O(|p1|² |p2|)

Testing containment: Conclusions

Positive results

p1 ⊆ p2 can be decided in time O(|p1| |p2| w^d), where:

d is the number of //-edges in p1

w is the maximal length of ∗/ ∗ / . . . /∗ in p2

Negative results

Deciding containment is coNP-complete. The result holds even if we:

bound the number of occurrences of ∗

bound the degree of the nodes of tree patterns


Efficient evaluation of tree patterns

TwigStack [BKS02]

Interval representation used with a variant of B-tree index

Two phase approach:
1 Find and stack (partial) solutions to leaf-to-root paths
2 Join partial solutions

Linear in the size of the input and output

I/O and CPU optimal if only //-edges used

Twig2Stack [CLT+06]

Generalized tree pattern queries

One phase bottom-up approach

May stack elements that are not solutions

In the worst case the whole document may be stored in main memory

HollisticTwigStack [JLH+07] addresses this shortcoming

Overview

Few words on datalog

Least fixed point

Monadic datalog over trees


Datalog in (Very) Few Words

Language used in deductive databases

Extends conjunctive queries with recursion

Example: transitive closure of a graph

TC(x, y) :- Edge(x, y).
TC(x, y) :- Edge(x, z), TC(z, y).

Model theoretic point of view:

∀x, y (Edge(x, y) → TC(x, y))
∀x, y, z ((Edge(x, z) ∧ TC(z, y)) → TC(x, y))

Remark: no function symbols (finite models), no negation

See chapter 12 of [AHV95] for more details

Least Fixed Point I

P is a fixed point of operator F if F(P) = P

The least fixed point lfp(F) is the least element of the set of fixed points of F w.r.t. inclusion

Every monotone operator F (i.e., P ⊆ Q ⇒ F(P) ⊆ F(Q)) has a least fixed point (Knaster-Tarski, cited by [Lib04]):

lfp(F) = ⋂ {P | F(P) = P}

Computing the least fixed point (standard closure):

P^0 = ∅
P^{i+1} = F(P^i)
lfp(F) = P^∞ = ⋃_{i=0}^{∞} P^i

Stabilizes after n steps on finite structures, i.e., P^∞ = P^n

Least Fixed Point II

Datalog immediate consequence operator T_P (from [GK04]):

T_P(Q) := Q ∪ { f | ∃φ, ∃ (h :- b1, . . . , bn) ∈ P such that φ(h) = f and φ(b1), . . . , φ(bn) ∈ Q }

Example: program P = {
TC(x, y) :- Edge(x, y).
TC(x, y) :- Edge(x, z), TC(z, y).
}
and database Q = {Edge(1, 2), Edge(2, 3), Edge(3, 1)}

T_P^0 = Q = {Edge(1, 2), Edge(2, 3), Edge(3, 1)}
T_P^1 = T_P^0 ∪ {TC(1, 2), TC(2, 3), TC(3, 1)}
T_P^2 = T_P^1 ∪ {TC(1, 3), TC(2, 1), TC(3, 2)}
T_P^3 = T_P^2 ∪ {TC(1, 1), TC(2, 2), TC(3, 3)}
T_P^4 = T_P^3

Finally, lfp(T_P) = T_P^ω (notation) = T_P^4 = T_P^3 = {Edge(1, 2), Edge(2, 3), Edge(3, 1), TC(1, 2), TC(2, 3), TC(3, 1), . . .}
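The iteration of the immediate consequence operator on this program and database can be replayed with a few lines of code. This is a naive sketch that hard-codes the two TC rules instead of interpreting arbitrary datalog; facts are encoded as `(predicate, arguments)` pairs, an encoding chosen here for illustration.

```python
def immediate_consequence(facts):
    """One application of T_P for the two TC rules of the example."""
    edges = {args for (pred, args) in facts if pred == "Edge"}
    tc = {args for (pred, args) in facts if pred == "TC"}
    # TC(x, y) :- Edge(x, y).
    new = {("TC", (x, y)) for (x, y) in edges}
    # TC(x, y) :- Edge(x, z), TC(z, y).
    new |= {("TC", (x, y)) for (x, z) in edges for (z2, y) in tc if z == z2}
    return facts | new

def stages_until_fixpoint(facts):
    """List of the successive stages, stopping when a stage adds nothing."""
    stages = [frozenset(facts)]
    while True:
        nxt = frozenset(immediate_consequence(stages[-1]))
        if nxt == stages[-1]:
            return stages
        stages.append(nxt)

db = {("Edge", (1, 2)), ("Edge", (2, 3)), ("Edge", (3, 1))}
stages = stages_until_fixpoint(db)
for i, stage in enumerate(stages):
    print(i, sorted(args for (pred, args) in stage if pred == "TC"))
```

Printing the TC facts stage by stage shows when the iteration stabilises; because the three edges form a cycle, the final stage relates every node to every node, including itself.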

Monadic Datalog over Trees

Datalog with unary head predicates

Built-in predicates (for binary trees): root, leaf, (label_a)_{a∈Σ}, ch1, ch2

Example of query: select all nodes labeled by a at even height

Q0(x) :- root(x).
Q_{(i+1) mod 2}(x) :- Q_i(y), ch_k(y, x). (for i ∈ {0, 1} and k ∈ {1, 2})
Ans(x) :- Q0(x), label_a(x).

The query predicate is Ans
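The example program can be evaluated by computing the least fixed point of its rules directly. The sketch below makes a few simplifying assumptions: ch1 and ch2 are merged into one set of (parent, child) pairs, the root is taken to be the only node without a parent, and the tree is invented for the example.

```python
def even_level_a(label, ch):
    """Least fixed point of the three rules; `ch` holds all ch1/ch2 pairs."""
    children = {x for (_, x) in ch}
    root = set(label) - children
    q = {0: set(root), 1: set()}                  # Q0(x) :- root(x).
    changed = True
    while changed:
        changed = False
        for i in (0, 1):
            for (y, x) in ch:                     # Q_{(i+1) mod 2}(x) :- Q_i(y), ch_k(y, x).
                if y in q[i] and x not in q[(i + 1) % 2]:
                    q[(i + 1) % 2].add(x)
                    changed = True
    return {x for x in q[0] if label[x] == "a"}   # Ans(x) :- Q0(x), label_a(x).

# Hypothetical binary tree: node -> label, and (parent, child) pairs.
label = {"r": "a", "l": "a", "m": "b", "ll": "a", "lr": "b"}
ch = {("r", "l"), ("r", "m"), ("l", "ll"), ("l", "lr")}

print(sorted(even_level_a(label, ch)))
```

Here Q0 collects the nodes at an even distance from the root and Q1 those at an odd distance, so the a-labeled node l at distance 1 is not selected.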

Monadic Datalog over Trees: Complexity

Model Checking
Over ranked as well as unranked trees, monadic datalog has O(|P| · |dom|) combined complexity (Thm. 4.2 of [GK04])
Proved by rewriting P so that it is ground.

Satisfiability
Monadic datalog (over arbitrary finite structures) is NP-complete w.r.t. combined complexity (Prop. 3.4 of [GK04])
Membership: guess a proof tree
Hardness: Boolean conjunctive queries

For trees, satisfiability can be reduced to the emptiness problem for context-free languages [?]. What about the complexity?

Monadic Datalog over Trees: Expressiveness

Equivalence with MSO
A tree language is definable in monadic datalog exactly if it is definable in MSO (Cor. 4.7 of [GK04])

Sketch of proof (for monadic queries):
⇒ Encode the query defined by a monadic datalog program into an MSO formula (Prop. 3.3 of [GK04])
⇐ More intricate, different ways of proving it:
1 Using ≡^MSO_k-types (Thm. 4.4 of [GK04])
2 Simulating query automata of Neven & Schwentick [NS02] (Section 4.3 of [GK04])
3 Encoding tree automata with selecting states? (next slides)

Encoding a Tree Automaton A into a Monadic Datalog Program P I

Rq(x) in lfp of P if a run of A can evaluate node x in state q:

a → q ∈ rules(A)
Rq(x) :- leaf(x), label_a(x).

f(q1, q2) → q ∈ rules(A)
Rq(x) :- Rq1(y), Rq2(z), ch1(x, y), ch2(x, z), label_f(x).

Encoding a Tree Automaton A into a Monadic Datalog Program P II

L2Fq(x), aka LeadsToFinalq(x), in lfp of P if state q is used in a successful run of A:

q ∈ final(A)
L2Fq(x) :- root(x).

f(q1, q2) → q ∈ rules(A)
L2Fq1(y) :- L2Fq(x), ch1(x, y), ch2(x, z), label_f(x), Rq2(z).
L2Fq2(z) :- L2Fq(x), ch1(x, y), ch2(x, z), label_f(x), Rq1(y).

Encoding a Tree Automaton A into a Monadic Datalog Program P III

Ans(x) in lfp of P if x is selected by automaton A, i.e., x is evaluated in state q ∈ S, where S ⊆ states(A) is the set of selecting states:

q ∈ S
Ans(x) :- Rq(x), L2Fq(x).

Proposition: Monadic datalog program P with Ans as query predicate simulates tree automaton with selecting states A
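The three groups of rules (Rq, L2Fq and Ans) can be evaluated together by a naive fixed-point loop. The sketch below is an illustration under assumptions: the tree is a dict mapping each node to `(label, left, right)` with `None` children for leaves, leaf rules are pairs `(a, q)`, inner rules are tuples `(f, q1, q2, q)`, and the example automaton is invented.

```python
def select(tree, root, leaf_rules, inner_rules, final, selecting):
    """Evaluate the Rq / L2Fq / Ans rules by naive fixed-point iteration."""
    R, L2F = set(), set()                       # sets of (state, node) facts
    changed = True
    while changed:
        before = len(R) + len(L2F)
        L2F |= {(q, root) for q in final}       # L2Fq(x) :- root(x).
        for x, (lab, y, z) in tree.items():
            if y is None:                       # Rq(x) :- leaf(x), label_a(x).
                R |= {(q, x) for (a, q) in leaf_rules if a == lab}
                continue
            for (f, q1, q2, q) in inner_rules:
                if f != lab:
                    continue
                if (q1, y) in R and (q2, z) in R:
                    R.add((q, x))               # Rq(x) :- Rq1(y), Rq2(z), ...
                if (q, x) in L2F and (q2, z) in R:
                    L2F.add((q1, y))            # L2Fq1(y) :- L2Fq(x), ..., Rq2(z).
                if (q, x) in L2F and (q1, y) in R:
                    L2F.add((q2, z))            # L2Fq2(z) :- L2Fq(x), ..., Rq1(y).
        changed = len(R) + len(L2F) > before
    # Ans(x) :- Rq(x), L2Fq(x), for selecting states q.
    return {x for (q, x) in R & L2F if q in selecting}

# Hypothetical automaton: accepts left combs f(...f(a, b)..., b), selects the a-leaf.
leaf_rules = {("a", "qa"), ("b", "qb")}
inner_rules = {("f", "qa", "qb", "q"), ("f", "q", "qb", "q")}
tree = {"root": ("f", "n1", "rb"), "n1": ("f", "la", "lb"),
        "la": ("a", None, None), "lb": ("b", None, None), "rb": ("b", None, None)}

print(select(tree, "root", leaf_rules, inner_rules, {"q"}, {"qa"}))
```

An L2F fact alone does not select a node: the leaf is returned only because it also carries the matching R fact, which is what the conjunction in the Ans rule expresses.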


Part III

µ-calculus, Modal Logics (Temporal Logics, XPath...)


Outline

5 µ-calculus

6 XPath

7 Temporal Logics

Structure and formulae

The structure used here is the one used by Barcelo and Libkin. Most of the results are taken from [BL05a, ABL07].

Tree t with two relations (or more) on positions: child ≺ch and next sibling ≺ns

Formulae of Lµ[≺]:
◮ constants a
◮ second order variables X
◮ ⊤, ⊥, ¬φ, φ ∨ φ′
◮ ⋄(≺)φ
◮ µX.φ where X can only appear positively in φ

Interpretation

Given a tree t, nodes s, s′ ∈ Domain(t) and a valuation v : X → P(Domain(t))

logic operators are interpreted as usual
(t, v, s) |= a iff t(s) = a
(t, v, s) |= X iff s ∈ v(X)
(t, v, s) |= ⋄(≺)φ iff (t, v, s′) |= φ for some s′ such that s ≺ s′
(t, v, s) |= µX.φ iff s ∈ S where S is the least fixed point of Fφ, defined by Fφ(P) = {s′ | (t, v[P/X], s′) |= φ}
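These clauses translate almost line by line into a small evaluator that returns the set of nodes satisfying a formula. The sketch below assumes a tuple encoding of formulas (`"label"`, `"var"`, `"or"`, `"not"`, `"dia"`, `"mu"`) and a made-up tree; it does not check that bound variables occur positively, and computes µ by iterating from the empty set.

```python
def sat(phi, tree, rel, val):
    """Set of nodes of `tree` (node -> label) satisfying phi under valuation val."""
    op = phi[0]
    if op == "label":                       # constant a
        return {s for s, a in tree.items() if a == phi[1]}
    if op == "var":                         # second order variable X
        return set(val[phi[1]])
    if op == "or":
        return sat(phi[1], tree, rel, val) | sat(phi[2], tree, rel, val)
    if op == "not":
        return set(tree) - sat(phi[1], tree, rel, val)
    if op == "dia":                         # some successor in the named relation satisfies psi
        target = sat(phi[2], tree, rel, val)
        return {s for (s, t) in rel[phi[1]] if t in target}
    if op == "mu":                          # least fixed point, iterated from the empty set
        current = set()
        while True:
            nxt = sat(phi[2], tree, rel, {**val, phi[1]: current})
            if nxt == current:
                return current
            current = nxt
    raise ValueError(op)

# Hypothetical tree r(x(y), z) where only y is labeled a.
tree = {"r": "b", "x": "b", "y": "a", "z": "b"}
rel = {"ch": {("r", "x"), ("r", "z"), ("x", "y")}}
# mu X. (a or <ch> X): nodes that are labeled a or have a descendant labeled a.
has_a_below = ("mu", "X", ("or", ("label", "a"), ("dia", "ch", ("var", "X"))))

print(sorted(sat(has_a_below, tree, rel, {})))
```

The iteration grows the set one level at a time (first y, then x, then r), which is the behaviour of the monotone operator Fφ on a finite tree.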


Interpretation

(t, v , s) |= µX .φ(X ) iff s ∈ S where S is the least fix point of F

Problem: is there a least fix point?

The function P 7→ {s ′ | (t, v [P/X ], s ′) |= φ} is monotonicallyincreasing because X can only appear positively in µX .φ

◮ Fa,F⊤,F⊥,FY are constant◮ FX (P) = P is increasing◮ if Fφ and F ′

φ both are increasing (resp. decreasing), thenFφ∨φ′(P) = Fφ(P) ∪ Fφ′(P) is increasing (resp. decreasing)

◮ if Fφ is increasing (resp. decreasing) then F⋄(≺)φ is increasing (resp.decreasing)

PhDs+S lawek (Mostrare) Queries on Trees 2008 90 / 128

Page 128: Queries on TreesAutomata, logic, and XML [Nev02b, Nev02a] Automata for XML – a survey [Sch07] Effective Characterizations of Tree Logics [Boj08a] Treewalking automata [Boj08b] Books

Interpretation

(t, v , s) |= µX .φ(X ) iff s ∈ S where S is the least fix point of F

Problem: is there a least fix point?

The function P 7→ {s ′ | (t, v [P/X ], s ′) |= φ} is monotonicallyincreasing because X can only appear positively in µX .φ

◮ Fa,F⊤,F⊥,FY are constant◮ FX (P) = P is increasing◮ if Fφ and F ′

φ both are increasing (resp. decreasing), thenFφ∨φ′(P) = Fφ(P) ∪ Fφ′(P) is increasing (resp. decreasing)

◮ if Fφ is increasing (resp. decreasing) then F⋄(≺)φ is increasing (resp.decreasing)

◮ if Fφ is increasing (resp. decreasing), then F¬φ(P) = Fφ(P) isdecreasing (resp. increasing)

PhDs+S lawek (Mostrare) Queries on Trees 2008 90 / 128

Page 129: Queries on TreesAutomata, logic, and XML [Nev02b, Nev02a] Automata for XML – a survey [Sch07] Effective Characterizations of Tree Logics [Boj08a] Treewalking automata [Boj08b] Books

Interpretation

(t, v, s) |= µX.φ(X) iff s ∈ S, where S is the least fixed point of F_φ

Problem: is there a least fixed point?

The function F_φ : P ↦ {s′ | (t, v[P/X], s′) |= φ} is monotonically increasing because X can only appear positively in µX.φ. By the Knaster-Tarski theorem, a monotone function on the powerset of Domain(t) has a least fixed point. Monotonicity follows by induction on φ:

◮ F_a, F_⊤, F_⊥, F_Y (for Y ≠ X) are constant

◮ F_X(P) = P is increasing

◮ if F_φ and F_φ′ are both increasing (resp. decreasing), then F_{φ∨φ′}(P) = F_φ(P) ∪ F_φ′(P) is increasing (resp. decreasing)

◮ if F_φ is increasing (resp. decreasing), then F_{⋄(≺)φ} is increasing (resp. decreasing)

◮ if F_φ is increasing (resp. decreasing), then F_{¬φ}(P) = Domain(t) \ F_φ(P) is decreasing (resp. increasing)

◮ if F_φ is increasing, then F_{µX.φ}(P) is increasing


Unary and boolean queries

A formula φ from Lµ can be used as a unary query, which selects in t the nodes s such that

(t, ·, s) |= φ

A formula φ from Lµ can be used as a boolean query, which accepts a tree t iff

(t, ·, ε) |= φ

Example

Selects the nodes that are labelled by a or have a descendant labelled by a (i.e. the a-nodes and their ancestors):

µX.(a ∨ ⋄(≺ch)X).
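On a finite tree, such a unary query can be evaluated by iterating the fixed-point body from the empty set until nothing changes. The following Python sketch illustrates this for the fragment needed by the example (labels, variables, ∨, ⋄(≺ch), µ). The class names (`Label`, `Var`, `Or`, `DiamondCh`, `Mu`, `Tree`), the function `evaluate` and the sample tree are assumptions made for illustration and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Union

Node = str
NodeSet = FrozenSet[Node]


@dataclass(frozen=True)
class Label:
    name: str


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class DiamondCh:
    body: "Formula"


@dataclass(frozen=True)
class Mu:
    var: str
    body: "Formula"


Formula = Union[Label, Var, Or, DiamondCh, Mu]


@dataclass
class Tree:
    children: Dict[Node, List[Node]]  # every node appears as a key
    label: Dict[Node, str]


def evaluate(phi: Formula, t: Tree, v: Dict[str, NodeSet]) -> NodeSet:
    """Return the set of nodes s such that (t, v, s) |= phi."""
    if isinstance(phi, Label):
        return frozenset(s for s in t.label if t.label[s] == phi.name)
    if isinstance(phi, Var):
        return v[phi.name]
    if isinstance(phi, Or):
        return evaluate(phi.left, t, v) | evaluate(phi.right, t, v)
    if isinstance(phi, DiamondCh):
        body = evaluate(phi.body, t, v)
        return frozenset(
            s for s in t.children if any(c in body for c in t.children[s])
        )
    if isinstance(phi, Mu):
        # Iterate F_phi from the empty set; the chain is increasing because
        # this fragment has no negation, so it stabilises on a finite tree.
        current: NodeSet = frozenset()
        while True:
            nxt = evaluate(phi.body, t, {**v, phi.var: current})
            if nxt == current:
                return current
            current = nxt
    raise TypeError(f"unsupported formula: {phi!r}")


# Hypothetical example tree: eps has children 1 and 2, and 1 has child 1.1.
tree = Tree(
    children={"eps": ["1", "2"], "1": ["1.1"], "2": [], "1.1": []},
    label={"eps": "b", "1": "b", "2": "c", "1.1": "a"},
)
query = Mu("X", Or(Label("a"), DiamondCh(Var("X"))))
selected = evaluate(query, tree, {})
```

Here `selected` contains the a-labelled node and its two ancestors, while node 2 is left out. The Boolean query of the same formula would simply test whether the root belongs to `selected`.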


Expressiveness of boolean queries

Lµ[≺ch, ≺ns] cannot express first child...
◮ Lµ[≺ch, ≺ns, ≺fc]
◮ L^full_µ[≺ch, ≺ns]: one can use ⋄(≺⁻)φ, where s ≺⁻ s′ iff s′ ≺ s

Lµ[≺ch, ≺ns, ≺fc] = L^full_µ[≺ch, ≺ns] = MSO


Lµ[≺ch,≺ns,≺fc] ⊆ MSO

One can rewrite any Lµ[≺ch, ≺ns, ≺fc] formula into an MSO query as follows:

⟨a⟩(x) = label_a(x)

⟨X⟩(x) = x ∈ X

⟨⋄(≺ch)φ⟩(x) = ∃y. ch(x, y) ∧ ⟨φ⟩(y), ...

⟨µX.φ⟩(z) = ∃X (z ∈ X ∧ ∀x (x ∈ X ⇔ ⟨φ⟩(x)) ∧ ∀Y ((∀y (y ∈ Y ⇔ ⟨φ[Y/X]⟩(y))) ⇒ X ⊆ Y))

Finally, the whole Boolean query is ∃x (root(x) ∧ ⟨φ⟩(x))


MSO ⊆ Lµ[≺ch,≺ns,≺fc]

Given an MSO query, let A be an equivalent deterministic automaton. We can encode A with an Lµ[≺ch, ≺ns, ≺fc] formula.

Example

On ranked trees, with ≺ch1, ≺ch2:
Automaton: Q = {qa, qb}, QF = {qa}
a → qa, b → qb, f(qa, qb) → qa, f(qb, qa) → qb

µXa. a ∨ (f ∧ ⋄(≺ch1)Xa ∧ ⋄(≺ch2)(µXb. b ∨ (f ∧ ⋄(≺ch1)Xb ∧ ⋄(≺ch2)Xa)))
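To see what the formula has to capture, it can help to run the example automaton bottom-up on a few ranked trees. The Python sketch below does this for the rules a → qa, b → qb, f(qa, qb) → qa, f(qb, qa) → qb with final state qa. The tree encoding (a leaf is a string, an inner node is a triple) and the names `run` and `accepts` are assumptions made for illustration.

```python
from typing import Dict, Optional, Set, Tuple, Union

# A ranked tree is a leaf symbol ("a" or "b") or a triple ("f", left, right).
RankedTree = Union[str, Tuple[str, "RankedTree", "RankedTree"]]

LEAF_RULES: Dict[str, str] = {"a": "qa", "b": "qb"}
BINARY_RULES: Dict[Tuple[str, str, str], str] = {
    ("f", "qa", "qb"): "qa",
    ("f", "qb", "qa"): "qb",
}
FINAL: Set[str] = {"qa"}


def run(t: RankedTree) -> Optional[str]:
    """Return the state reached at the root of t, or None if the run is stuck."""
    if isinstance(t, str):
        return LEAF_RULES.get(t)
    symbol, left, right = t
    q_left, q_right = run(left), run(right)
    if q_left is None or q_right is None:
        return None
    return BINARY_RULES.get((symbol, q_left, q_right))


def accepts(t: RankedTree) -> bool:
    """A tree is accepted iff the (unique) run ends in a final state."""
    return run(t) in FINAL
```

For instance, `accepts(("f", "a", "b"))` holds because the children evaluate to qa and qb, whereas `("f", "a", "a")` has no applicable rule and is rejected. The set of nodes where the run reaches qa is what the variable Xa denotes in the formula.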


Expressiveness of unary queries

Lµ[≺ch, ≺ns, ≺fc] cannot express root...

we need to use L^full_µ[≺ch, ≺ns]

L^full_µ[≺ch, ≺ns] = MSO

Proofs: similar to Boolean queries, but with query automata instead


Complexities

Because trees are acyclic structures, model checking of L^full_µ[≺ch, ≺sb] can be done in O(|φ|² · |t|). This can be reduced to O(|φ| · |t|) for a subclass of Lµ that is as expressive as MSO.

Satisfiability of L^full_µ[≺ch, ≺sb] is in EXPTIME (with slightly better bounds for trees than in the general case)


Outline

5 µ-calculus

6 XPath

7 Temporal Logics


First-order modal logics on Unranked Trees

Strong links between:

XPath

Modal Logics (temporal, propositional...)

FO

→ remember the first slides about the model and FO
→ we won't talk about L-definability (i.e., given an automaton, is it equivalent to a formula of the logic L?). See [Boj08a] for a survey.


Binary vs Unranked Trees

FO-definable queries on binary trees?

“select trees with even number of nodes”: yes (always false)

“select trees with even number of a-nodes”: no

“select trees that have a leaf of even depth”: yes (zigzag technique)

It is not clear whether the last query is FO-definable on unranked trees.


XPath 1.0: a W3C recommendation (since 1999)

Example: /descendant::a[position() > last() * 0.5 or self::* = 100]

Features:

select nodes (monadic queries)

navigation through axes (child... following, preceding)

node test and filters: /ax1::ntst1[f1][f2[f3]]/...

context-sensitive functions (position, last...)

element types (element, attribute, instruction, comments)

arithmetic operators (+,−...)

data operators/comparators (string-length...)

aggregators (count, sum...)

identifiers functions...

type conversion functions...


XPath axes [Shi08]


XPath 1.0

first implementations: exponential time in the size of the query

PTIME combined complexity obtained in [GKP02, GKP03a]: O(|D|² · |Q|⁴) in time, O(|D|² · |Q|²) in space.

Questions:

linear time fragment?

expressiveness? links to other logics?


CoreXPath
The navigational core of XPath

defined by Gottlob, Koch and Pichler [GKP02, GKP03a]

restriction to navigation through axes, filters, and node tests

locpath ::= axis::ntst | axis::ntst[fexpr] | /locpath | locpath/locpath
fexpr ::= locpath | not fexpr | fexpr and fexpr | fexpr or fexpr
axis ::= self | ch | ch+ | ch∗ | ch⁻¹ | (ch⁻¹)+ | (ch⁻¹)∗ | ns+ | (ns⁻¹)+
ntst ::= a (a ∈ Σ) | ∗

the document order axes following and preceding are syntactic sugar:

following::ntst[fexpr] ≡ (ch⁻¹)∗::∗/ns+::∗/ch∗::ntst[fexpr]

preceding::ntst[fexpr] ≡ (ch⁻¹)∗::∗/(ns⁻¹)+::∗/ch∗::ntst[fexpr]
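A direct, set-based reading of this grammar can be sketched in Python: each step maps a set of context nodes to the set of nodes reached through an axis, kept only if they pass the node test and the optional filter. This is only an illustration and is not meant to match the complexity bounds of [GKP02, GKP03a]. The class `Tree`, the helpers `ch`, `ch_plus` and `step`, and the sample tree are hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Optional, Set


class Tree:
    def __init__(self, children: Dict[str, List[str]],
                 label: Dict[str, str], root: str) -> None:
        self.children = children  # every node appears as a key
        self.label = label
        self.root = root


def ch(t: Tree, n: str) -> Set[str]:
    """Axis ch: the children of n."""
    return set(t.children[n])


def ch_plus(t: Tree, n: str) -> Set[str]:
    """Axis ch+: the proper descendants of n."""
    result: Set[str] = set()
    stack = list(t.children[n])
    while stack:
        m = stack.pop()
        result.add(m)
        stack.extend(t.children[m])
    return result


Axis = Callable[[Tree, str], Set[str]]
Filter = Callable[[Tree, str], bool]


def step(t: Tree, context: Set[str], axis: Axis, ntst: str,
         fexpr: Optional[Filter] = None) -> Set[str]:
    """Evaluate axis::ntst[fexpr] from every node of the context set."""
    out: Set[str] = set()
    for n in context:
        for m in axis(t, n):
            if ntst != "*" and t.label[m] != ntst:
                continue
            if fexpr is not None and not fexpr(t, m):
                continue
            out.add(m)
    return out


def example_query(t: Tree) -> Set[str]:
    """Evaluate /ch+::a[ch::b]/ch::c starting from the root."""
    def has_b_child(tree: Tree, n: str) -> bool:
        return bool(step(tree, {n}, ch, "b"))

    a_nodes = step(t, {t.root}, ch_plus, "a", has_b_child)
    return step(t, a_nodes, ch, "c")


# Hypothetical tree: n1 (a) has children n2 (b) and n3 (c);
# n4 (a) has the single child n5 (c), so it fails the filter [ch::b].
sample = Tree(
    children={"r": ["n1", "n4"], "n1": ["n2", "n3"], "n2": [], "n3": [],
              "n4": ["n5"], "n5": []},
    label={"r": "d", "n1": "a", "n2": "b", "n3": "c", "n4": "a", "n5": "c"},
    root="r",
)
answer = example_query(sample)
```

On the sample tree only n3 is selected: n5 is also a c-child of an a-node, but its parent has no b-child.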


CoreXPath complexity [GKP03b]

query evaluation becomes linear: O(|D| · |Q|)

it is P-hard wrt. combined complexity...

... even when t is limited to depth 3 and only the axes ch, ch⁻¹, ch∗ are allowed

Positive-CoreXPath is LOGCFL-complete

satisfiability is EXPTIME-complete


CoreXPath expressiveness

CoreXPath ⊆ FO

CoreXPath: /ch+::a [ch::b] /ch::c

variables: y (for the step ch+::a), z (for the filter ch::b), x (for the step ch::c)

φ(x) = ∃y. label_a(y) ∧ ∃z. label_b(z) ∧ label_c(x) ∧ ch(y, z) ∧ ch(y, x)

FO ⊈ CoreXPath

example: select the root if the leaf language is (ab)∗.

in fact, CoreXPath = FO²₁ [Mar05b]


Expressiveness

[Diagram. Legend: three edge styles between A and B denote A ⊊ B, A ⊆ B and A ⊈ B.]

FO₁

FO²₁ = CoreXPath


CondXPath [Mar04]
Conditional XPath

CondXPath = CoreXPath
+ axes: ns, ns∗, ns⁻¹, (ns⁻¹)∗
+ until operator: (axis::ntst[fexpr])+ with axis ∈ {ch, ch⁻¹, ns, ns⁻¹}

CondXPath has the same complexity as CoreXPath (for both query evaluation and satisfiability).

CondXPath ⊆ FO

For instance, (ch::a[ns∗::b])+ translates to the FO formula:
φ(x, y) = ch∗(x, y) ∧ x ≠ y ∧ ¬(∃s. ch∗(x, s) ∧ s ≠ x ∧ ch∗(s, y) ∧ (¬label_a(s) ∨ ¬∃s′. ns∗(s, s′) ∧ label_b(s′)))


CondXPath [Mar04]
Expressiveness

How to prove that FO ⊆ CondXPath?
Marx uses an intermediate logic: Xuntil.

[Diagram: CondXPath ⊆ FO (previous slide); step 1: from Xuntil to CondXPath; step 2: from FO to Xuntil.]


Xuntil

Syntax

ϕ ::≡ a | ⊤ | ¬ϕ | ϕ ∧ ϕ′ | θ(ϕ, ϕ′)   (a ∈ Σ, θ ∈ {⇓, ⇐, ⇒, ⇑})

Arrows are interpreted as transitive closures of the corresponding axes.

Semantics

(t, π) |= a iff label^t_a(π)
(t, π) |= ¬ϕ iff (t, π) ⊭ ϕ
(t, π) |= ϕ ∧ ϕ′ iff (t, π) |= ϕ and (t, π) |= ϕ′
(t, π) |= θ(ϕ, ϕ′) iff there exists π′ s.t. θ+(π, π′) and (t, π′) |= ϕ, and for all π′′ s.t. θ+(π, π′′) and θ+(π′′, π′), (t, π′′) |= ϕ′


1. From Xuntil to CondXPath

Xuntil → CondXPath

r(a) = self::a
r(¬ϕ) = not r(ϕ)
r(ϕ ∧ ϕ′) = r(ϕ) and r(ϕ′)
r(θ(ϕ, ϕ′)) = θ::∗[r(ϕ)] or (θ::∗[r(ϕ′)])+/θ::∗[r(ϕ)]
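The translation r is a plain structural recursion, which the following Python sketch spells out as a function from Xuntil syntax trees to CondXPath strings. The AST classes (`Label`, `Not`, `And`, `Until`), the ASCII axis spellings passed as `theta` (for example "ch" or "ns") and the extra parentheses around sub-expressions are assumptions made for illustration. The case ⊤ is left out because the slide gives no clause for it.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Label:
    name: str


@dataclass(frozen=True)
class Not:
    body: "Xuntil"


@dataclass(frozen=True)
class And:
    left: "Xuntil"
    right: "Xuntil"


@dataclass(frozen=True)
class Until:
    theta: str          # one-step axis name, e.g. "ch" (assumed spelling)
    target: "Xuntil"    # phi: must hold at the node that is reached
    invariant: "Xuntil" # phi': must hold at every node strictly in between


Xuntil = Union[Label, Not, And, Until]


def r(phi: Xuntil) -> str:
    """Translate an Xuntil formula into a CondXPath filter expression."""
    if isinstance(phi, Label):
        return f"self::{phi.name}"
    if isinstance(phi, Not):
        return f"not ({r(phi.body)})"
    if isinstance(phi, And):
        return f"({r(phi.left)}) and ({r(phi.right)})"
    if isinstance(phi, Until):
        target = r(phi.target)
        invariant = r(phi.invariant)
        axis = phi.theta
        # Either the target is one step away, or we walk through
        # invariant-nodes first and then take one last step to the target.
        return (f"{axis}::*[{target}] or "
                f"({axis}::*[{invariant}])+/{axis}::*[{target}]")
    raise TypeError(f"unsupported formula: {phi!r}")


example = r(Until("ch", Label("a"), Label("b")))
```

Since r(ϕ) is copied twice in the until clause, nested until operators can make the output grow quickly in this naive form.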


2. From FO to Xuntil
Separation technique

Theorem ([GHR94], adapted in [Mar04])

If every Xuntil formula is separable over trees, then Xuntil is FO-expressive.

“ϕ separable” means: equivalent to a Boolean combination of pure past/present/future/left/right formulas


2. From FO to Xuntil
Separation technique

Theorem ([Mar04])

Each Xuntil formula is separable.

Query rewriting... with blowup.


Alternative proof

Theorem ([Mar05b])

any expansion of CoreXPath which is closed under complementation is FO-expressive

CondXPath is closed under complementation


Expressiveness

[Diagram. Legend: three edge styles between A and B denote A ⊊ B, A ⊆ B and A ⊈ B.]

FO₁ = CondXPath

FO²₁ = CoreXPath


RegularXPath≈ [tC06]

RegularXPath = CoreXPath
+ axes: ns, ns∗, ns⁻¹, (ns⁻¹)∗
+ transitive closure: (RegularXPath expression)∗

RegularXPath≈ = RegularXPath
+ loop predicate: [loop(ϕ)]_t = {π ∈ D_t | (π, π) ∈ [ϕ]_t}

Both have PTIME combined complexity for query evaluation.


RegularXPath≈ [tC06]
Expressiveness

Theorem

RegularXPath≈ and FO + TC¹ have the same expressive power.

In a preceding section, we saw that FO + TC¹ is strictly less expressive than MSO [tCS08].

Corollary

The class of binary relations definable in RegularXPath≈ is closed under intersection and complementation.

It is only conjectured that adding loop increases expressivity, i.e., that RegularXPath ⊊ RegularXPath≈.


Expressiveness

[Diagram. Legend: three edge styles between A and B denote A ⊊ B, A ⊆ B and A ⊈ B. Edge labels: [BC05, BSSS06], [tCS08].]

MSO

FO + TC¹ = RegularXPath≈

FO = CondXPath

FO² = CoreXPath


RegularXPath variants

µRegularXPath adds a fixed-point operator [tC06] → MSO

RegularXPath(W) adds a “subtree relativisation operator” [tCS08]

Beware: RegularXPath(W) = FO + TC¹_p, whereas RegularXPath≈ = FO + TC¹. Recall that it is not known whether the inclusion FO + TC¹ ⊆ FO + TC¹_p is strict.


XPath 2.0

XPath 2.0 adds the following features to XPath:

for loops: for $i in R return S

Boolean intersection (intersect) and complementation (except) on path expressions

variables: n-ary queries

node comparison tests (is)


CoreXPath 2 [tCM07]

for loops are interpreted as sets of nodes, not sequences

no positional/aggregate functions: position(), last(), count()

no value comparison operators

adding the last 2 features leads to undecidability.

equivalence of CoreXPath 2 queries is decidable.

of course, CoreXPath 2 is FO-expressive (adding except to CoreXPath is already sufficient).

CoreXPath 2 ↔ FO translations in linear time


Outline

5 µ-calculus

6 XPath

7 Temporal Logics


Preliminaries: Linear Temporal Logic (LTL)

Syntax

ϕ ::≡ a | ¬ϕ | ϕ ∨ ψ | Xϕ | X− ϕ | ϕU ψ | ϕ S ψ

Semantics

Structure: s = s_0 s_1 ··· s_n, a string over Σ
Interpretation: (s, i) |= ϕ (ϕ is satisfied in s at position i)
label: (s, i) |= a iff s_i = a (i.e. label_s(i) = a)
next: (s, i) |= Xϕ iff (s, i + 1) |= ϕ
prev: (s, i) |= X⁻ϕ iff (s, i − 1) |= ϕ
until: (s, i) |= ϕ U ψ iff ∃j ≥ i. (s, j) |= ψ ∧ ∀k ∈ {i, …, j − 1}. (s, k) |= ϕ
since: (s, i) |= ϕ S ψ iff ∃j ≤ i. (s, j) |= ψ ∧ ∀k ∈ {j + 1, …, i}. (s, k) |= ϕ
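These clauses translate almost literally into a recursive evaluator over finite strings. The Python sketch below is one such reading. The class names (`Atom`, `Not`, `Or`, `Next`, `Prev`, `Until`, `Since`) and the function `holds` are assumptions made for illustration, and so is the convention that no formula holds at a position outside the string, which the slide leaves implicit.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    letter: str


@dataclass(frozen=True)
class Not:
    body: "LTL"


@dataclass(frozen=True)
class Or:
    left: "LTL"
    right: "LTL"


@dataclass(frozen=True)
class Next:
    body: "LTL"


@dataclass(frozen=True)
class Prev:
    body: "LTL"


@dataclass(frozen=True)
class Until:
    left: "LTL"   # phi in "phi U psi"
    right: "LTL"  # psi in "phi U psi"


@dataclass(frozen=True)
class Since:
    left: "LTL"   # phi in "phi S psi"
    right: "LTL"  # psi in "phi S psi"


LTL = Union[Atom, Not, Or, Next, Prev, Until, Since]


def holds(s: str, i: int, phi: LTL) -> bool:
    """Decide (s, i) |= phi; positions outside s satisfy nothing (assumed)."""
    if not 0 <= i < len(s):
        return False
    if isinstance(phi, Atom):
        return s[i] == phi.letter
    if isinstance(phi, Not):
        return not holds(s, i, phi.body)
    if isinstance(phi, Or):
        return holds(s, i, phi.left) or holds(s, i, phi.right)
    if isinstance(phi, Next):
        return holds(s, i + 1, phi.body)
    if isinstance(phi, Prev):
        return holds(s, i - 1, phi.body)
    if isinstance(phi, Until):
        return any(
            holds(s, j, phi.right)
            and all(holds(s, k, phi.left) for k in range(i, j))
            for j in range(i, len(s))
        )
    if isinstance(phi, Since):
        return any(
            holds(s, j, phi.right)
            and all(holds(s, k, phi.left) for k in range(j + 1, i + 1))
            for j in range(0, i + 1)
        )
    raise TypeError(f"unsupported formula: {phi!r}")


a_until_b = Until(Atom("a"), Atom("b"))
positions = [i for i in range(len("aab")) if holds("aab", i, a_until_b)]
```

The list `positions` is the answer of the unary query for a U b on the string aab. The evaluator recomputes subformulas and is meant as a readable specification rather than an efficient algorithm.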


Querying and expressivity

LTL Boolean queries

QA(ϕ, s) = true ⇔ (s, 0) |= ϕ

LTL unary queries

QA(ϕ, s) = {i ∈ {0, …, |s| − 1} : (s, i) |= ϕ}

Kamp’s Theorem.

Over strings, LTL = FO


Tree Temporal Logic TLtree

Syntax

ϕ ::≡ a | ¬ϕ | ϕ ∨ ψ | Xθ ϕ | X−θ ϕ | ϕUθ ψ | ϕ Sθ ψ (θ ∈ {↓,←})

Semantics

(t, π) |= ϕ reads “ϕ is satisfied in t at node π”
(t, π) |= a iff label_t(π) = a
(t, π) |= X↓ϕ iff ∃π′ such that π ↓ π′ and (t, π′) |= ϕ
etc.

Theorem [Mar05a]

Over unranked ordered trees, TLtree = FO (Boolean and unary queries)


Computation tree logic CTL∗_past

Syntax

Node formulas: Φ ::≡ a | ¬Φ | Φ ∨ Ψ | E↓ ϕ | E→ ϕ

Path formulas: ϕ ::≡ Φ? | ¬ϕ | ϕ ∨ ψ | Xϕ | X⁻ϕ | ϕ U ψ | ϕ S ψ

Semantics

(t, π) |= Φ reads “Φ is satisfied in t at node π”
(t, π) |= E↓ϕ iff there is a path π_1 ↓ ··· ↓ π_{i−1} ↓ π ↓ π_{i+1} ↓ ··· ↓ π_k such that (π_1 ··· π_k, i) |=p ϕ, where:
(π_1 ··· π_k, i) |=p Φ? iff (t, π_i) |= Φ
etc.

Theorem [BL05b]

Over unranked ordered trees, CTL∗past = FO (Boolean and unary queries)


Propositional Dynamic Logic for trees PDLtree [ABD+05]

Syntax

Path formulas:
σ ::≡ ← | → | ↓ | ↑ | σ/σ′ | σ ∪ σ′ | σ∗ | ϕ?

Propositions:
ϕ ::≡ a | ¬ϕ | ϕ ∨ ψ | Xσ ϕ

Semantics

σ defines a binary relation [[σ]]t on the nodes of t
(t, π) |= Xσ ϕ iff ∃π′ such that π [[σ]]t π′ and (t, π′) |= ϕ
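The relational semantics can be made concrete by representing [[σ]]t as a set of node pairs and implementing each path constructor as an operation on such sets. The Python sketch below does so. The function names (`inverse`, `compose`, `star`, `filter_rel`, `diamond`, `base_relations`) and the sample tree are assumptions made for illustration; union of path formulas is simply set union (`|`).

```python
from typing import Dict, List, Set, Tuple

Rel = Set[Tuple[str, str]]


def inverse(r: Rel) -> Rel:
    return {(b, a) for (a, b) in r}


def compose(r1: Rel, r2: Rel) -> Rel:
    """sigma/sigma': follow r1, then r2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}


def star(r: Rel, nodes: Set[str]) -> Rel:
    """sigma*: reflexive-transitive closure of r over the given nodes."""
    closure: Rel = {(n, n) for n in nodes}
    while True:
        bigger = closure | compose(closure, r)
        if bigger == closure:
            return closure
        closure = bigger


def filter_rel(phi_nodes: Set[str]) -> Rel:
    """phi?: stay in place, but only on nodes satisfying phi."""
    return {(n, n) for n in phi_nodes}


def diamond(sigma: Rel, phi_nodes: Set[str]) -> Set[str]:
    """X_sigma phi: nodes that have a sigma-successor satisfying phi."""
    return {a for (a, b) in sigma if b in phi_nodes}


def base_relations(children: Dict[str, List[str]]) -> Dict[str, Rel]:
    """Build the four atomic relations from ordered child lists."""
    down: Rel = {(p, c) for p, cs in children.items() for c in cs}
    right: Rel = {
        (cs[i], cs[i + 1])
        for cs in children.values()
        for i in range(len(cs) - 1)
    }
    return {"down": down, "up": inverse(down),
            "right": right, "left": inverse(right)}


# Hypothetical tree: r has children u, v, w (in this order); u has child x.
children = {"r": ["u", "v", "w"], "u": ["x"], "v": [], "w": [], "x": []}
nodes = set(children)
a_nodes = {"v"}  # the nodes labelled a
rel = base_relations(children)

# X_{right*} a: nodes with an a-labelled sibling at or to the right.
right_star_a = diamond(star(rel["right"], nodes), a_nodes)

# down/down*/a?: pairs (ancestor, a-labelled proper descendant).
to_a_descendant = compose(
    compose(rel["down"], star(rel["down"], nodes)), filter_rel(a_nodes)
)
```

On the sample tree, `right_star_a` contains u and v, and `to_a_descendant` relates r to v only. The closure computation is quadratic or worse and is meant to mirror the definition, not to be efficient.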


Expressivity of PDLtree [ABD+05]

Theorem

PDLtree is equivalent to Regular XPath.

Theorem

PDLtree restricted to

σ ::≡ ← | → | ↓ | ↑ | σ∗ | σ/ϕ?

is equivalent to Conditional XPath, which is equivalent to FO.

Theorem

PDLtree restricted to

σ ::≡ ← | → | ↓ | ↑ | σ∗

is equivalent to Core XPath, which is equivalent to FO².


References

PhDs+S lawek (Mostrare) Queries on Trees 2008 128 / 128

Page 188: Queries on TreesAutomata, logic, and XML [Nev02b, Nev02a] Automata for XML – a survey [Sch07] Effective Characterizations of Tree Logics [Boj08a] Treewalking automata [Boj08b] Books

[ABD+05] Loredana Afanasiev, Patrick Blackburn, Ioanna Dimitriou, Bertrand Gaiffe, Evan Goris, Maarten Marx, and Maarten de Rijke. PDL for ordered trees. Journal of Applied Non-Classical Logics, 15(2):115–135, 2005.

[ABL07] Marcelo Arenas, Pablo Barceló, and Leonid Libkin. Combining temporal logics for querying XML documents. In Proceedings of International Conference on Database Theory, volume 4353 of Lecture Notes in Computer Science, pages 359–374. Springer Verlag, 2007.

[AHV95] Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. 1995.

[AU71] A. V. Aho and J. D. Ullman. Translations on a context-free grammar. Information and Control, 19:439–475, 1971.

[BC04] Mikołaj Bojańczyk and Thomas Colcombet. Tree-walking automata cannot be determinized. In 31st International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, pages 246–256. Springer Verlag, 2004.

[BC05] Mikołaj Bojańczyk and Thomas Colcombet. Tree-walking automata do not recognize all regular languages. In 37th Annual ACM Symposium on Theory of Computing, pages 234–243, New York, NY, USA, 2005. ACM-Press.

[BDM+06] Mikołaj Bojańczyk, Claire David, Anca Muscholl, Thomas Schwentick, and Luc Segoufin. Two-variable logic on data trees and XML reasoning. In Twenty-fifth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 10–19, 2006.

[BKS02] Nicolas Bruno, Nick Koudas, and Divesh Srivastava. Holistic twig joins: optimal XML pattern matching. In SIGMOD Conference, pages 310–321, 2002.

[BKW98] Anne Brüggemann-Klein and Derick Wood. One-unambiguous regular languages. Information and Computation, 142(2):182–206, May 1998.

[BKW00] Anne Brüggemann-Klein and Derick Wood. Caterpillars: A context specification technique. Markup Languages, 2(1):81–106, 2000.

[BKWM01] Anne Brüggemann-Klein, Derick Wood, and Makoto Murata. Regular tree and regular hedge languages over unranked alphabets: Version 1, April 07 2001.

[BL05a] Pablo Barceló and Leonid Libkin. Temporal logics over unranked trees. In Proceedings of the IEEE Symposium on Logic in Computer Science, pages 31–40, 2005.

[BL05b] Pablo Barceló and Leonid Libkin. Temporal logics over unranked trees. In 20th Annual IEEE Symposium on Logic in Computer Science, pages 31–40. IEEE Comp. Soc. Press, 2005.


[BMS+06] Mikołaj Bojańczyk, Anca Muscholl, Thomas Schwentick, Luc Segoufin, and Claire David. Two-variable logic on words with data. In 21st Annual IEEE Symposium on Logic in Computer Science, pages 7–16. IEEE Comp. Soc. Press, 2006.

[Boj04] Mikołaj Bojańczyk. Decidable Properties of Tree Languages. PhD thesis, Warsaw University, 2004.

[Boj08a] Mikołaj Bojańczyk. Effective characterizations of tree logics, 2008. PODS’08 Keynote.

[Boj08b] Mikołaj Bojańczyk. Tree-walking automata. Tutorial at LATA’08, 2008.

[BS05] Michael Benedikt and Luc Segoufin. Regular tree languages definable in FO and FOmod. In 22nd International Symposium on Theoretical Aspects of Computer Science, volume 3404 of Lecture Notes in Computer Science, pages 327–339. Springer Verlag, 2005.

[BSSS06] Mikołaj Bojańczyk, Mathias Samuelides, Thomas Schwentick, and Luc Segoufin. Expressive power of pebble automata. In International Colloquium on Automata Languages and Programming (ICALP’06), Lecture Notes in Computer Science, pages 157–168. Springer Verlag, 2006.

[CLT+06] Songting Chen, Hua-Gang Li, Jun’ichi Tatemura, Wang-Pin Hsiung, Divyakant Agrawal, and K. Selçuk Candan. Twig2Stack: Bottom-up processing of generalized-tree-pattern queries over XML documents. In VLDB, pages 283–294, 2006.

[CM01] James Clark and Murata Makoto. RELAX NG specification, 2001.

[CNT04] Julien Carme, Joachim Niehren, and Marc Tommasi. Querying unranked trees with stepwise tree automata. In 19th International Conference on Rewriting Techniques and Applications, volume 3091 of Lecture Notes in Computer Science, pages 105–118. Springer Verlag, 2004.

[EF99] H. Ebbinghaus and J. Flum. Finite Model Theory. Springer Verlag, Berlin, 1999.

[EH99] Joost Engelfriet and Hendrik Jan Hoogeboom. Tree-walking pebble automata. In Juhani Karhumäki, Hermann A. Maurer, Gheorghe Păun, and Grzegorz Rozenberg, editors, Jewels are Forever, Contributions on Theoretical Computer Science in Honor of Arto Salomaa, pages 72–83, London, UK, 1999. Springer-Verlag.

[EH06] Joost Engelfriet and Hendrik Jan Hoogeboom. Nested pebbles and transitive closure. In 23rd Symposium on Theoretical Aspects of Computer Science, volume 3884 of Lecture Notes in Computer Science, pages 477–488. Springer Verlag, 2006.

[EHS07] Joost Engelfriet, Hendrik Jan Hoogeboom, and Bart Samwel. XML transformation by tree-walking transducers with invisible pebbles. In PODS ’07: Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 63–72, New York, NY, USA, 2007. ACM.

[EI95] Kousha Etessami and Neil Immerman. Reachability and the power of local ordering. Theoretical Computer Science, 148(2):227–260, 1995.

[Fag75] Ronald Fagin. Monadic generalized spectra. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 21:89–96, 1975.


[FG02] Markus Frick and Martin Grohe. The complexity of first-order and monadic second-order logic revisited. In LICS ’02: Proceedings of the 17th Annual IEEE Symposium on Logic in Computer Science, pages 215–224, Washington, DC, USA, 2002.

[FGK03] Markus Frick, Martin Grohe, and Christoph Koch. Query evaluation on compressed trees. In 18th IEEE Symposium on Logic in Computer Science, pages 188–197, 2003.

[GHR94] D.M. Gabbay, I. Hodkinson, and M. Reynolds. Temporal Logic (Volume 1: Mathematical Foundations and Computational Aspects). Oxford Science Publications, 1994.

[GK04] Georg Gottlob and Christoph Koch. Monadic datalog and the expressive power of languages for web information extraction. Journal of the ACM, 51(1):74–113, 2004.

[GKP02] Georg Gottlob, Christoph Koch, and Reinhard Pichler. Efficient algorithms for processing XPath queries. In 28th International Conference on Very Large Data Bases, pages 95–106, Hong Kong, 2002.

[GKP03a] G. Gottlob, C. Koch, and R. Pichler. XPath query evaluation: Improving time and space efficiency. In Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE 03), 2003.

[GKP03b] Georg Gottlob, Christoph Koch, and Reinhard Pichler. The complexity of XPath query evaluation. In 22nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 179–190, 2003.

[GKS04] Georg Gottlob, Christoph Koch, and Klaus U. Schulz. Conjunctive queries over trees. In Proceedings of the 23rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 189–200, New York, NY, USA, 2004. ACM-Press.

[GO99] Erich Grädel and Martin Otto. On logics with two variables. Theoretical Computer Science, 224:73–113, 1999.

[HP03] Haruo Hosoya and Benjamin Pierce. Regular expression pattern matching for XML. Journal of Functional Programming, 13(6):961–1004, 2003.

[Imm82] Neil Immerman. Upper and lower bounds for first order expressibility. Journal of Computer and System Sciences, 25:76–98, 1982.

[JLH+07] Zhewei Jiang, Cheng Luo, Wen-Chi Hou, Qiang Zhu, and Dunren Che. Efficient processing of XML twig pattern: A novel one-phase holistic solution. In DEXA, pages 87–97, 2007.


[Kep06] Stephan Kepser. Properties of binary transitive closure logics over trees. In 11th conference on Formal Grammar, 2006.

[Lib04] Leonid Libkin. Elements of Finite Model Theory. Springer Verlag, 2004.

[Lib06] Leonid Libkin. Logics over unranked trees: an overview. Logical Methods in Computer Science, 3(2):1–31, 2006.

[Mar04] Maarten Marx. Conditional XPath, the first order complete XPath dialect. In ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 13–22. ACM-Press, 2004.

[Mar05a] Maarten Marx. Conditional XPath. ACM Transactions on Database Systems, 30(4):929–959, 2005.


[Mar05b] Maarten Marx. First order paths in ordered trees. In International Conference on Database Theory, pages 114–128, 2005.

[MLM01] M. Murata, D. Lee, and M. Mani. Taxonomy of XML schema languages using formal language theory. In Extreme Markup Languages, Montreal, Canada, 2001.

[MNS04] Wim Martens, Frank Neven, and Thomas Schwentick. Complexity of decision problems for simple regular expressions. In Mathematical Foundations of Computer Science 2004, 29th International Symposium, pages 889–900, 2004.

[MNSB06] Wim Martens, Frank Neven, Thomas Schwentick, and Geert Jan Bex. Expressiveness and complexity of XML schema. ACM Transactions on Database Systems, 31(3):770–813, 2006.

[Mor94] Etsuro Moriya. On two-way tree automata. Inf. Process. Lett., 50(3):117–121, 1994.

[MS02] Gerome Miklau and Dan Suciu. Containment and equivalence for an XPath fragment. In PODS, pages 65–76, 2002.

[MSS06] Anca Muscholl, Mathias Samuelides, and Luc Segoufin. Complementing deterministic tree-walking automata. Information Processing Letters, 99(1):33–39, 2006.

[Nev02a] Frank Neven. Automata, logic, and XML. In Computer Science Logic, Lecture Notes in Computer Science, pages 2–26. Springer Verlag, 2002.

[Nev02b] Frank Neven. Automata theory for XML researchers. SIGMOD Rec., 31(3):39–46, 2002.

[NPTT05] Joachim Niehren, Laurent Planque, Jean-Marc Talbot, and Sophie Tison. N-ary queries by tree automata. In 10th International Symposium on Database Programming Languages, volume 3774 of Lecture Notes in Computer Science, pages 217–231. Springer Verlag, September 2005.

[NS99] Frank Neven and Thomas Schwentick. Query automata. In Proceedings of the Eighteenth ACM Symposium on Principles of Database Systems, pages 205–214, 1999.

[NS02] Frank Neven and Thomas Schwentick. Query automata over finite trees. Theoretical Computer Science, 275(1-2):633–674, 2002.

[Sch07] Thomas Schwentick. Automata for XML: a survey. Journal of Computer and System Sciences, 73(3):289–315, 2007.

[Shi08] John W. Shipman. XSLT Reference. 2008.

[Sto74] L. J. Stockmeyer. The Complexity of Decision Problems in Automata Theory. PhD thesis, Department of Electrical Engineering, MIT, 1974.

[tC06] Balder ten Cate. The expressiveness of XPath with transitive closure. In 25th ACM SIGMOD-SIGACT Symposium on Principles of Database Systems. ACM-Press, 2006.

[tCM07] Balder ten Cate and Maarten Marx. Axiomatizing the logical core of XPath 2.0. In International Conference on Database Theory, 2007.

[tCS08] Balder ten Cate and Luc Segoufin. XPath, transitive closure logic, and nested tree walking automata. In 27th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 2008.

[TK06] Hans-Jörg Tiede and Stephan Kepser. Monadic second-order logic and transitive closure logics over trees. In 13th Workshop on Logic, Language, Information and Computation, volume 165 of Electronic Notes in Theoretical Computer Science, pages 189–199. Elsevier, 2006.

[TW68] J. W. Thatcher and J. B. Wright. Generalized finite automata with an application to a decision problem of second-order logic. Mathematical Systems Theory, 2:57–82, 1968.

[Var82] Moshe Y. Vardi. The complexity of relational query languages. In 14th ACM Symposium on Theory of Computing, pages 137–146, 1982.


[Var95] Moshe Y. Vardi. On the complexity of bounded-variable queries. In Fourteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 266–276, 1995.
