www.kmonos.net
The Complexity of Tree Transducer Output Languages
NII Logic Seminar 2009/04/15
Kazuhiro Inaba @ NII
Joint work with Sebastian Maneth @ NICTA & UNSW
Table of Contents
Overview
  The Problem Statement
  Key Lemma: “Garbage-Free Form”
  Results and Applications
The Proof
(Other Topics in my Thesis…?)
Preliminaries
Σ, Δ, Γ, …: ranked finite sets
  Each symbol in Σ, Δ, … is associated with a natural number called the “rank” of the symbol
TΣ = the set of trees over Σ
  TΣ ::= σ(TΣ, …, TΣ) with k subtrees, where rank(σ) = k
Example: Σ = {a(2), b(1), c(0)}, a(b(c), c) ∈ TΣ
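A minimal sketch of the definition above, assuming trees are encoded as nested Python tuples `(symbol, child, …, child)`. The encoding and the names `RANK` and `is_tree` are illustrative and not part of the talk.

```python
# Ranked alphabet from the slide's example: a has rank 2, b rank 1, c rank 0.
RANK = {"a": 2, "b": 1, "c": 0}

def is_tree(t, rank=RANK):
    """Check that t = (symbol, child, ..., child) respects the ranks."""
    symbol, *children = t
    return (symbol in rank
            and len(children) == rank[symbol]
            and all(is_tree(child, rank) for child in children))

example = ("a", ("b", ("c",)), ("c",))   # the tree a(b(c), c)
print(is_tree(example))
print(is_tree(("a", ("c",))))            # a needs two children
```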
Preliminaries
τ ⊆ TΣ×TΔ is called a (tree-to-tree) translation
For τ1 ⊆ TΣ×TΓ and τ2 ⊆ TΓ×TΔ, the sequential composition is defined as follows:
  τ1 ; τ2 = {(s, t) | (s, r) ∈ τ1, (r, t) ∈ τ2}
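For finite relations, sequential composition can be written down directly. The sketch below assumes a translation is given as a Python set of (input, output) pairs; real tree translations are usually infinite, so this only illustrates the definition. The name `compose` and the sample pairs are made up.

```python
def compose(tau1, tau2):
    """tau1 ; tau2 = {(s, t) | (s, r) in tau1, (r, t) in tau2}."""
    return {(s, t) for (s, r) in tau1 for (r2, t) in tau2 if r == r2}

# Hypothetical toy relations; strings stand in for trees.
tau1 = {("s1", "r1"), ("s2", "r2")}
tau2 = {("r1", "t1"), ("r1", "t2")}
print(sorted(compose(tau1, tau2)))
```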
“Complexity of Output Languages”
Given…
  A tree-to-tree translation τ ⊆ TΣ×TΔ
How complex is the set range(τ) ⊆ TΔ ?
  (i.e., for a tree t ∈ TΔ, how computationally hard is it to determine whether t ∈ range(τ) or not?)
Classic Results
τ: program of a Turing machine
  Membership in range(τ) is undecidable
τ: nondeterministic finite-state string transduction (FST)
  range(τ) is regular!
  Membership in range(τ) can be decided in O(n) time and O(1) space
  The same holds for τ ∈ finitely many compositions of nondeterministic FSTs
[Figure: example finite-state transducer with input/output edge labels]
Our Result
What are Macro Tree Transducers?
  A relatively powerful (yet terminating) model of tree translation
  The formal definition will be explained soon…
τ: composition of nondeterministic Macro Tree Transducers
  The membership problem of range(τ) is NP-complete and in DSPACE(n).
Actually, we’ve shown the “Garbage-Free Form”
For any composition τ = τ1 ; τ2 ; … ; τn ⊆ TΣ×TΔ of MTTs, there exists an equivalent one τ = ρ0 ; ρ1 ; … ; ρ_{2n} such that
  dom(ρ0) = TΣ and range(ρ0) is regular
  For any (s0, t) ∈ τ, there are s1, …, s_{2n} such that
    (s_i, s_{i+1}) ∈ ρ_i, (s_{2n}, t) ∈ ρ_{2n}, |s_{i+1}| ≤ 2|s_{i+2}|, |s_{2n}| ≤ 2|t|
Actually, we’ve shown the “Garbage-Free Form”
In each step of translation (except the first step), the size of the tree always becomes larger (ignoring the constant factor “2”)
[Figure: the chain s0 --τ1--> s1 --τ2--> s2 … s_{n-1} --τn--> t is replaced by s0 --ρ0--> s1 --ρ1--> s2 … s_{2n} --ρ_{2n}--> t]
(Possible) Applications in Practice
Compositions of MTTs are known to be a good model for XML translations
  MTT* can represent good subclasses of XML-QL, XSLT, XQuery, …
Query optimization by GFF
Verification by range(MTT*) membership
Corollaries of GFF (Applications in Theory)
range(τ1 ; τ2 ; … ; τn) is in NP and in DSPACE(n)
Higher-order context-free languages are context-sensitive languages
  Details are in the next slide…
Chomsky Hierarchy [Chomsky1959]
Systems of finite description of string languages
Type-0 grammar: G = <N, Σ, S, R> where
  S ∈ N
  R is a set of rewrite rules of the form α → β with α, β ∈ (N∪Σ)*
  [G] = {w ∈ Σ* | S →* w}
Chomsky Hierarchy and Complexity
Type-0 (Computably Enumerable Languages)
Type-1 (Context-Sensitive Languages)
  αAβ → αγβ with A ∈ N, γ ≠ empty
  Context-sensitive iff recognizable in NSPACE(n) [Landweber63, Kuroda64]
Type-2 (Context-Free)
  A → α
  Proper subclass of PTIME [?]
Type-3 (Regular)
  A → sB or A → s with A, B ∈ N, s ∈ Σ
  Regular iff recognizable in DSPACE(1) [Folklore]
Context-Free Grammar = Level-0 Grammar
Example of a Type-2 grammar:
  S → 0 S 0, S → 1 S 1, S → 0, S → 1
  Palindromes over 0s and 1s of odd length
A → α
  Can be regarded as a nullary (nondeterministic) function whose output is a string
Natural Extension: Macro Grammar [Fischer68] = Level-1 Grammar
  S → A(0), A(x) → A(xx), A(x) → x
  Sequences of 0s of length 2^n
Nonterminals can be parameterized by strings, i.e., they are nondeterministic functions from strings to strings
[Aho68] Level-1 languages are context-sensitive (= NSPACE(n))
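The macro grammar S → A(0), A(x) → A(xx), A(x) → x can be simulated by a short loop. This is a sketch under the assumption that a derivation is determined by how many times A(x) → A(xx) is applied before A(x) → x; the function name `macro_language` is illustrative.

```python
def macro_language(max_steps):
    """Strings derivable from S using A(x) -> A(xx) at most max_steps times."""
    results = []
    x = "0"                      # S -> A(0)
    for _ in range(max_steps + 1):
        results.append(x)        # A(x) -> x ends the derivation here
        x = x + x                # A(x) -> A(xx) doubles the parameter
    return results

print([len(w) for w in macro_language(4)])
```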
Natural Extension: Higher-Order Grammar [Damm82]
Level-n grammar is an n-th order grammar, in which the nonterminals are parameterized by at most (n-1)-th order entities
  S → B(λy.yy, 0), B(f,x) → B(λy.f(f(y)), xx), B(f,x) → f(1x1)
  2^(2^n) repetitions of “1 0^(2^n) 1”
= Simply-typed λ-calculus + strings as the base type + nondeterminism
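The level-2 example can be replayed with Python closures to see which words it derives. In this sketch, `n` counts the applications of B(f,x) → B(λy.f(f(y)), xx) before the derivation ends with B(f,x) → f(1x1); the function name `level2_word` is illustrative.

```python
def level2_word(n):
    """Word derived by applying the recursive B rule n times, then B(f,x) -> f(1x1)."""
    f = lambda y: y + y                        # S -> B(λy.yy, 0)
    x = "0"
    for _ in range(n):                         # B(f, x) -> B(λy.f(f(y)), xx)
        f = (lambda g: lambda y: g(g(y)))(f)   # bind the current f explicitly
        x = x + x
    return f("1" + x + "1")                    # B(f, x) -> f(1x1)

print(level2_word(0))
print(level2_word(1))
```

For each n the result is the block 1 0^(2^n) 1 repeated 2^(2^n) times, so the word length grows doubly exponentially in the number of derivation steps.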
Complexity?
Are Level-2, Level-3, … languages context-sensitive (= NSPACE(n)), or do they go beyond?
  [Damm82] Decidable
  [Maneth02] Call-by-Value Level-n languages are in DSPACE(n)
  [This Work] Call-by-Name Level-n languages are in DSPACE(n)!
Level-n Languages and MTTs [Damm82]
For any CbN level-n grammar G, there exists a composition of n+1 CbN MTTs such that [G] = range(τ1 ; τ2 ; … ; τ_{n+1})
  Intuition: MTTs can carry out substitution
Hence, giving a complexity upper bound for range(MTT*) also gives an upper bound for level-n languages
[Figure: nested language classes and their known complexity bounds]
  Context-Free Language: ⊆ PTIME [CYK, Earley, …]
  Level-1 Language = Macro Language (CFG where nonterminals are parameterized by other nonterminals): ⊆ NSPACE(n) [Aho68], NP-complete [Rounds73]
  Level-2 Language
  Level-3 Language
  Level-n Language [Damm82]
  range(MTT*): ⊆ DSPACE(n), NP-complete [Our Work]
  Context-Sensitive Language = NSPACE(n) [Kuroda64]
Brief Introduction to Macro Tree Transducers (MTTs)
MTT
An MTT M = (Q, start, Σ, Δ, R) is a set of first-order functions of type Tree(Σ) × Tree(Δ)^k → Tree(Δ)
  Defined by mutual induction on the 1st parameter (the input tree)
    Application in the right-hand side is restricted to the direct children of the current node
  Can take output tree fragments via other parameters, but is not allowed to inspect or decompose them

  start( A(x1) ) → double( x1, double(x1, E) )
  double( A(x1), y1 ) → double( x1, double(x1, y1) )
  double( B, y1 ) → F( y1, y1 )
  double( B, y1 ) → G( y1, y1 )

  RHS ::= F( RHS, … , RHS ) | q(xi, RHS, …, RHS) | yi

Nondeterminism!
Evaluation Strategy
Two evaluation strategies (analogous to the λ-calculus…)
  Call-by-Value (IO : Inside-Out)
  Call-by-Name (OI : Outside-In)
Example (IO / call-by-value)
start(A(B))
  → double( B, double(B, E) )
    → double( B, F(E, E) )
      → F( F(E,E), F(E,E) ) or G( F(E,E), F(E,E) )
    or → double( B, G(E, E) )
      → F( G(E, E), G(E, E) ) or G( G(E, E), G(E, E) )

Rules:
  double( A(x1), y1 ) → double( x1, double(x1, y1) )
  double( B, y1 ) → F( y1, y1 )
  double( B, y1 ) → G( y1, y1 )
Example (OI / call-by-name)
start(A(B))
  → double( B, double(B, E) )
    → F( double(B, E), double(B, E) )
      → F( F(E,E), double(B, E) )
        → F( F(E,E), F(E,E) )
        or → F( F(E,E), G(E,E) ) !!
      or → F( G(E,E), double(B, E) )
        → F( G(E,E), F(E,E) ) !!
        or → F( G(E,E), G(E,E) )
    or → G( double(B, E), double(B, E) )
      → G( F(E,E), double(B, E) )
        → G( F(E,E), F(E,E) )
        or → G( F(E,E), G(E,E) ) !!
      or → G( G(E,E), double(B, E) )
        → G( G(E,E), F(E,E) ) !!
        or → G( G(E,E), G(E,E) )
(!! marks outputs that IO evaluation cannot produce)

Rules:
  double( A(x1), y1 ) → double( x1, double(x1, y1) )
  double( B, y1 ) → F( y1, y1 )
  double( B, y1 ) → G( y1, y1 )
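The two strategies can be compared by computing both output sets for the `double` rules. This is a sketch, not a general MTT interpreter: it hard-codes the example's rules, encodes trees as nested tuples, and models a call-by-name argument as the set of trees it may evaluate to, which is an assumption that is adequate for this total example. All function names are illustrative.

```python
E = ("E",)

def double_io(s, y):
    """Call-by-value: y is a single, fully evaluated output tree."""
    if s[0] == "B":                          # double(B, y1) -> F(y1, y1) | G(y1, y1)
        return {("F", y, y), ("G", y, y)}
    x1 = s[1]                                # double(A(x1), y1) -> double(x1, double(x1, y1))
    return {t for inner in double_io(x1, y) for t in double_io(x1, inner)}

def double_oi(s, ys):
    """Call-by-name: ys is the set of trees the unevaluated argument may yield."""
    if s[0] == "B":                          # each copy of y1 is evaluated independently
        return {(c, left, right) for c in "FG" for left in ys for right in ys}
    x1 = s[1]
    return double_oi(x1, double_oi(x1, ys))

def start(s, strategy):
    """start(A(x1)) -> double(x1, double(x1, E)) under the given strategy."""
    x1 = s[1]
    if strategy == "IO":
        return {t for inner in double_io(x1, E) for t in double_io(x1, inner)}
    return double_oi(x1, double_oi(x1, {E}))

io = start(("A", ("B",)), "IO")
oi = start(("A", ("B",)), "OI")
print(len(io), len(oi))
```

On the input A(B), the IO set is a strict subset of the OI set; the extra OI outputs are exactly the trees whose two subtrees differ.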
Evaluation Strategy
Two strategies (analogous to the λ-calculus…)
  Call-by-Value (IO : Inside-Out)
  Call-by-Name (OI : Outside-In)
Today, we consider OI evaluation only. Why?
  MTT_IO* = MTT_OI* (even though MTT_IO ≠ MTT_OI)
  MTT_OI has better compositionality (as shown later)
Main Result: DSPACE(n) Membership of range(MTT*)
(or, how to apply the GFF to the range complexity)
≪Approach: Generate & Test≫
  Guess the input s0 and all the intermediate trees s1, …, s_{n-1}
  Check whether (s0,s1) ∈ τ1, (s1,s2) ∈ τ2, …, (s_{n-1}, t) ∈ τn
  If so, then t is in the output language!
  Otherwise, try another s0, s1, …, s_{n-1}
[Figure: s0 --τ1--> s1 --τ2--> s2 … s_{n-1} --τn--> t, with each step checked by “∈ ?”]
≪Approach: Generate & Test≫
In order to carry out the algorithm in DSPACE(|t|) …
  The sizes |s0|, |s1|, |s2|, …, |s_{n-1}| must be linearly bounded by |t|
    i.e., there must be a constant c independent of t such that |s_i| ≤ c|t|
  Each step to test the “translation membership” of τi must be done in DSPACE(n)
The “Garbage-Free Form” assures this property!
[Figure: s0 --τ1--> s1 --τ2--> s2 … s_{n-1} --τn--> t, with each step checked by “∈ ?”]
[Review] “Garbage-Free Form”
For any composition τ = τ1 ; τ2 ; … ; τn ⊆ TΣ×TΔ of MTTs, there exists an equivalent one τ = ρ0 ; ρ1 ; … ; ρ_{2n} such that
  dom(ρ0) = TΣ and range(ρ0) is regular
  For any (s0, t) ∈ τ, there are s1, …, s_{2n} such that
    (s_i, s_{i+1}) ∈ ρ_i, (s_{2n}, t) ∈ ρ_{2n}, |s_{i+1}| ≤ 2|s_{i+2}|, |s_{2n}| ≤ 2|t|
[Review] “Garbage-Free Form”
In each step of translation (except the first step), the size of the tree always becomes larger (ignoring the constant factor “2”)
[Figure: the chain s0 --τ1--> s1 --τ2--> s2 … s_{n-1} --τn--> t is replaced by s0 --ρ0--> s1 --ρ1--> s2 … s_{2n} --ρ_{2n}--> t]
“Translation Membership”
Let τ be an MTT.
For trees s and t, can we check whether (s,t) ∈ τ within O(|s|+|t|) space?
The answer is yes, but it is hard to prove directly.
Let us consider subclasses of MTTs…
In Search of a Good Subclass…
[Engelfriet&Vogler85] MTT ⊆ T ; LMTT
  Hence, range(MTT*) = range((T∪LMTT)*)
  T : MTTs without accumulating parameters
  LMTT : “linear” MTTs, in which each input variable occurs at most once in each right-hand side
T and LMTT are weak enough to show DSPACE(n) translation membership
But they are too weak to have the garbage-free form
Our Idea : Path-linear MTT
Introduce a new class “PLMTT”
  T∪LMTT ⊆ PLMTT
  Still weak enough to show DSPACE(n) translation membership
  Strong enough to have the GFF, as will be shown
“Path-linear” = on a path of nested applications, each variable occurs linearly
  Linear: f(x1, g(x2), h(x3))
  Path-linear (but not linear): f(x1, g(x2), h(x2))
  Not path-linear: f(x1, g(x1), h(x2))
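The three examples can be checked mechanically. The sketch below is one reading of the slide's informal definition, not the talk's formal one: a right-hand side is path-linear if no input variable is used by two state calls that are nested inside one another. The term encoding and the helper names `call`, `con`, `linear`, and `path_linear` are assumptions made for illustration.

```python
def call(state, var, *args):
    """State call q(x_i, args...), e.g. call("f", "x1", ...)."""
    return ("call", state, var, list(args))

def con(symbol, *args):
    """Output constructor F(args...); leaves are con("Y") and the like."""
    return ("con", symbol, list(args))

def input_vars(term):
    """All input variables used by state calls in the term, with repetitions."""
    if term[0] == "con":
        return [v for a in term[2] for v in input_vars(a)]
    return [term[2]] + [v for a in term[3] for v in input_vars(a)]

def linear(term):
    """Each input variable occurs at most once in the whole right-hand side."""
    vs = input_vars(term)
    return len(vs) == len(set(vs))

def path_linear(term, seen=frozenset()):
    """No input variable repeats along a chain of nested state calls."""
    if term[0] == "con":
        return all(path_linear(a, seen) for a in term[2])
    _, _, var, args = term
    if var in seen:
        return False
    return all(path_linear(a, seen | {var}) for a in args)

ex_linear = call("f", "x1", call("g", "x2"), call("h", "x3"))
ex_path_linear = call("f", "x1", call("g", "x2"), call("h", "x2"))
ex_neither = call("f", "x1", call("g", "x1"), call("h", "x2"))
for ex in (ex_linear, ex_path_linear, ex_neither):
    print(linear(ex), path_linear(ex))
```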
Linear Space “Translation Membership”
Lemma: Let τ ∈ PLMTT. For trees s and t, we can check whether (s, t) ∈ τ within O(|s|+|t|) space.
Proof: Basically, try all nondeterministic computations by backtracking. (Some clever stack-sharing is required.)
  Path-linearity assures that the length of the backtracking stack is O(|s|).
Key Theorem: the Garbage-Free Form of (PL)MTT*
Garbage-Free = No-deletion
In each step of translation (except the first step), the size of the tree always becomes larger
[Figure: the chain s0 --τ1--> s1 --τ2--> s2 … s_{n-1} --τn--> t is replaced by s0 --ρ0--> s1 --ρ1--> s2 … s_{2n} --ρ_{2n}--> t]
Proof Sketch of the Garbage-Free Form
“Factor out” the deletion
  τ1 ; τ2 == τ1 ; (D ; ρ2)     Decompose τ2 into a ‘deleting’ part D and a ‘nondeleting’ part ρ2
          == (τ1 ; D) ; ρ2     Associativity
          == τ’1 ; ρ2          Compose τ1 with D (right-compositionality of OI MTTs)
Three Types of Deletion
“Input-Deletion”
  E.g., f( A(x1, x2) ) → B( f(x1) )
  Discarding the “x2” subtree!
“Skipping”
  E.g., f( A(x1) ) → f(x1)
  No output is generated at the unary node A.
“Erasure”
  E.g., f( L, y1 ) → y1
  (Mainly at leaf nodes) No new output symbol is generated at the node.
Three Types of Deletion
If there is no input-deletion, skipping, or erasure during the computation, then |in| ≤ 2|out|
Intuition:
  No input-deletion ⇒ all nodes are visited
  No skipping ⇒ at least one output node for each input unary node
  No erasure ⇒ at least one output node for each input leaf node
  For any tree, 2*(#unary + #leaf) ≥ #nodes
    (cf. the number of matches in a knockout tournament)
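The counting inequality 2*(#unary + #leaf) ≥ #nodes is easy to spot-check. The sketch below assumes trees are nested tuples `(symbol, child, …)`; the names `counts` and `bound_holds` and the sample trees are illustrative.

```python
def counts(t):
    """Return (#nodes, #unary nodes, #leaf nodes) of a tuple-encoded tree."""
    children = t[1:]
    nodes = 1
    unary = 1 if len(children) == 1 else 0
    leaf = 1 if len(children) == 0 else 0
    for child in children:
        n, u, l = counts(child)
        nodes, unary, leaf = nodes + n, unary + u, leaf + l
    return nodes, unary, leaf

def bound_holds(t):
    """Check 2 * (#unary + #leaf) >= #nodes for the tree t."""
    nodes, unary, leaf = counts(t)
    return 2 * (unary + leaf) >= nodes

chain = ("b", ("b", ("b", ("c",))))                      # only unary nodes and one leaf
full = ("a", ("a", ("c",), ("c",)), ("a", ("c",), ("c",)))  # full binary tree
print(counts(chain), bound_holds(chain))
print(counts(full), bound_holds(full))
```

In the full binary tree the bound is nearly tight: 4 leaves give 2 * 4 = 8 against 7 nodes, which matches the knockout-tournament picture of one fewer match than players.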
How to eliminate “erasing” rules
Look-ahead + Inline-Expansion
Decompose τ into τ = E ; τne
  E : bottom-up translation that annotates each input tree with erasure information (e.g., “applying f to the 1st child invokes erasure”)
  τne : τ without erasing rules
How to eliminate “erasing” rules : τ = E ; τne
Example
τ:
  f( C, y1 ) → y1
  g( B(x1,x2) ) → X( f(x1, Y) )
[Figure: input tree (nodes A, B, C) and its image under E, in which B nodes are annotated “Applying f to the 1st child invokes erasure…”]
How to eliminate “erasing” rules : τ = E ; τne
τne:
  f( C, y1 ) → y1
  g( B(x1,x2) ) → X( Y )
[Figure: input tree (nodes A, B, C) and its image under E, in which B nodes are annotated “Applying f to the 1st child invokes erasure…”]
How to eliminate “erasing” rules : τ = E ; τne
More complex case
τ:
  f( C, y1 ) → y1
  h( B(x1,x2), y1 ) → f(x1, y1)
[Figure: input tree (nodes A, B, C) and its image under E, with nodes annotated “f(1st) erasure”, “h(1st) erasure”, and “h(2nd) erasure”]
Problem!
“Inline-Expansion” is not always possible for nondeterministic MTTs
τ:
  f( C, y1, y2 ) → y1
  f( C, y1, y2 ) → y2
  g( B(x1,x2) ) → h( x1, f(x1, Y, Z) )
  h( C, y1 ) → X( y1, y1 )
τne:
  g( B(x1,x2) ) → h( x1, Y )
  g( B(x1,x2) ) → h( x1, Z )
  h( C, y1 ) → X( y1, y1 )
Different translation!
  (Under OI, τ can produce X(Y, Z), but τne can only produce X(Y, Y) or X(Z, Z).)
Solution!
Extend MTTs with “inline-nondeterminism”
τ:
  f( C, y1, y2 ) → y1
  f( C, y1, y2 ) → y2
  g( B(x1,x2) ) → h( x1, f(x1, Y, Z) )
  h( C, y1 ) → X( y1, y1 )
τne:
  g( B(x1,x2) ) → h( x1, +(Y,Z) )
  h( C, y1 ) → X( y1, y1 )
Same translation!
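The effect of inline nondeterminism can be seen by computing the output sets of the `h` call in each variant. This sketch assumes OI semantics modelled as sets: the parameter y1 is the set of trees its argument may yield, and each occurrence of y1 picks from it independently. Leaves are plain strings and the function name `h` mirrors the slide's state; everything else is illustrative.

```python
from itertools import product

def h(ys):
    """h(C, y1) -> X(y1, y1) under OI: both copies of y1 choose independently."""
    return {("X", left, right) for left, right in product(ys, repeat=2)}

f_outputs = {"Y", "Z"}               # f(C, Y, Z) -> Y or Z

tau = h(f_outputs)                   # original rule: h(x1, f(x1, Y, Z))
naive = h({"Y"}) | h({"Z"})          # naive expansion: rules h(x1, Y) and h(x1, Z)
with_choice = h({"Y", "Z"})          # expansion with choice: h(x1, +(Y, Z))

print(sorted(tau))
print(sorted(naive))
print(tau == with_choice)
```

The naive expansion commits to Y or Z before the parameter is copied, which is why it loses the mixed outputs; the choice operator keeps the decision inside the argument.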
Solution: “MTT with Choice and Failure” (MTTCF)
MTT + inline-nondeterminism + inline-partiality
Same expressiveness: MTT = MTTCF
Much more flexible syntax
  Allows inline-expansion for free!

  RHS ::= F( RHS, … , RHS ) | q(xi, RHS, …, RHS) | yi | +(RHS, RHS) | θ
Note on Path-linearity
Inline-Expansion
  …does not preserve linearity.
  …does preserve path-linearity.
This is the reason for using PLMTTs.
τ:
  f( C, y1 ) → X( y1, y1 )
  g( B(x1,x2) ) → f( x1, h(x2, Y) )
τne:
  g( B(x1,x2) ) → X( h(x2, Y), h(x2, Y) )
How to eliminate “Input-Deletion”: τne = I ; τnei
Exhaustively try all deletions by using nondeterminism
[Figure: I maps the input tree (nodes A, B, C) nondeterministically to one of several trees over relabeled nodes A0, A1, B00, B01, B10, B11, …]
How to eliminate “Input-Deletion”: τne = I ; τnei
Example
τne:
  f( B(x1,x2) ) → g( x1, g(x1, Y) )
  f( B(x1,x2) ) → g( x1, g(x2, Y) )
  f( B(x1,x2) ) → g( x2, g(x1, Y) )
τnei:
  f( B10(x1) ) → g( x1, g(x1, Y) )
  f( B10(x1) ) → g( x1, θ )
  f( B10(x1) ) → g( x2, g(x1, Y) )
  …
How to eliminate “Skipping”: τnei = S ; τneis
Same as for “input-deletion”: try all possible deletions nondeterministically…
[Figure: S maps the input tree (nodes A, B, C) nondeterministically to one of several candidate trees]
[Review] Proof Sketch of the Garbage-Free Form
“Factor out” the deletion
  τ1 ; τ2 == τ1 ; (E;I;S ; ρ2)    Decompose τ2 into a ‘deleting’ part D = E;I;S and a ‘nondeleting’ part ρ2
          == (τ1 ; E;I;S) ; ρ2    Associativity
          == τ’1 ; ρ2             Compose τ1 with D (right-compositionality of OI MTTs with linear top-down transducers)
Summary
Membership problem of range(MTT*) is in DSPACE(n)
  Note: Almost the same proof shows that it is in NP
  Note: NP-hardness was already known [Rounds73]
Proof is by
  Garbage-Free Form
    For GFF, we needed a class that is robust w.r.t. inline-expansion
  Translation Membership
    For TrnMem, we needed a class with a certain kind of linearity
“Path-linear MTT with Choice and Failure”!
Related Work
range(T*) is in DSPACE(n) [Baker1978]
  T : the class of translations realizable by MTTs with no accumulating parameters
range(DtMTT*) is in DSPACE(n) [Maneth2002]
  DtMTT : the class of translations realizable by deterministic total MTTs
Other Topics in my Thesis
PTIME “Translation Membership” for IO MTTs
“Multi-Return Macro Tree Transducer”
  Nondeterministic IO MTTs have poor compositionality.
  I’ve proposed a slight extension of MTTs with better compositionality: DT ; MRMTT ; DtT ⊆ MRMTT
A proof that the translation called “twist” cannot be expressed by a single MTT
  No general techniques; tough work
  Solves several open conjectures on the strictness of inclusion between two classes