Multi Context-Free Tree Grammars and Multi-component Tree Adjoining Grammars

Joost Engelfriet (1), Andreas Maletti (2)

(1) LIACS, Leiden, The Netherlands
(2) Institute of Computer Science, Leipzig, Germany

Bordeaux, France, September 11, 2017


Motivation

Definition. A context-free grammar (N, Σ, S, R) is in Greibach normal form if each rule ρ ∈ R \ {S → ε} is of the form ρ = A → σ A_1 ⋯ A_n with σ ∈ Σ and A, A_1, …, A_n ∈ N.

Theorem [Greibach 1965]. Every CFG can be turned into an equivalent CFG in Greibach normal form.


Definition. A CFG (N, Σ, S, R) is lexicalized if occ_Σ(r) ≠ ∅ for each rule (A → r) ∈ R \ {S → ε}.

CFG in Greibach normal form is lexicalized

lexicographers (linguists) love lexicalized grammars

occurrence of lexical element in a rule is called anchor
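
The two definitions above can be checked mechanically. A minimal sketch, assuming rules are stored as (lhs, rhs) pairs with the rhs as a tuple of symbols; the toy grammar and all function names are hypothetical and not taken from the slides:

```python
# Hypothetical toy grammar (not from the slides).
NONTERMINALS = {"S", "A", "B"}
TERMINALS = {"a", "b"}

RULES = [
    ("S", ("a", "A", "B")),
    ("A", ("a",)),
    ("B", ("A", "b")),  # has an anchor, but is not of the Greibach shape
]


def is_exempt(lhs, rhs, start):
    """The rule S -> epsilon (empty right-hand side) is exempt in both definitions."""
    return lhs == start and rhs == ()


def is_lexicalized(rules, terminals, start="S"):
    """Every non-exempt rule contains at least one terminal (an anchor)."""
    return all(
        any(symbol in terminals for symbol in rhs)
        for lhs, rhs in rules
        if not is_exempt(lhs, rhs, start)
    )


def is_greibach(rules, terminals, nonterminals, start="S"):
    """Every non-exempt rule is one terminal followed by nonterminals only."""
    return all(
        len(rhs) >= 1
        and rhs[0] in terminals
        and all(symbol in nonterminals for symbol in rhs[1:])
        for lhs, rhs in rules
        if not is_exempt(lhs, rhs, start)
    )
```

On this toy grammar the first check succeeds and the second fails because of the rule B → A b, which illustrates that being lexicalized is weaker than being in Greibach normal form.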


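Parse tree of "We must bear in mind the Community as a whole":

(S (NP (PRP We))
   (VP (MD must)
       (VP (VB bear)
           (PP (IN in) (NP (NN mind)))
           (NP (NP (DT the) (NN Community))
               (PP (IN as) (NP (DT a) (NN whole)))))))

linguists nowadays care more about the parse tree than about the membership of its yield in the (string) language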

modern grammar formalisms generate tree and string languages
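
The relation between a parse tree and its yield can be sketched in a few lines. This assumes trees are nested tuples (label, child, ...) with leaves written as (label,); the encoding and the name tree_yield are illustrative choices, not part of the talk:

```python
# The parse tree from the slide, as nested tuples (label, child, ...).
PARSE_TREE = (
    "S",
    ("NP", ("PRP", ("We",))),
    (
        "VP",
        ("MD", ("must",)),
        (
            "VP",
            ("VB", ("bear",)),
            ("PP", ("IN", ("in",)), ("NP", ("NN", ("mind",)))),
            (
                "NP",
                ("NP", ("DT", ("the",)), ("NN", ("Community",))),
                ("PP", ("IN", ("as",)), ("NP", ("DT", ("a",)), ("NN", ("whole",)))),
            ),
        ),
    ),
)


def tree_yield(tree):
    """Return the leaf labels of the tree, read from left to right."""
    label, *children = tree
    if not children:
        return [label]
    return [leaf for child in children for leaf in tree_yield(child)]
```

Joining the result of tree_yield(PARSE_TREE) with spaces gives back the sentence, which is the string that a string-language view would keep while discarding the tree.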


Definition. For two tree grammars G and G′, of which G′ is lexicalized:

G′ weakly lexicalizes G if yield(L(G′)) = yield(L(G))

G′ strongly lexicalizes G if L(G′) = L(G)

tree language preserved under strong lexicalization

string language preserved under weak lexicalization

lifted to classes C and C′ as usual: C′-grammars strongly lexicalize C-grammars if for every G ∈ C there exists a lexicalized G′ ∈ C′ such that L(G′) = L(G)


Some results:

CFGs (local tree grammars) weakly lexicalize themselves [Greibach 1965]

Tree adjoining grammars (TAGs) strongly lexicalize CFGs [Joshi, Schabes 1997]

TAGs strongly lexicalize themselves [Joshi, Schabes 1997] (a claim contradicted by the next result)

TAGs do not strongly lexicalize themselves [Kuhlmann, Satta 2012]

Context-free tree grammars (CFTGs) strongly lexicalize TAGs and themselves [Maletti, Engelfriet 2013]


Contents

1 Motivation

2 Main notion

3 Lexicalization

4 Expressive Power


Main notion

Definition [Engelfriet, Maneth 1998; Kanazawa 2010]. A multiple context-free tree grammar (MCFTG) is a tuple G = (N, B, Σ, S, R) with

finite totally ordered ranked alphabet N (nonterminals)

partition B ⊆ P(N) of N (big nonterminals)

finite ranked alphabet Σ (terminals)

S ∈ N^(0) with {S} ∈ B (initial big nonterminal)

finite set R of rules of the form A → r with A ∈ B and N-linear forest r ∈ C_{N∪Σ}(X)^+ such that rk^+(r) = rk^+(A) and B saturates occ_N(r)

MCFTGs generalize (linear, nondeleting) CFTGs to multiple components

multiple components are synchronously applied to "synchronized" nonterminal occurrences
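
Some of the side conditions in this definition are easy to test on concrete data. The sketch below is only an illustration: the ranks and big nonterminals are read off the example grammar used later in the talk, variables x_i are encoded as the integer i, and the helper names are hypothetical. It checks that B is a partition of N and that one rule satisfies rk^+(r) = rk^+(A); the "N-linear" and "saturates" conditions are not checked.

```python
# Ranks of the nonterminals (as read off the example grammar of the talk).
RANK = {"S": 0, "A": 0, "C": 1, "C'": 1, "T1": 1, "T2": 0, "T3": 0}

# Big nonterminals, each listed in the total order on N.
BIG = [("S",), ("A",), ("C",), ("C'",), ("T1", "T2", "T3")]


def is_partition(big, nonterminals):
    """Blocks are nonempty, pairwise disjoint, and together cover N."""
    members = [n for block in big for n in block]
    return (
        all(len(block) > 0 for block in big)
        and len(members) == len(set(members))
        and set(members) == set(nonterminals)
    )


def rank_sequence(block, rank):
    """rk+ of a big nonterminal: the ranks of its members, in order."""
    return [rank[n] for n in block]


def variable_count(context):
    """Number of variable occurrences in a context (variables are ints)."""
    if isinstance(context, int):
        return 1
    return sum(variable_count(child) for child in context[1:])


# One rule of the example: (T1(x1), T2, T3) -> (γ(T1(τ(x1))), σ(T2, α), ν(T3)).
RULE_LHS = ("T1", "T2", "T3")
RULE_RHS = (
    ("γ", ("T1", ("τ", 1))),
    ("σ", ("T2",), ("α",)),
    ("ν", ("T3",)),
)
```

For this rule both rank_sequence(RULE_LHS, RANK) and the list of variable_count values over RULE_RHS are 1, 0, 0, matching the requirement that each component of the right-hand side has as many variables as the rank of the corresponding nonterminal.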


Nonterminals S, A, C, C′, T1, T2, T3; rules:

A → T1(σ(C(T2), T3))

C(x1) → σ(C(x1), C′(A))

C′(x1) → σ(C(x1), C′(A))

(T1(x1), T2, T3) → (γ(T1(τ(x1))), σ(T2, α), ν(T3))

S → γ(A)

C(x1) → x1

C′(x1) → x1

(T1(x1), T2, T3) → (x1, α, β)

(on the slides, the nonterminals that constitute a big nonterminal are connected by splines; here they are written as a tuple)


(T1(x1), T2, T3) → (γ(T1(τ(x1))), σ(T2, α), ν(T3))

nonterminals T1, T2, T3 with T1 < T2 < T3, terminals {γ, τ, σ, α, ν}

big nonterminal in lhs and rhs: {T1, T2, T3} of ranks 1, 0, 0

3 corresponding rhs contexts with 1, 0, 0 variables


Derivation:

A ⇒ T1(σ(C(T2), T3))
  ⇒ T1(σ(σ(C(T2), C′(A)), T3))
  ⇒ T1(σ(σ(T2, C′(A)), T3))
  ⇒ γ(T1(τ(σ(σ(σ(T2, α), C′(A)), ν(T3)))))
  ⇒ γ(τ(σ(σ(σ(α, α), C′(A)), ν(β))))
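
The derivation can be replayed with a small rewriting routine. This is a simplified sketch, not the general derivation relation: it assumes that every nonterminal of the applied big nonterminal occurs exactly once in the current tree (true in each step here), encodes variable x_i as the integer i, and uses hypothetical names throughout.

```python
def substitute(context, args):
    """Replace variable i in the context by args[i - 1]."""
    if isinstance(context, int):
        return args[context - 1]
    label, *children = context
    return (label, *(substitute(child, args) for child in children))


def apply_rule(tree, rule):
    """Rewrite all nodes whose label is a key of `rule` in one synchronized step.

    Nonterminals introduced by the rule itself are not rewritten again.
    """
    label, *children = tree
    rewritten = [apply_rule(child, rule) for child in children]
    if label in rule:
        return substitute(rule[label], rewritten)
    return (label, *rewritten)


def show(tree):
    """Render a tree as a term, e.g. σ(α, β)."""
    label, *children = tree
    if not children:
        return label
    return f"{label}({', '.join(show(child) for child in children)})"


# Rules of the example grammar; a rule maps each component to its context.
A_RULE = {"A": ("T1", ("σ", ("C", ("T2",)), ("T3",)))}
C_GROW = {"C": ("σ", ("C", 1), ("C'", ("A",)))}
C_STOP = {"C": 1}
T_GROW = {
    "T1": ("γ", ("T1", ("τ", 1))),
    "T2": ("σ", ("T2",), ("α",)),
    "T3": ("ν", ("T3",)),
}
T_STOP = {"T1": 1, "T2": ("α",), "T3": ("β",)}


def derive():
    """Return the sentential forms of the derivation shown above."""
    forms = [("A",)]
    for rule in (A_RULE, C_GROW, C_STOP, T_GROW, T_STOP):
        forms.append(apply_rule(forms[-1], rule))
    return forms
```

Printing show(t) for each t in derive() reproduces the five steps. The synchronized steps are the last two: T1, T2 and T3 are rewritten together, even though T2 and T3 sit inside the argument of T1.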


Definition. The tree language generated by the MCFTG G = (N, B, Σ, S, R) is

L(G) = {t ∈ T_Σ | S ⇒* t}


Lexicalization

Definition. A tree language L ⊆ T_Σ has finite ambiguity if for every w ∈ (Σ^(0))* the set

{t ∈ L | yield(t) = w} is finite

every string w has finitely many "parses" in L (i.e., finitely many tree representations that have w as yield)

property of the language, not the grammar (not to be confused with the similarly named notions for grammars)
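
A small counterexample may help to see what the definition excludes. The alphabet and language below are a hypothetical illustration, not taken from the slides: all trees consist of a chain of unary symbols γ above a single leaf α, so they all share the yield α.

```latex
\Sigma = \{\gamma^{(1)}, \alpha^{(0)}\}, \qquad
L = \{\gamma^{n}(\alpha) \mid n \geq 0\}, \qquad
\{t \in L \mid \mathrm{yield}(t) = \alpha\} = L \quad \text{(infinite)}
```

Hence this L does not have finite ambiguity, whereas any finite subset of L does.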


MCFTG G (the example grammar from the Main notion section): L(G) has finite ambiguity


Definition. An MCFTG (N, B, Σ, S, R) is lexicalized if occ_{Σ^(0)}(r) ≠ ∅ for every A → r ∈ R

each rule contains an anchor (from Σ^(0))

lexicalized MCFTGs generate tree languages with finite ambiguity
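
The anchor condition can be tested rule by rule. A minimal sketch, assuming right-hand sides are tuples of contexts in nested-tuple form with variable x_i encoded as the integer i; the rule table lists part of the example grammar from the talk (the C′ rules are omitted), and the helper names are hypothetical:

```python
# Nullary terminals of the example grammar; these are the possible anchors.
NULLARY_TERMINALS = {"α", "β"}

# Part of the example grammar: rule description -> tuple of rhs contexts.
RULES = {
    "S -> γ(A)": (("γ", ("A",)),),
    "A -> T1(σ(C(T2), T3))": (("T1", ("σ", ("C", ("T2",)), ("T3",))),),
    "C(x1) -> σ(C(x1), C'(A))": (("σ", ("C", 1), ("C'", ("A",))),),
    "C(x1) -> x1": (1,),
    "(T1(x1), T2, T3) -> (γ(T1(τ(x1))), σ(T2, α), ν(T3))": (
        ("γ", ("T1", ("τ", 1))),
        ("σ", ("T2",), ("α",)),
        ("ν", ("T3",)),
    ),
    "(T1(x1), T2, T3) -> (x1, α, β)": (1, ("α",), ("β",)),
}


def anchors(context):
    """List the occurrences of nullary terminals in a context."""
    if isinstance(context, int):
        return []
    label, *children = context
    found = [label] if not children and label in NULLARY_TERMINALS else []
    for child in children:
        found += anchors(child)
    return found


def is_lexicalized_rule(rhs):
    """A rule is lexicalized if some component of its rhs contains an anchor."""
    return any(anchors(context) for context in rhs)


def non_lexicalized(rules):
    """Return the descriptions of the rules without an anchor."""
    return [name for name, rhs in rules.items() if not is_lexicalized_rule(rhs)]
```

Under this encoding only the two rules for the big nonterminal {T1, T2, T3} carry anchors, so the example grammar itself is not lexicalized even though its language has finite ambiguity.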


Theorem [MCFTGs strongly lexicalize themselves]. For every MCFTG G it is decidable whether L(G) has finite ambiguity and, if so, an equivalent lexicalized MCFTG can be constructed.

multiplicity remains the same (multiplicity = maximal cardinality of big nonterminals)

width increases at most by 1 (width = maximal rank of nonterminals)

derivation trees are even related by means of linear deterministic top-down tree transducers with regular look-ahead
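
The two parameters named in the remarks are simple maxima. A short sketch, using the ranks and big nonterminals of the example grammar from the talk as input (the encoding and function names are assumptions made for illustration):

```python
# Ranks and big nonterminals of the example grammar.
RANK = {"S": 0, "A": 0, "C": 1, "C'": 1, "T1": 1, "T2": 0, "T3": 0}
BIG = [("S",), ("A",), ("C",), ("C'",), ("T1", "T2", "T3")]


def multiplicity(big):
    """Maximal cardinality of a big nonterminal."""
    return max(len(block) for block in big)


def width(rank):
    """Maximal rank of a nonterminal."""
    return max(rank.values())
```

For the example grammar, multiplicity(BIG) is 3 (from {T1, T2, T3}) and width(RANK) is 1, so the theorem allows a lexicalized equivalent of multiplicity 3 and width at most 2.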


Lexicalization approach:

normalize terminal rules to contain at least 2 anchors

normalize unary rules to contain at least 1 anchor

guess-and-verify strategy for remaining rules

[Figure: derivation tree (of another MCFTG) whose nodes are labeled by the rules ρ0, ρ1, ρ2, ρ4, ρ′4, ρ5, ρ′5, ρ6, ρ′6, ρ7, ρ8, ρ9, annotated with the anchors α and β]


Extraction (verification) of lexical symbol

Original rule: (T1(x1), T2, T3) → (x1, α, β)

Constructed rule: (T1(x1), T2^α(x1), T3) → (x1, x1, β)

Guess of lexical symbol (lexicalizing the rule)

Original rule: A → T1(σ(C(T2), T3))

Constructed rule: A → T1(σ(C(T2^α(α)), T3))
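
To see that guess and extraction fit together, one can compare a two-step derivation in both versions. The derivations below are a worked illustration built only from the four rules shown above; in each line the second step applies the rule for the big nonterminal containing T1.

```latex
\begin{align*}
\text{original:}\quad
  & A \Rightarrow T_1(\sigma(C(T_2), T_3)) \Rightarrow \sigma(C(\alpha), \beta) \\
\text{constructed:}\quad
  & A \Rightarrow T_1(\sigma(C(T_2^{\alpha}(\alpha)), T_3)) \Rightarrow \sigma(C(\alpha), \beta)
\end{align*}
```

Both derivations reach the same tree. In the constructed version the anchor α is produced by the rule for A (the guess) and merely passed through by the rule for T2^α (the verification), so the rule for A has gained an anchor.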


Expressive Power

Definition. A context c ∈ C_{N∪Σ}(X_k) with k variables is footed if k = 0 or there is a subtree of the form σ(x1, …, xk)

A rule A → r is footed if all contexts in r are footed

An MCFTG (N, B, Σ, S, R) is a multi-component tree adjoining grammar (MC-TAG) if all the rules of R are footed.

Non-footed rule: A(x1, x2) → σ(γ(σ(x1, α)), x2)

Footed rule: A(x1, x2) → σ(γ(σ(x1, x2)), α)
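
The footedness test is a search for a node whose children are exactly x1, …, xk in this order. A minimal sketch, assuming contexts are nested tuples with variable x_i encoded as the integer i, and reading the σ in the definition as a terminal symbol (this reading, the terminal set, and the function names are assumptions):

```python
# Terminals used in the two example rules.
TERMINALS = {"σ", "γ", "α"}


def has_foot(context, k, terminals):
    """True if some terminal-labeled node has exactly the children x1, ..., xk."""
    if isinstance(context, int):
        return False
    label, *children = context
    if label in terminals and children == list(range(1, k + 1)):
        return True
    return any(has_foot(child, k, terminals) for child in children)


def is_footed(context, k, terminals=TERMINALS):
    """A context with k variables is footed if k = 0 or it contains a foot node."""
    return k == 0 or has_foot(context, k, terminals)


# Right-hand sides of the two example rules for A(x1, x2).
NON_FOOTED = ("σ", ("γ", ("σ", 1, ("α",))), 2)
FOOTED = ("σ", ("γ", ("σ", 1, 2)), ("α",))
```

With these inputs is_footed(NON_FOOTED, 2) is False and is_footed(FOOTED, 2) is True. Note that a context consisting of the bare variable x1 is not footed under this definition, since it has no node above the variable.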


Theorem. For every MCFTG G there exists an equivalent MC-TAG G′

footed normal form for MCFTGs

footed CFTGs as expressive as TAGs [Kepser, Rogers 2011]

result also true for strict MC-TAG (our notion of MC-TAG is essentially "non-strict MC-TAG")

if MCFTG G lexicalized, then so is MC-TAG G′


Proof idea:

Decompose context into footed contexts:

[Figure: an original rule with left-hand side C(x1, x2) and the constructed rule, in which the right-hand side context is decomposed into footed contexts]


Adjust “calls” appropriately:

[Figure: original rhs of a rule (left) and constructed rhs of the rule (right)]


Corollary [MC-TAGs strongly lexicalize themselves]. For every MC-TAG G it is decidable whether L(G) has finite ambiguity and, if so, an equivalent lexicalized MC-TAG can be constructed.


Key points:

MCFTGs and MC-TAGs equally expressive

both allow strong lexicalization

Thank you for your attention.

Full version available on arXiv
