Parsing - Context-Free Grammars (CFG)3 CFG normal forms Chomsky Normal Form Greibach Normal Form 2...
Transcript of Parsing - Context-Free Grammars (CFG)3 CFG normal forms Chomsky Normal Form Greibach Normal Form 2...
ParsingContext-Free Grammars (CFG)
Laura Kallmeyer
Heinrich-Heine-Universitat Dusseldorf
Summer 2019
1 / 32
Table of contents
1 Context-Free Grammars
2 Simplifying CFGsRemoving useless symbolsEliminating ε-rulesRemoving unary rules
3 CFG normal formsChomsky Normal FormGreibach Normal Form
2
Grammar Formalisms (again)
Type 1/2/3 grammarsA type 0 grammar is called
context-sensitive (or of type 1) if for all productions α → β,|α| ≤ |β| holds. �e only exception is S → ε which is allowedif S does not appear in any righthand side.
context-free (or of type 2) if for all productions α → β, α ∈N .regular (or of type 3) if for all productions α→ β, α ∈ N andβ ∈ T∗ or β = β′X with β′ ∈ T∗, X ∈ N .
�e type 1/2/3 languages are the languages generated by thecorresponding grammars.
�e hierarchy of the type 0, 1, 2 and 3 languages is called theChomsky Hierarchy.
3
Grammar Formalisms (again)
Type 1/2/3 grammarsA type 0 grammar is called
context-sensitive (or of type 1) if for all productions α → β,|α| ≤ |β| holds. �e only exception is S → ε which is allowedif S does not appear in any righthand side.context-free (or of type 2) if for all productions α → β, α ∈N .
regular (or of type 3) if for all productions α→ β, α ∈ N andβ ∈ T∗ or β = β′X with β′ ∈ T∗, X ∈ N .
�e type 1/2/3 languages are the languages generated by thecorresponding grammars.
�e hierarchy of the type 0, 1, 2 and 3 languages is called theChomsky Hierarchy.
3
Grammar Formalisms (again)
Type 1/2/3 grammarsA type 0 grammar is called
context-sensitive (or of type 1) if for all productions α → β,|α| ≤ |β| holds. �e only exception is S → ε which is allowedif S does not appear in any righthand side.context-free (or of type 2) if for all productions α → β, α ∈N .regular (or of type 3) if for all productions α→ β, α ∈ N andβ ∈ T∗ or β = β′X with β′ ∈ T∗, X ∈ N .
�e type 1/2/3 languages are the languages generated by thecorresponding grammars.
�e hierarchy of the type 0, 1, 2 and 3 languages is called theChomsky Hierarchy.
3
Grammar Formalisms (again)
Type 1/2/3 grammarsA type 0 grammar is called
context-sensitive (or of type 1) if for all productions α → β,|α| ≤ |β| holds. �e only exception is S → ε which is allowedif S does not appear in any righthand side.context-free (or of type 2) if for all productions α → β, α ∈N .regular (or of type 3) if for all productions α→ β, α ∈ N andβ ∈ T∗ or β = β′X with β′ ∈ T∗, X ∈ N .
�e type 1/2/3 languages are the languages generated by thecorresponding grammars.
�e hierarchy of the type 0, 1, 2 and 3 languages is called theChomsky Hierarchy.
3
Grammar Formalisms (again)
Type 1/2/3 grammarsA type 0 grammar is called
context-sensitive (or of type 1) if for all productions α → β,|α| ≤ |β| holds. �e only exception is S → ε which is allowedif S does not appear in any righthand side.context-free (or of type 2) if for all productions α → β, α ∈N .regular (or of type 3) if for all productions α→ β, α ∈ N andβ ∈ T∗ or β = β′X with β′ ∈ T∗, X ∈ N .
�e type 1/2/3 languages are the languages generated by thecorresponding grammars.
�e hierarchy of the type 0, 1, 2 and 3 languages is called theChomsky Hierarchy.
3
Grammar Formalisms (again)
Type 3 grammarGrammar for L(das (rote|grune)∗ Auto (von O�o| ε))
NP→ Det N1 N1→ rote N1 N1→ grune N1 N1→ AutoN1→ Auto PP PP→ von N2 N2→ O�o
Type 2 grammarGrammar for {anbm(cd)nd | n,m ≥ 0}S→ T d T→ a T cd T→ U U→ b U U→ ε
Type 1 grammar
Grammar for {ww |w ∈ {a, b}+}
S→ a S A S→ b S B S→ Af A S→ Bf Ba A→ A a b A→ A b a B→ B a b B→ B bAf A→ Af a Bf B→ Bf b Af B→ Af b Bf A→ Bf aAf → a Bf → b
4
Grammar Formalisms (again)
Type 3 grammarGrammar for L(das (rote|grune)∗ Auto (von O�o| ε))
NP→ Det N1 N1→ rote N1 N1→ grune N1 N1→ AutoN1→ Auto PP PP→ von N2 N2→ O�o
Type 2 grammarGrammar for {anbm(cd)nd | n,m ≥ 0}S→ T d T→ a T cd T→ U U→ b U U→ ε
Type 1 grammar
Grammar for {ww |w ∈ {a, b}+}
S→ a S A S→ b S B S→ Af A S→ Bf Ba A→ A a b A→ A b a B→ B a b B→ B bAf A→ Af a Bf B→ Bf b Af B→ Af b Bf A→ Bf aAf → a Bf → b
4
Grammar Formalisms (again)
Type 3 grammarGrammar for L(das (rote|grune)∗ Auto (von O�o| ε))
NP→ Det N1 N1→ rote N1 N1→ grune N1 N1→ AutoN1→ Auto PP PP→ von N2 N2→ O�o
Type 2 grammarGrammar for {anbm(cd)nd | n,m ≥ 0}S→ T d T→ a T cd T→ U U→ b U U→ ε
Type 1 grammar
Grammar for {ww |w ∈ {a, b}+}
S→ a S A S→ b S B S→ Af A S→ Bf Ba A→ A a b A→ A b a B→ B a b B→ B bAf A→ Af a Bf B→ Bf b Af B→ Af b Bf A→ Bf aAf → a Bf → b
4
CFG
CFGA context-free grammar (CFG) is a tuple G = 〈N , T , P, S〉 suchthat:
N and T are disjoint alphabets, the nonterminals and terminalsS ∈ N is the start symbolP is a set of productions of the form A → β with A ∈ N , β ∈(N ∪ T)∗
Any β ∈ (N ∪ T)∗ with S ∗⇒ β is called a sentential form.
5
CFG
CFGA context-free grammar (CFG) is a tuple G = 〈N , T , P, S〉 suchthat:
N and T are disjoint alphabets, the nonterminals and terminalsS ∈ N is the start symbolP is a set of productions of the form A → β with A ∈ N , β ∈(N ∪ T)∗
Any β ∈ (N ∪ T)∗ with S ∗⇒ β is called a sentential form.
5
CFG
CFG parse treeA tree t is a parse tree for a CFG G = 〈N , T , P, S〉 i�
each node in t is labeled with an α ∈ N ∪ T ∪ {ε}the root label is Sif there is a node with label A that has n daughters labeled(from le� to right) α1, . . . , αn, then A→ α1 . . . αn ∈ P
if a node has label ε, it is a leaf and the unique daughter of itsmother node
S ∗⇒ β in G i� there is a parse tree for G with yield β.
6
CFG
CFG parse treeA tree t is a parse tree for a CFG G = 〈N , T , P, S〉 i�
each node in t is labeled with an α ∈ N ∪ T ∪ {ε}the root label is Sif there is a node with label A that has n daughters labeled(from le� to right) α1, . . . , αn, then A→ α1 . . . αn ∈ P
if a node has label ε, it is a leaf and the unique daughter of itsmother node
S ∗⇒ β in G i� there is a parse tree for G with yield β.
6
CFG
Languages generated by a CFGLet G = 〈N , T , P, S〉 be a CFG
�e tree language is the set of all parse trees with root label Sand all leaves labelled with a ∈ T ∪ {ε}.�e string language L(G) of G is the set {w ∈ T∗ | S ∗⇒ w}where
1 for w,w′ ∈ (N ∪ T)∗: w ⇒ w′ i� there is a A→ β ∈ P and thereare v, u ∈ (N ∪ T)∗ such that w = vAu and w′ = vβu.
2∗⇒ is the re�exive transitive closure of⇒.
A derivation of a word w ∈ T∗ is a sequence
S ⇒ α1 ⇒ α2 ⇒ · · · ⇒ w
of derivation steps leading to w.
7
CFG
Languages generated by a CFGLet G = 〈N , T , P, S〉 be a CFG
�e tree language is the set of all parse trees with root label Sand all leaves labelled with a ∈ T ∪ {ε}.
�e string language L(G) of G is the set {w ∈ T∗ | S ∗⇒ w}where
1 for w,w′ ∈ (N ∪ T)∗: w ⇒ w′ i� there is a A→ β ∈ P and thereare v, u ∈ (N ∪ T)∗ such that w = vAu and w′ = vβu.
2∗⇒ is the re�exive transitive closure of⇒.
A derivation of a word w ∈ T∗ is a sequence
S ⇒ α1 ⇒ α2 ⇒ · · · ⇒ w
of derivation steps leading to w.
7
CFG
Languages generated by a CFGLet G = 〈N , T , P, S〉 be a CFG
�e tree language is the set of all parse trees with root label Sand all leaves labelled with a ∈ T ∪ {ε}.�e string language L(G) of G is the set {w ∈ T∗ | S ∗⇒ w}where
1 for w,w′ ∈ (N ∪ T)∗: w ⇒ w′ i� there is a A→ β ∈ P and thereare v, u ∈ (N ∪ T)∗ such that w = vAu and w′ = vβu.
2∗⇒ is the re�exive transitive closure of⇒.
A derivation of a word w ∈ T∗ is a sequence
S ⇒ α1 ⇒ α2 ⇒ · · · ⇒ w
of derivation steps leading to w.
7
CFG
Languages generated by a CFGLet G = 〈N , T , P, S〉 be a CFG
�e tree language is the set of all parse trees with root label Sand all leaves labelled with a ∈ T ∪ {ε}.�e string language L(G) of G is the set {w ∈ T∗ | S ∗⇒ w}where
1 for w,w′ ∈ (N ∪ T)∗: w ⇒ w′ i� there is a A→ β ∈ P and thereare v, u ∈ (N ∪ T)∗ such that w = vAu and w′ = vβu.
2∗⇒ is the re�exive transitive closure of⇒.
A derivation of a word w ∈ T∗ is a sequence
S ⇒ α1 ⇒ α2 ⇒ · · · ⇒ w
of derivation steps leading to w.
7
CFG
Languages generated by a CFGLet G = 〈N , T , P, S〉 be a CFG
�e tree language is the set of all parse trees with root label Sand all leaves labelled with a ∈ T ∪ {ε}.�e string language L(G) of G is the set {w ∈ T∗ | S ∗⇒ w}where
1 for w,w′ ∈ (N ∪ T)∗: w ⇒ w′ i� there is a A→ β ∈ P and thereare v, u ∈ (N ∪ T)∗ such that w = vAu and w′ = vβu.
2∗⇒ is the re�exive transitive closure of⇒.
A derivation of a word w ∈ T∗ is a sequence
S ⇒ α1 ⇒ α2 ⇒ · · · ⇒ w
of derivation steps leading to w.
7
CFG
Languages generated by a CFGLet G = 〈N , T , P, S〉 be a CFG
�e tree language is the set of all parse trees with root label Sand all leaves labelled with a ∈ T ∪ {ε}.�e string language L(G) of G is the set {w ∈ T∗ | S ∗⇒ w}where
1 for w,w′ ∈ (N ∪ T)∗: w ⇒ w′ i� there is a A→ β ∈ P and thereare v, u ∈ (N ∪ T)∗ such that w = vAu and w′ = vβu.
2∗⇒ is the re�exive transitive closure of⇒.
A derivation of a word w ∈ T∗ is a sequence
S ⇒ α1 ⇒ α2 ⇒ · · · ⇒ w
of derivation steps leading to w.
7
CFG
For a single parse tree, there might be more than one correspondingderivation.
DerivationsCFG Ga,b = 〈{S,A,B}, {a, b}, P, S〉 with productions
S → AB |BA A→ a | aS | bAA B→ b | bS | aBB
(�is CFG generates the language {w ∈ {a, b}+ | |w|a = |w|b}.)
Input w = ab.
�e two derivations for w areS ⇒ AB⇒ aB⇒ ab and S ⇒ AB⇒ Ab⇒ ab
(|w|a gives the number of occurrences of a in w.)
8
CFG
Le�most and rightmost derivationsA derivation is called a
le�most derivation i�, in each derivation step, a productionis applied to the le�most non-terminal of the already derivedsentential form
rightmost derivation i�, in each derivation step, a productionis applied to the rightmost non-terminal of the already derivedsentential form
In the preceding example, the �rst derivation was a le�mostderivation and the second a rightmost derivation.
9
CFG
Le�most and rightmost derivationsA derivation is called a
le�most derivation i�, in each derivation step, a productionis applied to the le�most non-terminal of the already derivedsentential form
rightmost derivation i�, in each derivation step, a productionis applied to the rightmost non-terminal of the already derivedsentential form
In the preceding example, the �rst derivation was a le�mostderivation and the second a rightmost derivation.
9
CFG
For a single word w, there might even be more than one parse tree.
Ambiguous grammarsA CFG giving more than one parse tree for some word w is calledambiguous.
Ambiguous grammarConsider again Ga,b (S → AB |BA, A→ a | aS | bAA, B→ b | bS | aBB)
�e two parse treesfor aabb are
S
B
B
b
B
b
a
A
a
S
B
b
A
S
B
b
A
a
a
10
CFG
For a single word w, there might even be more than one parse tree.
Ambiguous grammarsA CFG giving more than one parse tree for some word w is calledambiguous.
Ambiguous grammarConsider again Ga,b (S → AB |BA, A→ a | aS | bAA, B→ b | bS | aBB)
�e two parse treesfor aabb are
S
B
B
b
B
b
a
A
a
S
B
b
A
S
B
b
A
a
a
10
CFG
For a single word w, there might even be more than one parse tree.
Ambiguous grammarsA CFG giving more than one parse tree for some word w is calledambiguous.
Ambiguous grammarConsider again Ga,b (S → AB |BA, A→ a | aS | bAA, B→ b | bS | aBB)
�e two parse treesfor aabb are
S
B
B
b
B
b
a
A
a
S
B
b
A
S
B
b
A
a
a
10
CFG
Some languages are such that their structure is necessarily ambiguous.Natural languages are probably such cases.
Inherently ambiguousA CFL L is called inherently ambiguous if each CFG G with L =L(G) is ambiguous.
Inherently ambiguous language
L = {anbncmdm | n ≥ 1,m ≥ 1} ∪ {anbmcmdn | n ≥ 1,m ≥ 1}
For words of the form akbkckdk one cannot tell wich of the twopa�erns is the right structure. Both are possible.
11
CFG
Some languages are such that their structure is necessarily ambiguous.Natural languages are probably such cases.
Inherently ambiguousA CFL L is called inherently ambiguous if each CFG G with L =L(G) is ambiguous.
Inherently ambiguous language
L = {anbncmdm | n ≥ 1,m ≥ 1} ∪ {anbmcmdn | n ≥ 1,m ≥ 1}
For words of the form akbkckdk one cannot tell wich of the twopa�erns is the right structure. Both are possible.
11
Removing useless symbols
An important grammar clean-up one has to do sometimes is theremoval of symbols (non-terminals or terminals) that cannot occur inany derivation of a word in the string language.
Useful/useless symbolsLet G = 〈N , T , P, S〉 be a CFG. An X ∈ N ∪ T is called
useful if there is a derivation S ∗⇒ αXβ ∗⇒ w with w ∈ T∗
useless otherwise
12
Removing useless symbols
An important grammar clean-up one has to do sometimes is theremoval of symbols (non-terminals or terminals) that cannot occur inany derivation of a word in the string language.
Useful/useless symbolsLet G = 〈N , T , P, S〉 be a CFG. An X ∈ N ∪ T is called
useful if there is a derivation S ∗⇒ αXβ ∗⇒ w with w ∈ T∗
useless otherwise
12
Removing useless symbols
Non-terminals X can be useless because they do not allow toderive a terminal string, i.e., there is no derivation X ∗⇒ w withw ∈ T∗.Non-terminals and terminals X can be useless because theycannot be reached from the start symbol, i.e., there is no deriva-tion S ∗⇒ αX .
Useless symbols1 G = 〈{S,A,B,C}, {a, b, c}, P, S〉 with
P = {S → AB | aS,A→ aaAc | c,B→ aCc}Useless: B and C since from them one cannot derive any termi-nal string.
2 G = 〈{S,A,B,C}, {a, b, c}, P, S〉 withP = {S → AB | aS,A→ aaAc | c,B→ ac,C → b}Useless: C and b since they are not reachable from S.
13
Removing useless symbols
Non-terminals X can be useless because they do not allow toderive a terminal string, i.e., there is no derivation X ∗⇒ w withw ∈ T∗.Non-terminals and terminals X can be useless because theycannot be reached from the start symbol, i.e., there is no deriva-tion S ∗⇒ αX .
Useless symbols1 G = 〈{S,A,B,C}, {a, b, c}, P, S〉 with
P = {S → AB | aS,A→ aaAc | c,B→ aCc}Useless: B and C since from them one cannot derive any termi-nal string.
2 G = 〈{S,A,B,C}, {a, b, c}, P, S〉 withP = {S → AB | aS,A→ aaAc | c,B→ ac,C → b}Useless: C and b since they are not reachable from S.
13
Removing useless symbols
To eliminate all useless symbols, two things need to be done.
1 All X ∈ N need to be eliminated that cannot lead to a terminalsequence.�is can be done recursively: Starting from the terminals andfollowing the productions from right to le�, the set of all sym-bols leading to terminals can be computed recursively.Productions containing symbols that are not in this set areeliminated.
2 In the resulting CFG, the unreachable symbols need to be elimi-nated.�is is done starting from S and applying productions. Eachtime, the symbols from the right-hand sides are added.Again, productions containing non-terminals or terminals thatare not in the set are eliminated.
14
Removing useless symbols
To eliminate all useless symbols, two things need to be done.1 All X ∈ N need to be eliminated that cannot lead to a terminal
sequence.�is can be done recursively: Starting from the terminals andfollowing the productions from right to le�, the set of all sym-bols leading to terminals can be computed recursively.Productions containing symbols that are not in this set areeliminated.
2 In the resulting CFG, the unreachable symbols need to be elimi-nated.�is is done starting from S and applying productions. Eachtime, the symbols from the right-hand sides are added.Again, productions containing non-terminals or terminals thatare not in the set are eliminated.
14
Removing useless symbols
To eliminate all useless symbols, two things need to be done.1 All X ∈ N need to be eliminated that cannot lead to a terminal
sequence.�is can be done recursively: Starting from the terminals andfollowing the productions from right to le�, the set of all sym-bols leading to terminals can be computed recursively.Productions containing symbols that are not in this set areeliminated.
2 In the resulting CFG, the unreachable symbols need to be elimi-nated.�is is done starting from S and applying productions. Eachtime, the symbols from the right-hand sides are added.Again, productions containing non-terminals or terminals thatare not in the set are eliminated.
14
Removing useless symbols
Removing useless symbolsG = 〈{S,A,B,C,D, E}, {a, b, c}, P, S〉 withP = {S → ABC |AB,A→ a | acC,B→ bb |CBB,C → D, E → B}
Nonterminals that can lead to sequences of terminals: {A,B, E, S}New set of productions a�er removal of C,D:S → AB,A→ a,B→ bb, E → B
Symbols that can be reached from the start symbol: {S,A,B, a, b}new set of productions over only this set S → AB,A→ a,B→ bb
Resultig CFG: 〈{A,B, S}, {a, b}, {S → AB,A→ a,B→ bb}, S〉
15
Removing useless symbols
Removing useless symbolsG = 〈{S,A,B,C,D, E}, {a, b, c}, P, S〉 withP = {S → ABC |AB,A→ a | acC,B→ bb |CBB,C → D, E → B}
Nonterminals that can lead to sequences of terminals: {A,B, E, S}New set of productions a�er removal of C,D:S → AB,A→ a,B→ bb, E → B
Symbols that can be reached from the start symbol: {S,A,B, a, b}new set of productions over only this set S → AB,A→ a,B→ bb
Resultig CFG: 〈{A,B, S}, {a, b}, {S → AB,A→ a,B→ bb}, S〉
15
Removing useless symbols
Removing useless symbolsG = 〈{S,A,B,C,D, E}, {a, b, c}, P, S〉 withP = {S → ABC |AB,A→ a | acC,B→ bb |CBB,C → D, E → B}
Nonterminals that can lead to sequences of terminals: {A,B, E, S}New set of productions a�er removal of C,D:S → AB,A→ a,B→ bb, E → B
Symbols that can be reached from the start symbol: {S,A,B, a, b}new set of productions over only this set S → AB,A→ a,B→ bb
Resultig CFG: 〈{A,B, S}, {a, b}, {S → AB,A→ a,B→ bb}, S〉
15
Removing useless symbols
Removing useless symbolsG = 〈{S,A,B,C,D, E}, {a, b, c}, P, S〉 withP = {S → ABC |AB,A→ a | acC,B→ bb |CBB,C → D, E → B}
Nonterminals that can lead to sequences of terminals: {A,B, E, S}New set of productions a�er removal of C,D:S → AB,A→ a,B→ bb, E → B
Symbols that can be reached from the start symbol: {S,A,B, a, b}new set of productions over only this set S → AB,A→ a,B→ bb
Resultig CFG: 〈{A,B, S}, {a, b}, {S → AB,A→ a,B→ bb}, S〉
15
Eliminating ε-rules
Let G = 〈N , T , P, S〉. A production of the form A→ ε is called anε-production.
�e following holds:
For each CFG G, there is a CFG G′ without ε-productions such thatL(G′) = L(G) \ {ε}.
Removing ε-productionsG = 〈{S, T}, {a, b, c, d}, P, S〉 withP = {S → aSb | aTb, T → cTd | ε}
Equivalent ε-free CFG:G = 〈{S, T}, {a, b, c, d}, P, S〉 withP = {S → aSb | aTb | ab, T → cTd | cd}
16
Eliminating ε-rules
In order to eliminate ε-productions, we
Compute the set Nε = {A |A∗⇒ ε} recursively
1 Nε := {A ∈ N |A⇒ ε}2 For all A with A→ α, α ∈ N∗ε : add A to Nε
3 Repeat 2. until Nε does not change any more
Delete the ε-productions and for each A→ X1 . . .Xn:add all productions one can obtain by deleting some Xj ∈ Nε
from the right-hand side as long as one does not delete allX1, . . . ,Xn.
17
Eliminating ε-rules
In order to eliminate ε-productions, weCompute the set Nε = {A |A
∗⇒ ε} recursively1 Nε := {A ∈ N |A⇒ ε}2 For all A with A→ α, α ∈ N∗ε : add A to Nε
3 Repeat 2. until Nε does not change any more
Delete the ε-productions and for each A→ X1 . . .Xn:add all productions one can obtain by deleting some Xj ∈ Nε
from the right-hand side as long as one does not delete allX1, . . . ,Xn.
17
Eliminating ε-rules
In order to eliminate ε-productions, weCompute the set Nε = {A |A
∗⇒ ε} recursively1 Nε := {A ∈ N |A⇒ ε}2 For all A with A→ α, α ∈ N∗ε : add A to Nε
3 Repeat 2. until Nε does not change any more
Delete the ε-productions and for each A→ X1 . . .Xn:add all productions one can obtain by deleting some Xj ∈ Nε
from the right-hand side as long as one does not delete allX1, . . . ,Xn.
17
Removing unary rules
Let G = 〈N , T , P, S〉. For A, B ∈ N , a production of the form A→ B iscalled a unary production
For each CFL that does not contain ε-rules, a CFG without unaryproductions can be found.
Removing unary rulesG = 〈{S, T}, {a, b, c, d}, P, S〉 with P = {S → aSb | T , T → cTd | cd}
Equivalent CFG without unary rules: G = 〈{S, T}, {a, b, c, d}, P, S〉with P = {S → aSb | cTd | cd, T → cTd | cd}
Elimination of unary productions for a CFG without ε-productions:1 For all A ∗⇒ B and all B→ β, β /∈ N :
add A→ β to the set of productions2 Delete all unary productions
18
Removing unary rules
Let G = 〈N , T , P, S〉. For A, B ∈ N , a production of the form A→ B iscalled a unary production
For each CFL that does not contain ε-rules, a CFG without unaryproductions can be found.
Removing unary rulesG = 〈{S, T}, {a, b, c, d}, P, S〉 with P = {S → aSb | T , T → cTd | cd}
Equivalent CFG without unary rules: G = 〈{S, T}, {a, b, c, d}, P, S〉with P = {S → aSb | cTd | cd, T → cTd | cd}
Elimination of unary productions for a CFG without ε-productions:1 For all A ∗⇒ B and all B→ β, β /∈ N :
add A→ β to the set of productions2 Delete all unary productions
18
Removing unary rules
Let G = 〈N , T , P, S〉. For A, B ∈ N , a production of the form A→ B iscalled a unary production
For each CFL that does not contain ε-rules, a CFG without unaryproductions can be found.
Removing unary rulesG = 〈{S, T}, {a, b, c, d}, P, S〉 with P = {S → aSb | T , T → cTd | cd}
Equivalent CFG without unary rules: G = 〈{S, T}, {a, b, c, d}, P, S〉with P = {S → aSb | cTd | cd, T → cTd | cd}
Elimination of unary productions for a CFG without ε-productions:1 For all A ∗⇒ B and all B→ β, β /∈ N :
add A→ β to the set of productions2 Delete all unary productions
18
Chomsky Normal Form
A normal form of a grammar formalism F is a further restriction onthe grammars in F that does not a�ect the set of generated stringlanguages.
�ere are two important normal forms for CFGs.
CFG normal formsA CFG G = 〈N , T , P, S〉 for a language without ε-rules is
1 in Chomsky normal form i� all productions have either theform A→ BC or A→ a with A,B,C ∈ N , a ∈ T
2 in Greibach normal form i� all productions have the formA→ aα with a ∈ T , α ∈ N ∗
19
Chomsky Normal Form
A normal form of a grammar formalism F is a further restriction onthe grammars in F that does not a�ect the set of generated stringlanguages.
�ere are two important normal forms for CFGs.
CFG normal formsA CFG G = 〈N , T , P, S〉 for a language without ε-rules is
1 in Chomsky normal form i� all productions have either theform A→ BC or A→ a with A,B,C ∈ N , a ∈ T
2 in Greibach normal form i� all productions have the formA→ aα with a ∈ T , α ∈ N ∗
19
Chomsky Normal Form
A normal form of a grammar formalism F is a further restriction onthe grammars in F that does not a�ect the set of generated stringlanguages.
�ere are two important normal forms for CFGs.
CFG normal formsA CFG G = 〈N , T , P, S〉 for a language without ε-rules is
1 in Chomsky normal form i� all productions have either theform A→ BC or A→ a with A,B,C ∈ N , a ∈ T
2 in Greibach normal form i� all productions have the formA→ aα with a ∈ T , α ∈ N ∗
19
Chomsky Normal Form
A normal form of a grammar formalism F is a further restriction onthe grammars in F that does not a�ect the set of generated stringlanguages.
�ere are two important normal forms for CFGs.
CFG normal formsA CFG G = 〈N , T , P, S〉 for a language without ε-rules is
1 in Chomsky normal form i� all productions have either theform A→ BC or A→ a with A,B,C ∈ N , a ∈ T
2 in Greibach normal form i� all productions have the formA→ aα with a ∈ T , α ∈ N ∗
19
Chomsky Normal Form
For each CFL L without ε, there is a CFG in Chomsky normal formwith L = L(G).
Construction of an equivalent CFG in CNF for a given CFGG = 〈N , T , P, S〉
1 Eliminate useless symbols2 Eliminate ε-productions3 Eliminate unary productions4 For each a ∈ T : introduce new non-terminal Ta, replace a by
Ta in all right-hand sides of length > 1 and add productionTa → a to P
5 For each production A→ B0 . . . Bn introduce new non-terminalsX1, . . . ,Xn−1 and replace production A→ B0 . . .Bn with
A→ B0X1 X1 → B1X2 . . . Xn−1 → Bn−1Bn
20
Chomsky Normal Form
For each CFL L without ε, there is a CFG in Chomsky normal formwith L = L(G).
Construction of an equivalent CFG in CNF for a given CFGG = 〈N , T , P, S〉
1 Eliminate useless symbols
2 Eliminate ε-productions3 Eliminate unary productions4 For each a ∈ T : introduce new non-terminal Ta, replace a by
Ta in all right-hand sides of length > 1 and add productionTa → a to P
5 For each production A→ B0 . . . Bn introduce new non-terminalsX1, . . . ,Xn−1 and replace production A→ B0 . . .Bn with
A→ B0X1 X1 → B1X2 . . . Xn−1 → Bn−1Bn
20
Chomsky Normal Form
For each CFL L without ε, there is a CFG in Chomsky normal formwith L = L(G).
Construction of an equivalent CFG in CNF for a given CFGG = 〈N , T , P, S〉
1 Eliminate useless symbols2 Eliminate ε-productions
3 Eliminate unary productions4 For each a ∈ T : introduce new non-terminal Ta, replace a by
Ta in all right-hand sides of length > 1 and add productionTa → a to P
5 For each production A→ B0 . . . Bn introduce new non-terminalsX1, . . . ,Xn−1 and replace production A→ B0 . . .Bn with
A→ B0X1 X1 → B1X2 . . . Xn−1 → Bn−1Bn
20
Chomsky Normal Form
For each CFL L without ε, there is a CFG in Chomsky normal formwith L = L(G).
Construction of an equivalent CFG in CNF for a given CFGG = 〈N , T , P, S〉
1 Eliminate useless symbols2 Eliminate ε-productions3 Eliminate unary productions
4 For each a ∈ T : introduce new non-terminal Ta, replace a byTa in all right-hand sides of length > 1 and add productionTa → a to P
5 For each production A→ B0 . . . Bn introduce new non-terminalsX1, . . . ,Xn−1 and replace production A→ B0 . . .Bn with
A→ B0X1 X1 → B1X2 . . . Xn−1 → Bn−1Bn
20
Chomsky Normal Form
For each CFL L without ε, there is a CFG in Chomsky normal formwith L = L(G).
Construction of an equivalent CFG in CNF for a given CFGG = 〈N , T , P, S〉
1 Eliminate useless symbols2 Eliminate ε-productions3 Eliminate unary productions4 For each a ∈ T : introduce new non-terminal Ta, replace a by
Ta in all right-hand sides of length > 1 and add productionTa → a to P
5 For each production A→ B0 . . . Bn introduce new non-terminalsX1, . . . ,Xn−1 and replace production A→ B0 . . .Bn with
A→ B0X1 X1 → B1X2 . . . Xn−1 → Bn−1Bn
20
Chomsky Normal Form
For each CFL L without ε, there is a CFG in Chomsky normal formwith L = L(G).
Construction of an equivalent CFG in CNF for a given CFGG = 〈N , T , P, S〉
1 Eliminate useless symbols2 Eliminate ε-productions3 Eliminate unary productions4 For each a ∈ T : introduce new non-terminal Ta, replace a by
Ta in all right-hand sides of length > 1 and add productionTa → a to P
5 For each production A→ B0 . . . Bn introduce new non-terminalsX1, . . . ,Xn−1 and replace production A→ B0 . . .Bn with
A→ B0X1 X1 → B1X2 . . . Xn−1 → Bn−1Bn
20
Chomsky Normal Form
B0 . . . Bn
A
;
B0
B1 X2
X1
A
Xn−1
Bn−1 Bn
21
Chomsky Normal Form
Transformation to CNFG = 〈{S,C}, {a,b,c,}, {S→ aSbC | ab, C→ c | cC}, S〉.
1 Introducing new preterminals:G = 〈{S, C, Ta, Tb, Tc}, {a, b, c}, P, S〉 with productionsS→ TaSTbC |TaTb, C→ c |TcC, Ta→ a, Tb→ b, Tc→ c
2 Binarization:G = 〈{S, C, Ta, Tb, Tc , X1, X2}, {a, b, c}, P, S〉 with productionsS→ TaX1 |TaTb, X1→ SX2, X2→ TbC, C→ c |TcC,
Ta→ a, Tb → b, Tc → c
22
Chomsky Normal Form
Transformation to CNFG = 〈{S,C}, {a,b,c,}, {S→ aSbC | ab, C→ c | cC}, S〉.
1 Introducing new preterminals:G = 〈{S, C, Ta, Tb, Tc}, {a, b, c}, P, S〉 with productionsS→ TaSTbC |TaTb, C→ c |TcC
, Ta→ a, Tb→ b, Tc→ c
2 Binarization:G = 〈{S, C, Ta, Tb, Tc , X1, X2}, {a, b, c}, P, S〉 with productionsS→ TaX1 |TaTb, X1→ SX2, X2→ TbC, C→ c |TcC,
Ta→ a, Tb → b, Tc → c
22
Chomsky Normal Form
Transformation to CNFG = 〈{S,C}, {a,b,c,}, {S→ aSbC | ab, C→ c | cC}, S〉.
1 Introducing new preterminals:G = 〈{S, C, Ta, Tb, Tc}, {a, b, c}, P, S〉 with productionsS→ TaSTbC |TaTb, C→ c |TcC, Ta→ a, Tb→ b, Tc→ c
2 Binarization:G = 〈{S, C, Ta, Tb, Tc , X1, X2}, {a, b, c}, P, S〉 with productionsS→ TaX1 |TaTb, X1→ SX2, X2→ TbC, C→ c |TcC,
Ta→ a, Tb → b, Tc → c
22
Chomsky Normal Form
Transformation to CNFG = 〈{S,C}, {a,b,c,}, {S→ aSbC | ab, C→ c | cC}, S〉.
1 Introducing new preterminals:G = 〈{S, C, Ta, Tb, Tc}, {a, b, c}, P, S〉 with productionsS→ TaSTbC |TaTb, C→ c |TcC, Ta→ a, Tb→ b, Tc→ c
2 Binarization:G = 〈{S, C, Ta, Tb, Tc , X1, X2}, {a, b, c}, P, S〉 with productionsS→ TaX1 |TaTb, X1→ SX2, X2→ TbC, C→ c |TcC,
Ta→ a, Tb → b, Tc → c
22
Greibach Normal Form
For each CFL L without ε, there is a CFG in Greibach normal formwith L = L(G).
Construction of the CFG in GNF for a given CFG G = 〈N , T , P, S〉1 Eliminate useless symbols2 Eliminate unary productions3 Eliminate ε-productions4 Le�-recursion is eliminated, i.e., a grammar is constructed that
does not allow derivations of the form A +⇒ Aα
Le�-recursive CFGA CFG G = 〈N , T , P, S〉 is le�-recursive i� there is a non-terminalA ∈ N and a sequence α ∈ (N ∪ T)+ such that A +⇒ Aα
23
Greibach Normal Form
For each CFL L without ε, there is a CFG in Greibach normal formwith L = L(G).
Construction of the CFG in GNF for a given CFG G = 〈N , T , P, S〉1 Eliminate useless symbols2 Eliminate unary productions3 Eliminate ε-productions
4 Le�-recursion is eliminated, i.e., a grammar is constructed thatdoes not allow derivations of the form A +⇒ Aα
Le�-recursive CFGA CFG G = 〈N , T , P, S〉 is le�-recursive i� there is a non-terminalA ∈ N and a sequence α ∈ (N ∪ T)+ such that A +⇒ Aα
23
Greibach Normal Form
For each CFL L without ε, there is a CFG in Greibach normal formwith L = L(G).
Construction of the CFG in GNF for a given CFG G = 〈N , T , P, S〉1 Eliminate useless symbols2 Eliminate unary productions3 Eliminate ε-productions4 Le�-recursion is eliminated, i.e., a grammar is constructed that
does not allow derivations of the form A +⇒ Aα
Le�-recursive CFGA CFG G = 〈N , T , P, S〉 is le�-recursive i� there is a non-terminalA ∈ N and a sequence α ∈ (N ∪ T)+ such that A +⇒ Aα
23
Greibach Normal Form
For each CFL L without ε, there is a CFG in Greibach normal formwith L = L(G).
Construction of the CFG in GNF for a given CFG G = 〈N , T , P, S〉1 Eliminate useless symbols2 Eliminate unary productions3 Eliminate ε-productions4 Le�-recursion is eliminated, i.e., a grammar is constructed that
does not allow derivations of the form A +⇒ Aα
Le�-recursive CFGA CFG G = 〈N , T , P, S〉 is le�-recursive i� there is a non-terminalA ∈ N and a sequence α ∈ (N ∪ T)+ such that A +⇒ Aα
23
Greibach Normal Form
Elimination of le�-recursionAssume the set of non-terminals to be ordered, i.e. N = {A1, . . . ,Am}
Construct a CFG with j > i if Ai → Ajγ:
For all Ai, 1 ≤ i ≤ m, steps (I) and (II) are done.
(I) Transformation such that j ≥ i if Ai → Ajγ:
Replace all productions Ai → Ajγ with j < i with new pro-ductions obtained from replacing Aj with all right-hand sidesof Aj-productions. Do this until the condition holds for all Ai-productions.
24
Greibach Normal Form
Elimination of le�-recursionAssume the set of non-terminals to be ordered, i.e. N = {A1, . . . ,Am}Construct a CFG with j > i if Ai → Ajγ:
For all Ai, 1 ≤ i ≤ m, steps (I) and (II) are done.
(I) Transformation such that j ≥ i if Ai → Ajγ:
Replace all productions Ai → Ajγ with j < i with new pro-ductions obtained from replacing Aj with all right-hand sidesof Aj-productions. Do this until the condition holds for all Ai-productions.
24
Greibach Normal Form
Elimination of le�-recursion IS → AB A→ S A→ a B→ b
We put indices on our non-terminals: B has index 1, A index 2 andS index 3 (other indexations work as well but lead to a di�erentgrammar)
S3 → A2B1 A2 → S3 A2 → a B1 → b
Problematic production with a righthand side (rhs) starting with anindex lower than the le�hand side index: S3 → A2B1
Replace A2 in this rule with the rhs of A2-productions. Our newproductions are
S3 → S3B1 S3 → aB1 A2 → S3 A2 → a B1 → b
25
Greibach Normal Form
Elimination of le�-recursion IS → AB A→ S A→ a B→ b
We put indices on our non-terminals: B has index 1, A index 2 andS index 3 (other indexations work as well but lead to a di�erentgrammar)
S3 → A2B1 A2 → S3 A2 → a B1 → b
Problematic production with a righthand side (rhs) starting with anindex lower than the le�hand side index: S3 → A2B1
Replace A2 in this rule with the rhs of A2-productions. Our newproductions are
S3 → S3B1 S3 → aB1 A2 → S3 A2 → a B1 → b
25
Greibach Normal Form
Elimination of le�-recursion IS → AB A→ S A→ a B→ b
We put indices on our non-terminals: B has index 1, A index 2 andS index 3 (other indexations work as well but lead to a di�erentgrammar)
S3 → A2B1 A2 → S3 A2 → a B1 → b
Problematic production with a righthand side (rhs) starting with anindex lower than the le�hand side index: S3 → A2B1
Replace A2 in this rule with the rhs of A2-productions. Our newproductions are
S3 → S3B1 S3 → aB1 A2 → S3 A2 → a B1 → b
25
Greibach Normal Form
Elimination of le�-recursion IS → AB A→ S A→ a B→ b
We put indices on our non-terminals: B has index 1, A index 2 andS index 3 (other indexations work as well but lead to a di�erentgrammar)
S3 → A2B1 A2 → S3 A2 → a B1 → b
Problematic production with a righthand side (rhs) starting with anindex lower than the le�hand side index: S3 → A2B1
Replace A2 in this rule with the rhs of A2-productions. Our newproductions are
S3 → S3B1 S3 → aB1 A2 → S3 A2 → a B1 → b
25
Greibach Normal Form
Idea behind the next step:
Assume that for A, we have productions
A→ Aα1, . . . , A→ Aαr , A→ β1, . . . , A→ βs
�is means that we can have a derivation
A⇒ Aαi1 ⇒ Aαi1αi2 ⇒ · · ·Aαi1 . . . αin ⇒ βjαi1 . . . αin
where the α-parts are from {α1, . . . , αr} and the β-part is from{β1, . . . , βs}.We can also generate this by a right-recursion using a newnon-terminal B that generates a sequence of elements from{α1, . . . , αr}:
A⇒ βjB⇒ βjαi1B⇒ βjαi1αi2B⇒ · · · ⇒ βjαi1 . . . αin
26
Greibach Normal Form
(II) Elimination of le�-recursive productions Ai → Aiα:
Add a new non-terminal B and replace
Ai → Aiα1, . . . , Ai → Aiαr , Ai → β1, . . . , Ai → βs
withAi → βi, Ai → βiB ∀i, 1 ≤ i ≤ s
andB→ αi, B→ αiB ∀i, 1 ≤ i ≤ r
(Le� recursion is turned into right recursion)
In the resulting grammar, for all Ai+⇒ Ajα, i < j holds.
27
Greibach Normal Form
(II) Elimination of le�-recursive productions Ai → Aiα:
Add a new non-terminal B and replace
Ai → Aiα1, . . . , Ai → Aiαr , Ai → β1, . . . , Ai → βs
withAi → βi, Ai → βiB ∀i, 1 ≤ i ≤ s
andB→ αi, B→ αiB ∀i, 1 ≤ i ≤ r
(Le� recursion is turned into right recursion)
In the resulting grammar, for all Ai+⇒ Ajα, i < j holds.
27
Greibach Normal Form
(II) Elimination of le�-recursive productions Ai → Aiα:
Add a new non-terminal B and replace
Ai → Aiα1, . . . , Ai → Aiαr , Ai → β1, . . . , Ai → βs
withAi → βi, Ai → βiB ∀i, 1 ≤ i ≤ s
andB→ αi, B→ αiB ∀i, 1 ≤ i ≤ r
(Le� recursion is turned into right recursion)
In the resulting grammar, for all Ai+⇒ Ajα, i < j holds.
27
Greibach Normal Form
Elimination of le�-recursion IIS3 → S3B1 S3 → aB1 A2 → S3 A2 → a B1 → b
We introduce a new non-terminal C andreplace S3 → S3B1, S3 → aB1with S3 → aB1, S3 → aB1C, C → B1C, C → B1.
Result:S3 → aB1 S3 → aB1C C → B1C C → B1A2 → S3 A2 → a B1 → b
A�er further removal of the now useless A2 and of the unary pro-duction, we obtainS3 → aB1 S3 → aB1C C → B1C C → b B1 → b
28
Greibach Normal Form
Elimination of le�-recursion IIS3 → S3B1 S3 → aB1 A2 → S3 A2 → a B1 → b
We introduce a new non-terminal C andreplace S3 → S3B1, S3 → aB1with S3 → aB1, S3 → aB1C, C → B1C, C → B1.
Result:S3 → aB1 S3 → aB1C C → B1C C → B1A2 → S3 A2 → a B1 → b
A�er further removal of the now useless A2 and of the unary pro-duction, we obtainS3 → aB1 S3 → aB1C C → B1C C → b B1 → b
28
Greibach Normal Form
Elimination of le�-recursionWe could also start with di�erent indices, e.g.,
S2 → A3B1 A3 → S2 A3 → a B1 → b
A�er step I, we would haveS2 → A3B1 A3 → A3B1 A3 → a B1 → b
A�er step II, the result would beS2 → A3B1 A3 → a A3 → aC C → B1 C → B1C B1 → b
A�er elimination of the unary production C → B1, this yieldsS2 → A3B1 A3 → a A3 → aC C → b C → B1C B1 → b
29
Greibach Normal Form
Elimination of le�-recursionWe could also start with di�erent indices, e.g.,
S2 → A3B1 A3 → S2 A3 → a B1 → b
A�er step I, we would haveS2 → A3B1 A3 → A3B1 A3 → a B1 → b
A�er step II, the result would beS2 → A3B1 A3 → a A3 → aC C → B1 C → B1C B1 → b
A�er elimination of the unary production C → B1, this yieldsS2 → A3B1 A3 → a A3 → aC C → b C → B1C B1 → b
29
Greibach Normal Form
Elimination of le�-recursionWe could also start with di�erent indices, e.g.,
S2 → A3B1 A3 → S2 A3 → a B1 → b
A�er step I, we would haveS2 → A3B1 A3 → A3B1 A3 → a B1 → b
A�er step II, the result would beS2 → A3B1 A3 → a A3 → aC C → B1 C → B1C B1 → b
A�er elimination of the unary production C → B1, this yieldsS2 → A3B1 A3 → a A3 → aC C → b C → B1C B1 → b
29
Greibach Normal Form
Elimination of le�-recursionWe could also start with di�erent indices, e.g.,
S2 → A3B1 A3 → S2 A3 → a B1 → b
A�er step I, we would haveS2 → A3B1 A3 → A3B1 A3 → a B1 → b
A�er step II, the result would beS2 → A3B1 A3 → a A3 → aC C → B1 C → B1C B1 → b
A�er elimination of the unary production C → B1, this yieldsS2 → A3B1 A3 → a A3 → aC C → b C → B1C B1 → b
29
Greibach Normal Form
Transformation to GNFG = 〈N , T , P, S〉, N = {A1,A2,A3,B}, T = {a, b}
A1 → A2A3 A2 → A3A1 | b A3 → A1A2 | a
A1 → A2A3 A2 → A3A1 | b A3 → A2A3A2 | aA1 → A2A3 A2 → A3A1 | b A3 → A3A1A3A2 | bA3A2 | a
Replace A3 → A3A1A3A2 | bA3A2 | a by
A3 → bA3A2 A3 → a A3 → bA3A2B A3 → aBB→ A1A3A2 B→ A1A3A2B
�e resulting grammar does not allow le�-recursive derivations.
30
Greibach Normal Form
Transformation to GNFG = 〈N , T , P, S〉, N = {A1,A2,A3,B}, T = {a, b}
A1 → A2A3 A2 → A3A1 | b A3 → A1A2 | aA1 → A2A3 A2 → A3A1 | b A3 → A2A3A2 | a
A1 → A2A3 A2 → A3A1 | b A3 → A3A1A3A2 | bA3A2 | a
Replace A3 → A3A1A3A2 | bA3A2 | a by
A3 → bA3A2 A3 → a A3 → bA3A2B A3 → aBB→ A1A3A2 B→ A1A3A2B
�e resulting grammar does not allow le�-recursive derivations.
30
Greibach Normal Form
Transformation to GNFG = 〈N , T , P, S〉, N = {A1,A2,A3,B}, T = {a, b}
A1 → A2A3 A2 → A3A1 | b A3 → A1A2 | aA1 → A2A3 A2 → A3A1 | b A3 → A2A3A2 | aA1 → A2A3 A2 → A3A1 | b A3 → A3A1A3A2 | bA3A2 | a
Replace A3 → A3A1A3A2 | bA3A2 | a by
A3 → bA3A2 A3 → a A3 → bA3A2B A3 → aBB→ A1A3A2 B→ A1A3A2B
�e resulting grammar does not allow le�-recursive derivations.
30
Greibach Normal Form
Transformation to GNFG = 〈N , T , P, S〉, N = {A1,A2,A3,B}, T = {a, b}
A1 → A2A3 A2 → A3A1 | b A3 → A1A2 | aA1 → A2A3 A2 → A3A1 | b A3 → A2A3A2 | aA1 → A2A3 A2 → A3A1 | b A3 → A3A1A3A2 | bA3A2 | a
Replace A3 → A3A1A3A2 | bA3A2 | a by
A3 → bA3A2 A3 → a A3 → bA3A2B A3 → aBB→ A1A3A2 B→ A1A3A2B
�e resulting grammar does not allow le�-recursive derivations.
30
Greibach Normal Form
Transformation to GNFG = 〈N , T , P, S〉, N = {A1,A2,A3,B}, T = {a, b}
A1 → A2A3 A2 → A3A1 | b A3 → A1A2 | aA1 → A2A3 A2 → A3A1 | b A3 → A2A3A2 | aA1 → A2A3 A2 → A3A1 | b A3 → A3A1A3A2 | bA3A2 | a
Replace A3 → A3A1A3A2 | bA3A2 | a by
A3 → bA3A2 A3 → a A3 → bA3A2B A3 → aBB→ A1A3A2 B→ A1A3A2B
�e resulting grammar does not allow le�-recursive derivations.
30
Greibach Normal Form
LexicalizationNow, two more steps are necessary to obtain the grammar inGNF:
5 For i = m − 1 to i = 1: Replace all Ai → Ajβ with all pro-ductions obtained by replacing Aj with the right-hand side of aAj-production
�en, do the same for all B-productions where B is one of thesymbols introduced in step (II)
6 For all productions A → αaβ, α 6= ε: replace a with a newnon-terminal Ta and add a production Ta → a to P .
31
Greibach Normal Form
LexicalizationNow, two more steps are necessary to obtain the grammar inGNF:
5 For i = m − 1 to i = 1: Replace all Ai → Ajβ with all pro-ductions obtained by replacing Aj with the right-hand side of aAj-production
�en, do the same for all B-productions where B is one of thesymbols introduced in step (II)
6 For all productions A → αaβ, α 6= ε: replace a with a newnon-terminal Ta and add a production Ta → a to P .
31
Greibach Normal Form
LexicalizationNow, two more steps are necessary to obtain the grammar inGNF:
5 For i = m − 1 to i = 1: Replace all Ai → Ajβ with all pro-ductions obtained by replacing Aj with the right-hand side of aAj-production
�en, do the same for all B-productions where B is one of thesymbols introduced in step (II)
6 For all productions A → αaβ, α 6= ε: replace a with a newnon-terminal Ta and add a production Ta → a to P .
31
Greibach Normal Form
Transformation to GNF (cont’d)
A1 → A2A3A2 → A3A1 | bA3 → bA3A2 | a | bA3A2B | aBB→ A1A3A2 |A1A3A2B
Replace
1 A2 → A3A1 byA2 → bA3A2A1 | aA1 | bA3A2BA1 | aBA1
2 A1 → A2A3 byA1 → bA3A2A1A3 | . . . | aBA1A3 | bA3
3 B→ A1A3A2 byB→ bA3A2A1A3A3A2 | . . . | bA3A3A2
4 B→ A1A3A2B byB→ bA3A2A1A3A3A2B | . . . | bA3A3A2B
32
Greibach Normal Form
Transformation to GNF (cont’d)
A1 → A2A3A2 → A3A1 | bA3 → bA3A2 | a | bA3A2B | aBB→ A1A3A2 |A1A3A2B
Replace
1 A2 → A3A1 byA2 → bA3A2A1 | aA1 | bA3A2BA1 | aBA1
2 A1 → A2A3 byA1 → bA3A2A1A3 | . . . | aBA1A3 | bA3
3 B→ A1A3A2 byB→ bA3A2A1A3A3A2 | . . . | bA3A3A2
4 B→ A1A3A2B byB→ bA3A2A1A3A3A2B | . . . | bA3A3A2B
32
Greibach Normal Form
Transformation to GNF (cont’d)
A1 → A2A3A2 → A3A1 | bA3 → bA3A2 | a | bA3A2B | aBB→ A1A3A2 |A1A3A2B
Replace
1 A2 → A3A1 byA2 → bA3A2A1 | aA1 | bA3A2BA1 | aBA1
2 A1 → A2A3 byA1 → bA3A2A1A3 | . . . | aBA1A3 | bA3
3 B→ A1A3A2 byB→ bA3A2A1A3A3A2 | . . . | bA3A3A2
4 B→ A1A3A2B byB→ bA3A2A1A3A3A2B | . . . | bA3A3A2B
32
Greibach Normal Form
Transformation to GNF (cont’d)
A1 → A2A3A2 → A3A1 | bA3 → bA3A2 | a | bA3A2B | aBB→ A1A3A2 |A1A3A2B
Replace
1 A2 → A3A1 byA2 → bA3A2A1 | aA1 | bA3A2BA1 | aBA1
2 A1 → A2A3 byA1 → bA3A2A1A3 | . . . | aBA1A3 | bA3
3 B→ A1A3A2 byB→ bA3A2A1A3A3A2 | . . . | bA3A3A2
4 B→ A1A3A2B byB→ bA3A2A1A3A3A2B | . . . | bA3A3A2B
32
Greibach Normal Form
Transformation to GNF (cont’d)
A1 → A2A3A2 → A3A1 | bA3 → bA3A2 | a | bA3A2B | aBB→ A1A3A2 |A1A3A2B
Replace
1 A2 → A3A1 byA2 → bA3A2A1 | aA1 | bA3A2BA1 | aBA1
2 A1 → A2A3 byA1 → bA3A2A1A3 | . . . | aBA1A3 | bA3
3 B→ A1A3A2 byB→ bA3A2A1A3A3A2 | . . . | bA3A3A2
4 B→ A1A3A2B byB→ bA3A2A1A3A3A2B | . . . | bA3A3A2B
32