Module 32
Transcript of Module 32
1
Module 32
• Chomsky Normal Form (CNF)
– 4-step process
2
Chomsky Normal Form
• A CFG is in Chomsky normal form (CNF) if every production is one of these two types:
– A → BC
– A → a
• Key ideas
– Eliminating λ-productions (e.g. S → λ)
– Eliminating unit productions (e.g. A → B)
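The two allowed production shapes can be checked mechanically. The sketch below is only an illustration: the grammar encoding (a dict mapping each variable to a set of right-hand sides, each a tuple of symbols) and the name `is_cnf` are assumptions made here, not something the slides define. Any symbol that is not a key of the dict is treated as a terminal.

```python
def is_cnf(productions):
    """Return True if every production has the form A -> BC or A -> a.

    productions: dict mapping a variable to a set of bodies,
    where each body is a tuple of symbols (assumed encoding).
    """
    variables = set(productions)
    for bodies in productions.values():
        for body in bodies:
            # A -> BC: exactly two symbols, both variables
            two_variables = len(body) == 2 and all(s in variables for s in body)
            # A -> a: exactly one symbol, a terminal
            one_terminal = len(body) == 1 and body[0] not in variables
            if not (two_variables or one_terminal):
                return False
    return True
```

Under this encoding a λ-production is the empty tuple `()` and a unit production is a one-element tuple holding a variable, so both are rejected.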
3
Nullable Variables
• A variable A in a CFG G = (V, Σ, S, P) is defined as nullable if:
– Base case: P contains the production A → λ
– Recursive case: P contains the production A → B1B2 … Bn and B1 through Bn are nullable
• No other variables are nullable
4
Finding Nullable Variables
• Initialize N_0 to be the set of nullable variables by the base case definition
• i = 0;
• do
– i = i+1;
– N_i = N_{i-1} union {A | P contains A → α where α is a string in (N_{i-1})*}
• while N_i ≠ N_{i-1};
• The final N_i is the set of nullable variables
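The fixed-point loop above might be sketched in Python as follows. The grammar encoding is an assumption made for illustration: a dict mapping each variable to a set of bodies, each body a tuple of symbols, with the empty tuple `()` standing for λ. The function name `nullable_variables` is likewise hypothetical.

```python
def nullable_variables(productions):
    """Return the set of nullable variables of a grammar.

    productions: dict variable -> set of bodies (tuples of symbols);
    the empty tuple () encodes a λ-production (assumed encoding).
    """
    # Base case: variables with a direct λ-production.
    nullable = {a for a, bodies in productions.items() if () in bodies}
    while True:
        # Recursive case: some body consists only of nullable variables.
        bigger = nullable | {
            a
            for a, bodies in productions.items()
            if any(all(s in nullable for s in body) for body in bodies)
        }
        if bigger == nullable:  # nothing new was added: fixed point reached
            return nullable
        nullable = bigger
```

The loop stops because each pass either adds at least one variable or changes nothing, and there are only finitely many variables.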
5
Eliminating λ-productions
• Given CFG G = (V, Σ, S, P), construct a CFG G1 = (V, Σ, S, P1) as follows.
• Initialize P1 = P
• Find the set of all nullable variables N in G
• For every production A → α in P, add to P1 every production that can be obtained from this one by deleting from α one or more of the occurrences of nullable variables in α
– Example: A → BBCD where B and C are nullable leads to A → BCD | BCD | BBD | BD | BD | CD | D
• Clean up
– Delete all λ-productions from P1
– Delete any duplicate productions
– Delete any productions of the form A → A
• Thm: L(G1) = L(G) – {λ}
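As a rough sketch of this construction, the function below takes the productions and an already-computed set of nullable variables and returns P1. The encoding (dict of variable to set of tuple bodies, `()` for λ) and the name `eliminate_lambda` are assumptions for illustration only. Storing bodies in a set removes duplicates automatically, and the other two clean-up rules are applied as each candidate is generated.

```python
from itertools import combinations

def eliminate_lambda(productions, nullable):
    """Build P1 from P by deleting occurrences of nullable variables.

    productions: dict variable -> set of bodies (tuples of symbols).
    nullable: set of nullable variables, computed beforehand.
    """
    new_productions = {}
    for a, bodies in productions.items():
        kept = set()
        for body in bodies:
            # Positions in the body that hold a nullable variable.
            positions = [i for i, s in enumerate(body) if s in nullable]
            # Delete every subset of those positions (the empty subset
            # keeps the original production).
            for r in range(len(positions) + 1):
                for drop in combinations(positions, r):
                    candidate = tuple(
                        s for i, s in enumerate(body) if i not in drop
                    )
                    # Clean-up: skip λ-productions and A -> A.
                    if candidate and candidate != (a,):
                        kept.add(candidate)
        new_productions[a] = kept
    return new_productions
```

For the slide's example A → BBCD with B and C nullable, this yields the bodies BBCD, BCD, BBD, CD, BD and D for A, with the duplicate BCD and BD entries collapsed by the set.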
6
A-derivable Variables
• B is A-derivable in a CFG G if and only if A ==>G* B
• Recursive definition: A variable B in a CFG G = (V, Σ, S, P) is defined as A-derivable if:
– Base case: P contains the production A → B or B = A
– Recursive case: Variable C is A-derivable and P contains the production C → B
• No other variables are A-derivable
• Easy to make into algorithm
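One way the recursive definition could be turned into an algorithm is a simple graph search that follows unit productions only. This is a sketch under assumptions: the grammar is a dict of variable to set of tuple bodies, `derivable_from` is a hypothetical name, and the grammar is taken to be free of λ-productions (otherwise following unit productions alone could miss some variables B with A ==>* B).

```python
def derivable_from(a, productions):
    """Return the set of A-derivable variables for variable a.

    productions: dict variable -> set of bodies (tuples of symbols).
    Assumes λ-productions have already been eliminated.
    """
    variables = set(productions)
    derivable = {a}   # base case: B = A
    frontier = [a]    # variables whose unit productions are still unexplored
    while frontier:
        c = frontier.pop()
        for body in productions.get(c, ()):
            # A unit production C -> B has a single variable as its body.
            if len(body) == 1 and body[0] in variables:
                b = body[0]
                if b not in derivable:
                    derivable.add(b)
                    frontier.append(b)
    return derivable
```

Each variable enters the frontier at most once, so the search finishes after examining every production at most once.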
7
Eliminating unit productions
• Given CFG G = (V, Σ, S, P), construct a CFG G1 = (V, Σ, S, P1) as follows.
• Initialize P1 = P
• For each A in V, find the set of A-derivable variables in V
• For each pair (A,B) such that B is A-derivable and every non-unit production B → α in P, add production A → α to P1
• Clean up
– Delete all unit productions from P1
– Delete any duplicate productions
• Thm: L(G1) = L(G) if G did not have any λ-productions
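The construction above might look like this in Python. As before, the dict-of-tuples encoding and the names `eliminate_units` and `closure` are assumptions for illustration, and the input is assumed to have no λ-productions. Instead of copying P and deleting unit productions afterwards, the sketch collects only non-unit bodies, which gives the same P1.

```python
def eliminate_units(productions):
    """Return P1: the productions with all unit productions removed.

    productions: dict variable -> set of bodies (tuples of symbols),
    assumed to contain no λ-productions.
    """
    variables = set(productions)

    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    def closure(a):
        # All A-derivable variables, found by following unit productions.
        seen, frontier = {a}, [a]
        while frontier:
            c = frontier.pop()
            for body in productions[c]:
                if is_unit(body) and body[0] not in seen:
                    seen.add(body[0])
                    frontier.append(body[0])
        return seen

    # A gets every non-unit body of every A-derivable variable B
    # (including B = A, which keeps A's own non-unit productions).
    return {
        a: {
            body
            for b in closure(a)
            for body in productions[b]
            if not is_unit(body)
        }
        for a in productions
    }
```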
8
Making CNF grammar
• First eliminate λ-productions
• Then eliminate unit productions
• For each terminal a in Σ, introduce a variable Xa with production rule Xa → a
• For each production of the form A → α where terminal a appears and |α| > 1, replace a with Xa
• Finally, replace productions of the form A → B1B2 … Bn (n > 2) with a series of productions:
– A → B1Y1
– Y1 → B2Y2
– ....
– Yn-2 → Bn-1Bn
– Other methods can be used for this last step
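The last two steps (introducing Xa variables and shortening long productions) could be sketched as below. The encoding and the name `finish_cnf` are assumptions for illustration. The input is assumed to be free of λ- and unit productions already, and the generated names (`"X" + terminal`, `"Y1"`, `"Y2"`, ...) are assumed not to clash with existing variables. Xa variables are created only for terminals that actually occur in the grammar.

```python
def finish_cnf(productions):
    """Replace terminals in long bodies and split bodies longer than 2.

    productions: dict variable -> set of bodies (tuples of symbols),
    assumed to have no λ-productions and no unit productions.
    """
    variables = set(productions)
    terminals = {
        s
        for bodies in productions.values()
        for body in bodies
        for s in body
        if s not in variables
    }
    result = {a: set() for a in productions}
    for t in terminals:
        result["X" + t] = {(t,)}  # Xa -> a
    fresh_count = 0
    for a, bodies in productions.items():
        for body in bodies:
            if len(body) > 1:
                # Replace each terminal a by Xa in bodies of length > 1.
                body = tuple("X" + s if s in terminals else s for s in body)
            head = a
            while len(body) > 2:
                # head -> B1 Y, then continue with Y -> B2 ... Bn.
                fresh_count += 1
                fresh = f"Y{fresh_count}"
                result.setdefault(head, set()).add((body[0], fresh))
                head, body = fresh, body[1:]
            result.setdefault(head, set()).add(body)
    return result
```

Because sets are iterated in no fixed order, the numbering of the Y variables can differ between runs, but the shape of the resulting grammar is the same.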
9
Example: Eliminate λ-productions
• S → AACD
• A → aAb | λ
• C → aC | a
• D → aDa | bDb | λ
• Nullable variables: A & D
• New grammar
– S → AACD | ACD | AAC | CD | AC | C
– A → aAb | ab
– C → aC | a
– D → aDa | bDb | aa | bb
10
Example: Eliminate unit productions
• S → AACD | ACD | AAC | CD | AC | C
• A → aAb | ab
• C → aC | a
• D → aDa | bDb | aa | bb
• New grammar:
– S → AACD | ACD | AAC | CD | AC | aC | a
– A → aAb | ab
– C → aC | a
– D → aDa | bDb | aa | bb
11
Example: Add Xa and Xb
• S → AACD | ACD | AAC | CD | AC | aC | a
• A → aAb | ab
• C → aC | a
• D → aDa | bDb | aa | bb
• New grammar
– S → AACD | ACD | AAC | CD | AC | XaC | a
– A → XaAXb | XaXb
– C → XaC | a
– D → XaDXa | XbDXb | XaXa | XbXb
– Xa → a
– Xb → b
12
Example: Shorten long productions
• S → AACD | ACD | AAC | CD | AC | XaC | a
• A → XaAXb | XaXb
• C → XaC | a
• D → XaDXa | XbDXb | XaXa | XbXb
• Xa → a
• Xb → b
• Example replacement
– S → AACD becomes S → AT1, T1 → AT2, T2 → CD
13
Observations
• Consider a derivation from a CNF grammar G that begins: S ==>G* ABCD
• How short can the final derived terminal string be?
• Why?
14
Observation 2
• A path in a parse tree has length x if it contains x variables
• Consider a parse tree T for a string x and a CNF grammar G with m variables.
– Suppose the longest path in T has length k. How long can this string x be?
– Suppose string x has length 2^m. How short can the longest path in T be?