Lecture 17 Naveen Z Quazilbash Simplification of Grammars.


Transcript of Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

Page 1: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

Lecture 17
Naveen Z Quazilbash

Simplification of Grammars

Page 2: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

Overview
• Attendance
• Motivation
• Simplification of Grammars
  • Eliminating useless variables
  • Eliminating null productions
  • Eliminating unit productions
• Quiz result

Page 3: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

Motivation for grammar simplification

Parsing Problem
• Given a CFG G and a string w, determine whether w ∈ L(G).
• This is a fundamental problem in compiler design and natural language processing.
• If G is in general form, the procedure may be very inefficient, so the grammar is “transformed” into a simpler form to make the parsing problem easier.

Page 4: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

Simplification of Grammars

It involves the removal of:
1. Useless variables
2. ε-productions
3. Unit productions

Page 5: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

Useless variables:

There are two types of useless variables:
1. Variables that cannot be reached
2. Variables that do not derive any strings

Page 6: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

3. ε-productions

E.g.: A → ε

• Note that if we remove these productions, the language no longer includes the empty string.

Page 7: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

4. Unit productions:

They are of the form A → B or A → A

Page 8: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

1) Unreachable Variables

E.g.:
S → BS | B | E
A → DA | D | S
B → CB | C
C → aC | a
D → bD | b
E → cE | c

Page 9: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

To find unreachable variables, draw a dependency graph

Dependency Graph:
• Vertices of the graph are variables.
• The graph doesn’t include alphabet symbols, such as “a” or “b”.
• If there is a production A → …B…, i.e., the left side is A and the right side includes B, then there is an edge A → B.

Page 10: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

A variable is reachable if there is a path from S to this variable

S itself is always reachable

After identifying unreachable variables, remove all productions with unreachable left side.
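
The reachability test above can be sketched in Python. This is only an illustration under stated assumptions: the grammar is encoded as a dictionary from each variable to its list of right sides, uppercase letters are variables, lowercase letters are alphabet symbols, and the function names `reachable_variables` and `remove_unreachable` are made up for this sketch.

```python
def reachable_variables(grammar, start="S"):
    """Return the variables reachable from `start` in the dependency graph."""
    reachable = {start}          # S itself is always reachable
    frontier = [start]
    while frontier:
        left = frontier.pop()
        for rhs in grammar.get(left, []):
            for symbol in rhs:
                # An edge left -> symbol exists when symbol is a variable.
                if symbol.isupper() and symbol not in reachable:
                    reachable.add(symbol)
                    frontier.append(symbol)
    return reachable


def remove_unreachable(grammar, start="S"):
    """Drop every production whose left side is unreachable."""
    keep = reachable_variables(grammar, start)
    return {left: rhss for left, rhss in grammar.items() if left in keep}


# The example grammar from the "Unreachable Variables" slide.
grammar = {
    "S": ["BS", "B", "E"],
    "A": ["DA", "D", "S"],
    "B": ["CB", "C"],
    "C": ["aC", "a"],
    "D": ["bD", "b"],
    "E": ["cE", "c"],
}

print(sorted(reachable_variables(grammar)))  # ['B', 'C', 'E', 'S']
print(sorted(remove_unreachable(grammar)))   # ['B', 'C', 'E', 'S']
```

The search never follows edges into S from A, which is why A and D stay unreachable even though A mentions S on its right side.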

Page 11: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

S → BS | B | E
A → DA | D | S
B → CB | C
C → aC | a
D → bD | b
E → cE | c

Drawing its dependency graph:
[Figure: dependency graph on the vertices S, A, B, C, D, E]

Reachable: S, B, C, E

Page 12: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

Grammar without unreachable variables:
S → BS | B | E
B → CB | C
C → aC | a
E → cE | c

Ex: Determine its language.

Page 13: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

2) Variables that don’t terminate

A variable A terminates if either:
• There is a production A → … with no variables on the right, e.g. A → aabc,
OR
• There is a production A → … where all variables on the right terminate; e.g. A → aBbaC, where B and C terminate.

Note: to find all variables that terminate, keep looking for such productions until you cannot find any new ones.

Page 14: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

TASK

Example:
S → A | BC | DE
A → aA | bA
B → bB | b
C → EF
D → dD | BD | BA
E → aE | a
F → cFc | c

Remove all productions that include a variable that doesn’t terminate.
Note: We remove a production if it has such a variable on either side.

Page 15: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

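Solution

S → A | BC | DE   (terminates)
A → aA | bA   (does not terminate)
B → bB | b   (terminates)
C → EF   (terminates)
D → dD | BD | BA   (does not terminate)
E → aE | a   (terminates)
F → cFc | c   (terminates)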

Page 16: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

S → BC
B → bB | b
C → EF
E → aE | a
F → cFc | c

Ex: Determine its language.
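
The "keep looking until nothing new is found" search for terminating variables can be sketched as a fixed-point loop. The encoding is an assumption of this sketch (a dictionary from variable to right sides, uppercase letters as variables), and the names `terminating_variables` and `remove_nonterminating` are hypothetical.

```python
def terminating_variables(grammar):
    """Return the variables that can derive a string of alphabet symbols."""
    terminating = set()
    changed = True
    while changed:               # repeat until a pass finds nothing new
        changed = False
        for left, rhss in grammar.items():
            if left in terminating:
                continue
            for rhs in rhss:
                # Every variable on the right must already terminate.
                if all(not s.isupper() or s in terminating for s in rhs):
                    terminating.add(left)
                    changed = True
                    break
    return terminating


def remove_nonterminating(grammar):
    """Drop productions with a non-terminating variable on either side."""
    good = terminating_variables(grammar)
    return {
        left: [rhs for rhs in rhss
               if all(not s.isupper() or s in good for s in rhs)]
        for left, rhss in grammar.items()
        if left in good
    }


# The grammar from the TASK slide.
grammar = {
    "S": ["A", "BC", "DE"],
    "A": ["aA", "bA"],
    "B": ["bB", "b"],
    "C": ["EF"],
    "D": ["dD", "BD", "BA"],
    "E": ["aE", "a"],
    "F": ["cFc", "c"],
}

print(sorted(terminating_variables(grammar)))  # ['B', 'C', 'E', 'F', 'S']
print(remove_nonterminating(grammar)["S"])     # ['BC']
```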

Page 17: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

3) Eliminating ε-Productions

Nullable variables:
A variable is nullable if either:
• There is a production A → ε, or
• There is a production A → B1B2…Bn (only variables, no symbols), where all variables on the right side are nullable.

Note: to find all nullable variables, keep looking for such productions, until you cannot find any new ones.

Page 18: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

TASK

S → SAB | SBC | BC
A → aA | a
B → bB | bC | C
C → cC | ε

First we find variables that can lead to the empty string:
C => ε
B => C => ε
S => BC => B => C => ε

Page 19: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

S → SAB | SBC | BC   (nullable)
A → aA | a
B → bB | bC | C   (nullable)
C → cC | ε   (nullable)

Thus, S, B, and C can lead to ε; they are called nullable variables.
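
The search for nullable variables can be sketched the same way, as a loop that repeats until no new variable is found. In this sketch, which is only one possible encoding, ε is written as the empty string `""`, uppercase letters are variables, and the name `nullable_variables` is hypothetical.

```python
def nullable_variables(grammar):
    """Return the variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:               # repeat until a pass finds nothing new
        changed = False
        for left, rhss in grammar.items():
            if left in nullable:
                continue
            # A right side qualifies when every symbol on it is a nullable
            # variable. For "" (ε) there are no symbols, so all() is True.
            if any(all(s in nullable for s in rhs) for rhs in rhss):
                nullable.add(left)
                changed = True
    return nullable


# The grammar from the TASK slide; "" stands for ε.
grammar = {
    "S": ["SAB", "SBC", "BC"],
    "A": ["aA", "a"],
    "B": ["bB", "bC", "C"],
    "C": ["cC", ""],
}

print(sorted(nullable_variables(grammar)))  # ['B', 'C', 'S']
```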

Page 20: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

For each production that has nullable variables, consider all possible ways to skip some of these variables and add the corresponding productions.

E.g. W → aWXaYZb; suppose that X, Y and Z are nullable; then there are 8 ways to skip some of them.

W → aWab | aWXab | aWaYb | aWaZb | aWXaYb | aWXaZb | aWaYZb | aWXaYZb
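
Enumerating the ways to skip nullable variables amounts to a keep-or-skip choice at each nullable occurrence. The sketch below assumes right sides are strings of single-character symbols and that the set of nullable variables is already known; `expand_nullable` and `remove_epsilon_productions` are hypothetical names.

```python
from itertools import product


def expand_nullable(rhs, nullable):
    """Return every right side obtained by keeping or skipping each
    occurrence of a nullable variable in `rhs`."""
    # Each nullable symbol offers two choices (keep it, or "" to skip it);
    # every other symbol offers exactly one.
    choices = [(s, "") if s in nullable else (s,) for s in rhs]
    return {"".join(pick) for pick in product(*choices)}


def remove_epsilon_productions(grammar, nullable):
    """Add all skip variants, then drop the ε-productions themselves."""
    result = {}
    for left, rhss in grammar.items():
        variants = set()
        for rhs in rhss:
            variants |= expand_nullable(rhs, nullable)
        variants.discard("")     # "" encodes ε in this sketch
        result[left] = variants
    return result


variants = expand_nullable("aWXaYZb", {"X", "Y", "Z"})
print(len(variants))  # 8
```

Using a set means a right side that arises in several ways is recorded only once.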

Page 21: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

Back to our grammar, where S, B and C are nullable:

S → A | AB | SA | SAB | S | B | C | SB | SC | BC | SBC
A → aA | a
B → b | bB | bC | C
C → c | cC | ε

Now, we can remove the ε-productions without changing the language.

The only possible change is losing the empty string, if it is in the original language.

Page 22: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

So our grammar without null productions becomes:

S → A | AB | SA | SAB | S | B | C | SB | SC | BC | SBC
A → aA | a
B → b | bB | bC | C
C → c | cC

Page 23: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

4) Eliminating Unit Productions

S → Aa | B
A → a | bc | B
B → A | bb | C | cC
C → a | C

First, for every variable, we find all single variables that can be reached from it:
For S: S => B => A, S => B => C
For A: A => B => C
For B: B => A, B => C
For C: NONE (C itself doesn’t count)

Page 24: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

How should we find the reachable single variables?

Page 25: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

Use Dependency Graph!

Drawing Dependency Graph:
• Vertices of the graph are variables.
• If there is a unit production A → B, then there is an edge A → B.
• A single variable B is reachable from A if there is a path from A to B.

Page 26: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

Dependency Graph:
[Figure: dependency graph on the vertices S, A, B, C, with edges S → B, A → B, B → A, B → C, C → C]

Page 27: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

To construct an equivalent grammar without unit productions:
• Remove all unit productions.
• For each pair A =>* B, where B is a single variable reachable from A, consider all productions B → p1 | p2 | … | pn, and add the corresponding productions A → p1 | p2 | … | pn.

For example, since A =>* B and B → bb | cC, add the productions A → bb | cC.
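
The construction above can be sketched in Python. As assumptions of this sketch, a unit production is a right side consisting of exactly one uppercase letter, the grammar is a dictionary from variable to right sides, and the names `is_unit`, `unit_reachable` and `eliminate_units` are hypothetical.

```python
def is_unit(rhs):
    """A unit production has a single variable as its right side."""
    return len(rhs) == 1 and rhs.isupper()


def unit_reachable(grammar, start):
    """Single variables reachable from `start` along unit productions.
    The start variable itself does not count."""
    seen = set()
    frontier = [start]
    while frontier:
        left = frontier.pop()
        for rhs in grammar.get(left, []):
            if is_unit(rhs) and rhs not in seen:
                seen.add(rhs)
                frontier.append(rhs)
    seen.discard(start)
    return seen


def eliminate_units(grammar):
    """Keep the non-unit productions and copy those of reachable variables."""
    result = {}
    for left in grammar:
        kept = {rhs for rhs in grammar[left] if not is_unit(rhs)}
        for other in unit_reachable(grammar, left):
            kept |= {rhs for rhs in grammar[other] if not is_unit(rhs)}
        result[left] = kept
    return result


# The grammar from the "Eliminating Unit Productions" slide.
grammar = {
    "S": ["Aa", "B"],
    "A": ["a", "bc", "B"],
    "B": ["A", "bb", "C", "cC"],
    "C": ["a", "C"],
}

print(sorted(unit_reachable(grammar, "S")))  # ['A', 'B', 'C']
print(sorted(eliminate_units(grammar)["S"]))  # ['Aa', 'a', 'bb', 'bc', 'cC']
```

The sketch does not remove variables that become useless afterwards; that is a separate pass.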

Page 28: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

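Original grammar:
S → Aa | B
A → a | bc | B
B → A | bb | C | cC
C → a | C

Old non-unit productions:
S → Aa
B → bb | cC
A → a | bc
C → a

New productions:
S → bb | cC | a | bc
B → a | bc
A → bb | cC | a

Note that the variable B has become useless (no right side mentions it any more, so it cannot be reached from S) and we need to remove it!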

Page 29: Lecture 17 Naveen Z Quazilbash Simplification of Grammars.

Summary

Main steps of simplifying a grammar:
1. Remove useless variables, which cannot be reached or do not terminate.
2. Remove ε-productions.
3. Remove unit productions.
4. Remove useless variables again!