LL(1) Parsing Table Example - University of...

of 20 /20
1 CS 1622 Lecture 10 1 CS1622 Lecture 10 Parsing (5) CS 1622 Lecture 10 2 LL(1) Parsing Table Example Left-factored grammar E T X X + E | ε T ( E ) | int Y Y * T | ε The LL(1) parsing table: ε ε ε * T Y ( E ) int Y T ε ε + E X T X T X E $ ) ( + * int CS 1622 Lecture 10 3 LL(1) Parsing Table Example (Cont.) Consider the [E, int] entry “When current non-terminal is E and next input is int, use production E T X This production can generate an int in the first place Consider the [Y,+] entry “When current non-terminal is Y and current token is +, get rid of Y” Y can be followed by + only in a derivation in which Y ε

Embed Size (px)

Transcript of LL(1) Parsing Table Example - University of...

  • 1

    CS 1622 Lecture 10 1

    CS1622

    Lecture 10Parsing (5)

    CS 1622 Lecture 10 2

    LL(1) Parsing Table Example Left-factored grammar

    E T X X + E | T ( E ) | int Y Y * T |

    The LL(1) parsing table:

    * TY( E )int YT

    + EXT XT XE

    $)(+*int

    CS 1622 Lecture 10 3

    LL(1) Parsing Table Example(Cont.) Consider the [E, int] entry

    When current non-terminal is E and next input isint, use production E T X

    This production can generate an int in the firstplace

    Consider the [Y,+] entry When current non-terminal is Y and current token

    is +, get rid of Y Y can be followed by + only in a derivation in

    which Y

  • 2

    CS 1622 Lecture 10 4

    LL(1) Parsing Tables. Errors Blank entries indicate error situations

    Consider the [E,*] entry There is no way to derive a string starting

    with * from non-terminal E

    CS 1622 Lecture 10 5

    Using Parsing Tables Method similar to recursive descent, except

    For each non-terminal S We look at the next token a And chose the production shown at [S,a]

    We use a stack to keep track of pending non-terminals - instead of procedure stack!

    We reject when we encounter an error state We accept when we encounter end-of-input

    CS 1622 Lecture 10 6

    LL(1) Parsing Algorithminitialize stack = and nextrepeat case stack of : if T[X,*next] = Y1Yn then stack ; else error (); : if t == *next ++ then stack ; else error ();until stack == < >

  • 3

    CS 1622 Lecture 10 7

    LL(1) Parsing ExampleStack Input ActionE $ int * int $ T XT X $ int * int $ int Yint Y X $ int * int $ terminalY X $ * int $ * T* T X $ * int $ terminalT X $ int $ int Yint Y X $ int $ terminalY X $ $ X $ $ $ $ ACCEPT

    CS 1622 Lecture 10 8

    Constructing Parsing Tables LL(1) languages are those defined by a

    parsing table for the LL(1) algorithm No table entry can be multiply defined We want to generate parsing tables

    from CFG

    CS 1622 Lecture 10 9

    Constructing Parsing Tables(Cont.) If A , where in the line of A we place ? In the column of t where t can start a string

    derived from * t We say that t First()

    In the column of t if is and t can follow anA S * A t We say t Follow(A)

  • 4

    CS 1622 Lecture 10 10

    Computing First SetsDefinition First(X) = { t | X * t} { | X * }Algorithm: First(t) = { t } First(X) if X is a production First(X) if X A1 An

    and First(Ai) for 1 i n First() First(X) if X A1 An

    and First(Ai) for 1 i n

    CS 1622 Lecture 10 11

    First Sets. Example Recall the grammar

    E T X X + E | T ( E ) | int Y Y * T |

    First sets First( ( ) = { ( } First( T ) = {int, ( } First( ) ) = { ) } First( E ) = {int, ( } First( int) = { int } First( X ) = {+, } First( + ) = { + } First( Y ) = {*, } First( * ) = { * }

    CS 1622 Lecture 10 12

    Computing Follow Sets forNonterminals Follow(S) needed for productions that

    generate the empty string Definition: Follow(X) = { t | S * X t }

    Note that CAN NEVER BE IN FOLLOW(X)!! Intuition

    If X A B then First(B) Follow(A) and Follow(X) Follow(B) Also if B * then Follow(X) Follow(A) If S is the start symbol then $ Follow(S)

  • 5

    CS 1622 Lecture 10 13

    Computing Follow Sets (Cont.)Algorithm sketch: $ Follow(S) For each production A X

    First() - {} Follow(X)

    For each production A X where First()Follow(A) Follow(X)

    CS 1622 Lecture 10 14

    Follow Sets. Example Recall the grammar

    E T X X + E | T ( E ) | int Y Y * T |

    Follow sets

    Follow( E ) = {), $}Follow( X ) = {$, ) }Follow( T ) = {+, ) , $}Follow( Y ) = {+, ) , $}

    CS 1622 Lecture 10 15

    Constructing LL(1) ParsingTables Construct a parsing table T for CFG G

    For each production A in G do: For each terminal t First() do

    T[A, t] = If First(), for each t Follow(A) do

    T[A, t] = If First() and $ Follow(A) do

    T[A, $] =

  • 6

    CS 1622 Lecture 10 16

    Notes on LL(1) Parsing Tables If any entry is multiply defined then G is not

    LL(1). Possible reasons: If G is ambiguous If G is left recursive If G is not left-factored Grammar is not LL(1)

    Most programming language grammars arenot LL(1) Some can be made LL(1) though but others cant

    There are tools that build LL(1) tables E.g. LLGen

    CS 1622 Lecture 10 17

    Bottom-Up Parsing Bottom-up parsing is more general than top-

    down parsing And just as efficient Builds on ideas in top-down parsing

    Bottom-up is the preferred method in practice

    Concepts first, algorithms next time

    CS 1622 Lecture 10 18

    An Introductory Example Bottom-up parsers dont need left-factored

    grammars

    Hence we can revert to the naturalgrammar for our example:

    E T + E | TT int * T | int | (E)

    Consider the string: int * int + int

  • 7

    CS 1622 Lecture 10 19

    Bottom-Up Parsing

    CS 1622 Lecture 10 20

    The IdeaBottom-up parsing reduces a string to the

    start symbol by inverting productions:

    EE T + ET + EE TT + TT intT + intT int * Tint * T + intT intint * int + int

    CS 1622 Lecture 10 21

    Observation Read the productions in parse in

    reverse (i.e., from bottom to top) This is a rightmost derivation!

    EE T + ET + EE TT + TT intT + intT int * Tint * T + intT intint * int + int

  • 8

    CS 1622 Lecture 10 22

    Important Fact #1

    Important Fact #1 about bottom-upparsing:

    A bottom-up parser traces a rightmostderivation in reverse

    CS 1622 Lecture 10 23

    A Bottom-up Parse

    ET + ET + TT + intint * T + intint * int + int E

    T E

    + int*int

    T

    int

    T

    CS 1622 Lecture 10 24

    A Bottom-up Parse in Detail(1)

    + int*int int

    int * int + int

  • 9

    CS 1622 Lecture 10 25

    A Bottom-up Parse in Detail(2)

    int * T + intint * int + int

    + int*int int

    T

    CS 1622 Lecture 10 26

    A Bottom-up Parse in Detail(3)

    T + intint * T + intint * int + int

    T

    + int*int int

    T

    CS 1622 Lecture 10 27

    A Bottom-up Parse in Detail(4)

    T + TT + intint * T + intint * int + int

    T

    + int*int

    T

    int

    T

  • 10

    CS 1622 Lecture 10 28

    A Bottom-up Parse in Detail(5)

    T + ET + TT + intint * T + intint * int + int

    T E

    + int*int

    T

    int

    T

    CS 1622 Lecture 10 29

    A Bottom-up Parse in Detail(6)

    ET + ET + TT + intint * T + intint * int + int E

    T E

    + int*int

    T

    int

    T

    CS 1622 Lecture 10 30

    A Trivial Bottom-Up ParsingAlgorithmLet I = input string

    repeatpick a non-empty substring of I

    where X is a productionif no such , backtrackreplace one by X in I

    until I = S (the start symbol) or allpossibilities are exhausted

  • 11

    CS 1622 Lecture 10 31

    Questions Does this algorithm terminate?

    How fast is the algorithm?

    Does the algorithm handle all cases?

    How do we choose the substring to reduce ateach step?

    CS 1622 Lecture 10 32

    Where Do Reductions HappenImportant Fact #1 has an interesting

    consequence: Let be a step of a bottom-up parse Assume the next reduction is by X Then is a string of terminals

    Why? Because X is a step in aright-most derivation

    CS 1622 Lecture 10 33

    Notation Idea: Split string into two substrings

    Right substring is as yet unexamined by parsing(a string of terminals)

    Left substring has terminals and non-terminals

    The dividing point is marked by a | The | is not part of the string

    Initially, all input is unexamined |x1x2 . . . xn

  • 12

    CS 1622 Lecture 10 34

    Shift-Reduce ParsingBottom-up parsing uses only two kinds of

    actions:

    Shift

    Reduce

    CS 1622 Lecture 10 35

    Shift Shift: Move | one place to the right

    Shifts a terminal to the left string

    ABC|xyz ABCx|yz

    CS 1622 Lecture 10 36

    Reduce Apply an inverse production at the right

    end of the left string If A xy is a production, then

    Cbxy|ijk CbA|ijk

  • 13

    CS 1622 Lecture 10 37

    The Example with ReductionsOnly

    reduce T intT + int |

    E |reduce E T + ET + E |reduce E TT + T |

    reduce T int * Tint * T | + intreduce T intint * int | + int

    CS 1622 Lecture 10 38

    The Example with Shift-Reduce Parsing

    reduce T intT + int |shiftT + | int

    shiftint | * int + intshiftint * | int + int

    shift|int * int + int

    E |reduce E T + ET + E |reduce E TT + T |

    shiftT | + intreduce T int * Tint * T | + intreduce T intint * int | + int

    CS 1622 Lecture 10 39

    A Shift-Reduce Parse inDetail (1)

    + int*int int

    |int * int + int

  • 14

    CS 1622 Lecture 10 40

    A Shift-Reduce Parse inDetail (2)

    + int*int int

    int | * int + int|int * int + int

    CS 1622 Lecture 10 41

    A Shift-Reduce Parse inDetail (3)

    + int*int int

    int | * int + intint * | int + int

    |int * int + int

    CS 1622 Lecture 10 42

    A Shift-Reduce Parse inDetail (4)

    + int*int int

    int | * int + intint * | int + int

    |int * int + int

    int * int | + int

  • 15

    CS 1622 Lecture 10 43

    A Shift-Reduce Parse inDetail (5)

    + int*int int

    T

    int | * int + intint * | int + int

    |int * int + int

    int * T | + intint * int | + int

    CS 1622 Lecture 10 44

    A Shift-Reduce Parse inDetail (6)

    T

    + int*int int

    T

    int | * int + intint * | int + int

    |int * int + int

    T | + intint * T | + intint * int | + int

    CS 1622 Lecture 10 45

    A Shift-Reduce Parse inDetail (7)

    T

    + int*int int

    T

    T + | int

    int | * int + intint * | int + int

    |int * int + int

    T | + intint * T | + intint * int | + int

  • 16

    CS 1622 Lecture 10 46

    A Shift-Reduce Parse inDetail (8)

    T

    + int*int int

    T

    T + int |T + | int

    int | * int + intint * | int + int

    |int * int + int

    T | + intint * T | + intint * int | + int

    CS 1622 Lecture 10 47

    A Shift-Reduce Parse inDetail (9)

    T

    + int*int

    T

    int

    T

    T + int |T + | int

    int | * int + intint * | int + int

    |int * int + int

    T + T |

    T | + intint * T | + intint * int | + int

    CS 1622 Lecture 10 48

    A Shift-Reduce Parse inDetail (10)

    T E

    + int*int

    T

    int

    T

    T + int |T + | int

    int | * int + intint * | int + int

    |int * int + int

    T + E |T + T |

    T | + intint * T | + intint * int | + int

  • 17

    CS 1622 Lecture 10 49

    A Shift-Reduce Parse inDetail (11)

    E

    T E

    + int*int

    T

    int

    T

    T + int |T + | int

    int | * int + intint * | int + int

    |int * int + int

    E |T + E |T + T |

    T | + intint * T | + intint * int | + int

    CS 1622 Lecture 10 50

    The Stack Left string can be implemented by a stack

    Top of the stack is the |

    Shift pushes a terminal on the stack

    Reduce pops 0 or more symbols off of thestack (production rhs) and pushes a non-terminal on the stack (production lhs)

    CS 1622 Lecture 10 51

    Key Issue (will be resolved byalgorithms) How do we decide when to shift or

    reduce? Consider step int | * int + int We could reduce by T int giving T | * int

    + int A fatal mistake: No way to reduce to the

    start symbol E

  • 18

    CS 1622 Lecture 10 52

    Conflicts Generic shift-reduce strategy:

    If there is a handle on top of the stack, reduce Otherwise, shift

    But what if there is a choice? If it is legal to shift or reduce, there is a

    shift-reduce conflict If it is legal to reduce by two different productions,

    there is a reduce-reduce conflict

    CS 1622 Lecture 10 53

    Source of Conflicts Ambiguous grammars always cause

    conflicts

    But beware, so do many non-ambiguous grammars

    CS 1622 Lecture 10 54

    Conflict ExampleConsider our favorite ambiguous

    grammar:

    int|(E)|E * E|

    E + EE

  • 19

    CS 1622 Lecture 10 55

    One Shift-Reduce Parse

    E |reduce E E + EE + E |

    . . .. . .reduce E E * EE * E | + int

    shift|int * int + int

    reduce E intE + int|shiftE + | intshiftE | + int

    CS 1622 Lecture 10 56

    Another Shift-Reduce Parse

    E |reduce E E * EE * E |

    . . .. . .shiftE * E | + int

    shift|int * int + int

    reduce E E + EE * E + E|reduce E intE * E + int |shiftE * E + | int

    CS 1622 Lecture 10 57

    Example Notes In the second step E * E | + int we can either

    shift or reduce by E E * E

    Choice determines associativity of + and * As noted previously, grammar can be

    rewritten to enforce precedence Precedence declarations are an alternative

  • 20

    CS 1622 Lecture 10 58

    Precedence DeclarationsRevisited Precedence declarations cause shift-reduce

    parsers to resolve conflicts in certain ways Declaring * has greater precedence than +

    causes parser to reduce at E * E | + int More precisely, precedence declaration is

    used to resolve conflict between reducing a *and shifting a +

    CS 1622 Lecture 10 59

    Precedence DeclarationsRevisited (Cont.) The term precedence declaration is

    misleading These declarations do not define

    precedence; they define conflictresolutions Not quite the same thing!