Parsing - acm.sjtu.edu.cn
Parsing
Xiao Jia
2013/03/15
• RE for balanced braces?
– {}
– {{}}
– {{{{}}}}
– …
Grammar for Balanced Braces
B -> ε
B -> ‘{’ B ‘}’
-- OR --
B -> ε | ‘{’ B ‘}’
L[[ B ]] = ?
• production rule: each line of the grammar, written head -> body
• head: the nonterminal on the left of the arrow
• body: the sequence of symbols on the right of the arrow
• terminal: ‘{’ and ‘}’
• nonterminal: B
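The grammar for balanced braces maps almost directly onto a recursive recognizer. The following is a minimal Python sketch, not something taken from the slides: the names `parse_B` and `is_balanced` are hypothetical, and the input is assumed to be a plain string. Each call picks B -> ‘{’ B ‘}’ when it sees an opening brace and B -> ε otherwise.

```python
def parse_B(s: str, i: int = 0) -> int:
    """Match one B starting at index i; return the index just past it."""
    if i < len(s) and s[i] == "{":          # B -> '{' B '}'
        j = parse_B(s, i + 1)               # the inner B
        if j < len(s) and s[j] == "}":
            return j + 1
        raise SyntaxError(f"expected '}}' at position {j}")
    return i                                # B -> ε (consume nothing)


def is_balanced(s: str) -> bool:
    """True iff the whole string can be derived from B."""
    try:
        return parse_B(s) == len(s)
    except SyntaxError:
        return False


if __name__ == "__main__":
    for text in ["", "{}", "{{{{}}}}", "{{}", "{}{}"]:
        print(repr(text), is_balanced(text))
```

The strings accepted are exactly n opening braces followed by n closing braces, for n ≥ 0. Note that `"{}{}"` is rejected: this grammar generates nested braces only, not concatenations.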
Example
list -> list ‘+’ digit
list -> list ‘-’ digit
list -> digit
digit -> ‘0’ | ‘1’ | ‘2’ | … | ‘9’
Sample strings:
9-5+2
3-1
7
Terminals: ‘+’ ‘-’ ‘0’ ‘1’ ‘2’ … ‘9’
Nonterminals: list, digit
Start symbol: list
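A recognizer for the `list` grammar can be sketched as follows. This is an illustration under assumptions, not the lecture's code: the function name `parse_list` is hypothetical, and the left-recursive rules are handled with a loop (a common substitute, since a naive recursive function for `list -> list ‘+’ digit` would call itself forever). The loop also evaluates the expression, grouping to the left as the grammar does.

```python
DIGITS = set("0123456789")


def parse_list(s: str) -> int:
    """Recognize s as a `list` and return its value, grouping to the left."""
    if not s or s[0] not in DIGITS:
        raise SyntaxError("expected a digit at position 0")
    value = int(s[0])                        # list -> digit
    i = 1
    while i < len(s):                        # list -> list ('+' | '-') digit
        op = s[i]
        if op not in "+-":
            raise SyntaxError(f"expected '+' or '-' at position {i}")
        if i + 1 >= len(s) or s[i + 1] not in DIGITS:
            raise SyntaxError(f"expected a digit at position {i + 1}")
        digit = int(s[i + 1])
        value = value + digit if op == "+" else value - digit
        i += 2
    return value


if __name__ == "__main__":
    for text in ["9-5+2", "3-1", "7"]:
        print(text, "=", parse_list(text))
```

Because the grammar is left-recursive, `9-5+2` is read as `(9-5)+2`, which is why the running `value` is updated from left to right.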
Context-free Grammar (CFG)
1. a set of terminals (tokens) T
2. a set of nonterminals N
3. a set of production rules P
4. a start symbol S ∈ N
G = <T, N, P, S>
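The 4-tuple can be written down directly as a data structure. The sketch below is one possible encoding and an assumption of ours: the class `Grammar`, its field names, and the `check` method are hypothetical. The disjointness test for T and N is the standard requirement, which the slide does not state explicitly. The instance at the end encodes the `list`/`digit` example grammar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # T
    nonterminals: frozenset   # N
    productions: tuple        # P: pairs (head, body), body is a tuple of symbols
    start: str                # S

    def check(self) -> None:
        """Raise ValueError if the 4-tuple is not a well-formed CFG."""
        if self.terminals & self.nonterminals:
            raise ValueError("T and N must be disjoint")
        if self.start not in self.nonterminals:
            raise ValueError("S must be a nonterminal")
        symbols = self.terminals | self.nonterminals
        for head, body in self.productions:
            if head not in self.nonterminals:
                raise ValueError(f"head {head!r} is not a nonterminal")
            for sym in body:
                if sym not in symbols:
                    raise ValueError(f"unknown symbol {sym!r} in body")


digits = tuple("0123456789")
list_grammar = Grammar(
    terminals=frozenset(("+", "-") + digits),
    nonterminals=frozenset({"list", "digit"}),
    productions=(
        ("list", ("list", "+", "digit")),
        ("list", ("list", "-", "digit")),
        ("list", ("digit",)),
    ) + tuple(("digit", (d,)) for d in digits),
    start="list",
)
list_grammar.check()   # passes silently for a well-formed grammar
```

An empty body `()` would stand for ε in this encoding.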
Languages
Chomsky Hierarchy, with Type-0 ⊃ Type-1 ⊃ Type-2 ⊃ Type-3:
• Undecidable
• Type-0: Recursively enumerable
• Type-1: Context-sensitive
• Type-2: Context-free (Parsing)
• Type-3: Regular (Lexical Analysis)
Example
{ a^n b^n | n ≥ 1 } (context-free, not regular)
ab
aabb
…
{ a^n b^n c^n | n ≥ 1 } (context-sensitive, not context-free)
abc
aabbcc
…
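The difference between the two languages shows up when writing recognizers. In the sketch below (function names are hypothetical), the first function follows the standard context-free grammar S -> ‘a’ S ‘b’ | ‘a’ ‘b’, which is our assumption and does not appear on the slide. The second language has no context-free grammar, so the sketch falls back on plain counting.

```python
def is_anbn(s: str) -> bool:
    """Recognize { a^n b^n | n >= 1 } via S -> 'a' S 'b' | 'a' 'b'."""
    if s == "ab":                            # S -> 'a' 'b'
        return True
    # S -> 'a' S 'b': strip one matching pair and recurse on the middle
    return len(s) > 2 and s[0] == "a" and s[-1] == "b" and is_anbn(s[1:-1])


def is_anbncn(s: str) -> bool:
    """Recognize { a^n b^n c^n | n >= 1 } by counting, not by a CFG."""
    n = len(s) // 3
    return n >= 1 and s == "a" * n + "b" * n + "c" * n


if __name__ == "__main__":
    print(is_anbn("aabb"), is_anbn("aab"))
    print(is_anbncn("aabbcc"), is_anbncn("aabbc"))
```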
Languages
Chomsky Hierarchy, with Type-0 ⊃ Type-1 ⊃ Type-2 ⊃ Type-3:
• Undecidable
• Type-0: Recursively enumerable (Turing machine)
– recursive (between Type-0 and Type-1)
• Type-1: Context-sensitive (Semantic Analysis)
• Type-2: Context-free (Parsing)
– visibly pushdown (between Type-2 and Type-3)
• Type-3: Regular (Lexical Analysis)
Example
Grammar:
1. S -> S + S
2. S -> 1
3. S -> a
String:
1 + 1 + a
Derivation (number of the applied rule in parentheses):
S -> S + S      (1)
  -> 1 + S      (2)
  -> 1 + S + S  (1)
  -> 1 + 1 + S  (2)
  -> 1 + 1 + a  (3)
A derivation is a sequence of rule applications that transforms the start symbol into the string
How to determine the next nonterminal to rewrite
• Leftmost derivation:
– always the leftmost nonterminal
• Rightmost derivation:
– always the rightmost nonterminal
1. S -> S + S
2. S -> 1
3. S -> a
1 + 1 + a
Leftmost derivation:
S -> S + S      (1)
  -> 1 + S      (2)
  -> 1 + S + S  (1)
  -> 1 + 1 + S  (2)
  -> 1 + 1 + a  (3)
Rightmost derivation:
S -> S + S      (1)
  -> S + a      (3)
  -> S + S + a  (1)
  -> S + 1 + a  (2)
  -> 1 + 1 + a  (2)
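Both derivations can be replayed mechanically. The sketch below is an illustration only: `RULES` and `derive` are hypothetical names. It applies a list of rule numbers in order, each time rewriting either the leftmost or the rightmost S, and records every intermediate form.

```python
RULES = {1: ["S", "+", "S"], 2: ["1"], 3: ["a"]}   # rule bodies; every head is S


def derive(rule_numbers, leftmost=True):
    """Apply the numbered rules in order, always rewriting the leftmost
    (or rightmost) S. Returns every sentential form as a string."""
    form = ["S"]
    steps = [" ".join(form)]
    for n in rule_numbers:
        positions = [i for i, sym in enumerate(form) if sym == "S"]
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = RULES[n]             # replace that S by the rule body
        steps.append(" ".join(form))
    return steps


if __name__ == "__main__":
    print(" -> ".join(derive([1, 2, 1, 2, 3])))
    print(" -> ".join(derive([1, 3, 1, 2, 2], leftmost=False)))
```

The two rule sequences are the ones listed for the leftmost and rightmost derivations, and both runs end in `1 + 1 + a`.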
Ambiguity
• G = a grammar
• L = the language generated by G
• G is ambiguous if some string w ∈ L has two or more parse trees (equivalently, two or more leftmost derivations)
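For the grammar S -> S + S | 1 | a from the derivation example, ambiguity can be checked by counting parse trees. The sketch below is a small hand-written counter for that one grammar, not a general algorithm; the function name `count_parse_trees` is hypothetical, and the input is assumed to be already split into tokens.

```python
from functools import lru_cache


def count_parse_trees(tokens: tuple) -> int:
    """Number of distinct parse trees deriving `tokens` from S."""

    @lru_cache(maxsize=None)
    def count(lo: int, hi: int) -> int:      # trees for tokens[lo:hi]
        if hi - lo == 1:                     # S -> 1  or  S -> a
            return 1 if tokens[lo] in ("1", "a") else 0
        total = 0
        for k in range(lo + 1, hi - 1):      # S -> S + S, with '+' at index k
            if tokens[k] == "+":
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))


if __name__ == "__main__":
    print(count_parse_trees(tuple("1 + 1 + a".split())))   # 2
```

The string `1 + 1 + a` has two parse trees, grouping either `(1 + 1) + a` or `1 + (1 + a)`, so the grammar is ambiguous.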
Derivations & Parse Trees
• A derivation imposes a hierarchical structure on the string that is derived
S -> S + S      (1)
  -> 1 + S      (2)
  -> 1 + S + S  (1)
  -> 1 + 1 + S  (2)
  -> 1 + 1 + a  (3)
The resulting parse tree -- OR -- concrete syntax tree:
        S
      / | \
     S  +  S
     |   / | \
     1  S  +  S
        |     |
        1     a
Parse Tree vs. Syntax Tree
• Parse tree, or concrete syntax tree
• Syntax tree, or abstract syntax tree
Example: 2 * (3 + 4)
Parse tree (keeps syntactic details such as the parentheses):
          E
        / | \
       T  *  T
       |   / | \
       2  (  E  )
           / | \
          T  +  T
          |     |
          3     4
Abstract syntax tree:
     OP(*)
     /   \
NUM(2)   OP(+)
         /   \
    NUM(3)   NUM(4)
Parse Tree vs. Syntax Tree
• Nodes in a parse tree exactly correspond to terminals and nonterminals in the grammar
• Nodes in an AST correspond to semantic structures in the meaning of the language
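The relationship between the two trees can be sketched as a conversion. This is an illustration only: it assumes the grammar E -> T ‘+’ T | T ‘*’ T, T -> integer | ‘(’ E ‘)’ from the Semantic Actions slides. The tuple encoding and the name `to_ast` are hypothetical.

```python
# Parse tree of 2 * (3 + 4): one node per grammar symbol, as (symbol, *children).
parse_tree = ("E",
              ("T", "2"),
              "*",
              ("T", "(", ("E", ("T", "3"), "+", ("T", "4")), ")"))


def to_ast(node):
    """Drop syntactic details and keep only operators and numbers."""
    symbol, *children = node
    if symbol == "E":                        # E -> T op T
        left, op, right = children
        return ("OP", op, to_ast(left), to_ast(right))
    if len(children) == 3:                   # T -> '(' E ')': parentheses vanish
        return to_ast(children[1])
    return ("NUM", int(children[0]))         # T -> integer


if __name__ == "__main__":
    print(to_ast(parse_tree))
```

The parse tree has a node for every terminal and nonterminal, including the parentheses. The AST keeps only what affects the meaning: two operators and three numbers.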
[Diagram: Parser, Semantic Analyzer; flow labels: token stream, parse tree, AST, ?]
Semantic Actions
E -> T '+' T      { $$ = new Op('+', $1, $3); }
   | T '*' T      { $$ = new Op('*', $1, $3); }
T -> integer      { $$ = new Num($1); }
   | '(' E ')'    { $$ = $2; }
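The effect of the semantic actions is that the AST is built while parsing, without ever materializing a parse tree. The sketch below shows the same idea in Python. It swaps in a hand-written recursive-descent parser for the yacc-style generated parser the notation suggests. `Num` and `Op` mirror the names in the actions, while `parse`, `take`, `parse_E`, `parse_T` and the regex tokenizer are our own assumptions.

```python
import re
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class Op:
    op: str
    left: object
    right: object


def parse(text: str):
    # Simplistic tokenizer: anything other than digits, + * ( ) is skipped.
    tokens = re.findall(r"\d+|[+*()]", text)
    pos = 0

    def take(expected=None):
        nonlocal pos
        if pos >= len(tokens) or (expected and tokens[pos] != expected):
            raise SyntaxError(f"unexpected input at token {pos}")
        pos += 1
        return tokens[pos - 1]

    def parse_E():                           # E -> T '+' T | T '*' T
        left = parse_T()
        op = take()
        if op not in ("+", "*"):
            raise SyntaxError(f"expected '+' or '*', got {op!r}")
        right = parse_T()
        return Op(op, left, right)           # $$ = new Op(op, $1, $3)

    def parse_T():                           # T -> integer | '(' E ')'
        tok = take()
        if tok == "(":
            inner = parse_E()
            take(")")
            return inner                     # $$ = $2
        if tok.isdigit():
            return Num(int(tok))             # $$ = new Num($1)
        raise SyntaxError(f"expected integer or '(', got {tok!r}")

    tree = parse_E()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree


if __name__ == "__main__":
    print(parse("2 * (3 + 4)"))
```

Each `return` plays the role of assigning `$$`: the parenthesized case simply passes the inner result up, which is how the parentheses disappear from the AST.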
Questions?
• Readings:
– “Dragon Book”, Ch.1 – Ch.4
– Parsing Techniques: A Practical Guide, by Dick Grune and Ceriel Jacobs