COMP3190: Principle of Programming Languages DFA and its equivalent, scanner.


COMP3190: Principle of Programming Languages

DFA and its equivalent, scanner


Outline

DFA & NFA
» DFA
» NFA
» NFA → DFA
» Minimize DFA
Regular expression
Regular languages
Scanner


Example of DFA

[Figure: state diagram with states q1 and q2; edges labeled 0 and 1 as listed in the table.]

δ     0     1
q1    q1    q2
q2    q1    q2


Deterministic Finite Automata (DFA)

A DFA is a 5-tuple (Q, Σ, δ, q0, F):
» Q: finite set of states
» Σ: finite set of "letters" (alphabet)
» δ: Q × Σ → Q (transition function)
» q0: start state (in Q)
» F: set of accept states (subset of Q)

Acceptance: an input string is accepted if the automaton is in an accept (final) state once the whole string has been consumed.
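
The definition can be tried out with a small simulator. The sketch below encodes the two-state transition table from the first example. The slides' diagram does not survive in this transcript, so the choice of q1 as start state and q2 as the only accept state is an assumption made for illustration, as are the class and method names.

```java
import java.util.Map;
import java.util.Set;

/** Minimal DFA simulator for the two-state example (start and accept states are assumed). */
final class DfaDemo {
    // Transition function delta: Q x Sigma -> Q, stored as state -> (symbol -> next state).
    static final Map<String, Map<Character, String>> DELTA = Map.of(
        "q1", Map.of('0', "q1", '1', "q2"),
        "q2", Map.of('0', "q1", '1', "q2"));

    static final String START = "q1";               // assumption: q1 is the start state
    static final Set<String> ACCEPT = Set.of("q2"); // assumption: q2 is the only accept state

    /** Returns true if the DFA is in an accept state after consuming the whole input. */
    static boolean accepts(String input) {
        String state = START;
        for (char c : input.toCharArray()) {
            Map<Character, String> row = DELTA.get(state);
            if (!row.containsKey(c)) {
                return false; // symbol outside the alphabet: reject
            }
            state = row.get(c);
        }
        return ACCEPT.contains(state);
    }

    public static void main(String[] args) {
        System.out.println(accepts("0101"));
        System.out.println(accepts("10"));
    }
}
```

Under these assumptions the automaton accepts exactly the strings that end in 1, because every 1 leads to q2 and every 0 leads back to q1.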


Another Example of a DFA

[Figure: state diagram with states S, q1, q2, r1, r2 and edges labeled a and b.]


Non-deterministic Finite Automata (NFA)

The transition function is different: δ: Q × Σε → P(Q)

» P(Q) is the power set of Q (the set of all subsets of Q).
» Σε is the union of Σ and the special symbol ε (denoting the empty string).

A string is accepted if at least one path consumes the whole input and leads to an accept state.


Example of an NFA

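[Figure: state diagram with states q1, q2, q3, q4. Edges: q1 to q1 on 0, 1; q1 to q2 on 1; q2 to q3 on 0, ε; q3 to q4 on 1; q4 to q4 on 0, 1.]

δ     0       1           ε
q1    {q1}    {q1, q2}    ∅
q2    {q3}    ∅           {q3}
q3    ∅       {q4}        ∅
q4    {q4}    {q4}        ∅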

What strings does this NFA accept?
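
One way to explore the question is to simulate the NFA by tracking the set of all states it could be in. The sketch below encodes the transition table of this example. It assumes that q1 is the start state and q4 is the only accept state (the diagram is not preserved in this transcript), and the class and method names are illustrative.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Simulates the four-state example NFA by tracking every state it could be in. */
final class NfaDemo {
    static final char EPS = '\u03b5'; // stands for epsilon; never appears in the input
    static final int START = 1;       // assumption: q1 is the start state
    static final int ACCEPT = 4;      // assumption: q4 is the only accept state

    // delta from the table: state -> (symbol -> set of next states); missing entries are empty.
    static final Map<Integer, Map<Character, Set<Integer>>> DELTA = Map.of(
        1, Map.of('0', Set.of(1), '1', Set.of(1, 2)),
        2, Map.of('0', Set.of(3), EPS, Set.of(3)),
        3, Map.of('1', Set.of(4)),
        4, Map.of('0', Set.of(4), '1', Set.of(4)));

    static Set<Integer> step(int state, char symbol) {
        return DELTA.getOrDefault(state, Map.of()).getOrDefault(symbol, Set.of());
    }

    /** All states reachable from the given states using only epsilon-transitions. */
    static Set<Integer> closure(Set<Integer> states) {
        Set<Integer> result = new TreeSet<>(states);
        Deque<Integer> work = new ArrayDeque<>(states);
        while (!work.isEmpty()) {
            int s = work.pop();
            for (int t : step(s, EPS)) {
                if (result.add(t)) {
                    work.push(t);
                }
            }
        }
        return result;
    }

    /** True if at least one path consumes the whole input and ends in the accept state. */
    static boolean accepts(String input) {
        Set<Integer> current = closure(Set.of(START));
        for (char c : input.toCharArray()) {
            Set<Integer> next = new TreeSet<>();
            for (int s : current) {
                next.addAll(step(s, c));
            }
            current = closure(next);
        }
        return current.contains(ACCEPT);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"101", "11", "100", "010"}) {
            System.out.println(s + " -> " + accepts(s));
        }
    }
}
```

Trying a handful of short strings and looking for what the accepted ones have in common is a reasonable way to form a conjecture before proving it.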


Converting an NFA to a DFA

For a set of states S, ε-closure(S) is the set of states that can be reached from S without consuming any input.

For a set of states S, Sc is the set of states that can be reached from S by consuming input symbol c.

Each set of NFA states corresponds to one DFA state (hence at most 2^n DFA states for an NFA with n states).
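
The construction can be sketched as a worklist algorithm: start from the ε-closure of the NFA start state and, for each new set and each symbol, take the ε-closure of the states reached on that symbol. The sketch below applies this to the four-state NFA from the earlier example, again assuming q1 (written 1) is its start state. Names such as `SubsetConstruction` and `build` are illustrative.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Subset construction: each reachable set of NFA states becomes one DFA state. */
final class SubsetConstruction {
    static final char EPS = '\u03b5'; // stands for epsilon
    static final char[] ALPHABET = {'0', '1'};

    // The four-state example NFA: state -> (symbol -> set of next states).
    static final Map<Integer, Map<Character, Set<Integer>>> NFA = Map.of(
        1, Map.of('0', Set.of(1), '1', Set.of(1, 2)),
        2, Map.of('0', Set.of(3), EPS, Set.of(3)),
        3, Map.of('1', Set.of(4)),
        4, Map.of('0', Set.of(4), '1', Set.of(4)));

    /** States reachable from the given set by consuming exactly the symbol c. */
    static Set<Integer> move(Set<Integer> states, char c) {
        Set<Integer> result = new TreeSet<>();
        for (int s : states) {
            result.addAll(NFA.getOrDefault(s, Map.of()).getOrDefault(c, Set.of()));
        }
        return result;
    }

    /** States reachable from the given set without consuming any input. */
    static Set<Integer> closure(Set<Integer> states) {
        Set<Integer> result = new TreeSet<>(states);
        Deque<Integer> work = new ArrayDeque<>(states);
        while (!work.isEmpty()) {
            for (int t : move(Set.of(work.pop()), EPS)) {
                if (result.add(t)) {
                    work.push(t);
                }
            }
        }
        return result;
    }

    /** Returns the DFA table: DFA state (a set of NFA states) -> symbol -> DFA state. */
    static Map<Set<Integer>, Map<Character, Set<Integer>>> build(int nfaStart) {
        Map<Set<Integer>, Map<Character, Set<Integer>>> dfa = new LinkedHashMap<>();
        Deque<Set<Integer>> work = new ArrayDeque<>();
        work.add(closure(Set.of(nfaStart)));
        while (!work.isEmpty()) {
            Set<Integer> state = work.remove();
            if (dfa.containsKey(state)) {
                continue; // already expanded
            }
            Map<Character, Set<Integer>> row = new LinkedHashMap<>();
            for (char c : ALPHABET) {
                Set<Integer> target = closure(move(state, c));
                row.put(c, target);
                work.add(target);
            }
            dfa.put(state, row);
        }
        return dfa;
    }

    public static void main(String[] args) {
        build(1).forEach((state, row) -> System.out.println(state + " -> " + row));
    }
}
```

Only the subsets reachable from the start set are generated, which is why the result here has 6 DFA states rather than the worst-case 2^4 = 16.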


ε-closure({1}) = {1, 2} = I
Ia = ε-closure({5, 4, 3})
J = {5, 4, 3}
ε-closure(J) = ε-closure({5, 4, 3}) = {5, 4, 3, 6, 2, 7, 8}
Ja = {3}

[Figure: NFA with states 1 to 8 and edges labeled a and ε.]


I                  Ia                 Ib
{X,5,1}            {5,3,1}            {5,4,1}
{5,3,1}            {5,2,3,1,6,Y}      {5,4,1}
{5,4,1}            {5,3,1}            {5,2,4,1,6,Y}
{5,2,3,1,6,Y}      {5,2,3,1,6,Y}      {5,4,6,1,Y}
{5,4,6,1,Y}        {5,3,6,1,Y}        {5,2,4,1,6,Y}
{5,2,4,1,6,Y}      {5,3,6,1,Y}        {5,2,4,1,6,Y}
{5,3,6,1,Y}        {5,2,3,1,6,Y}      {5,4,6,1,Y}

[Figure: NFA with states X, 5, 1, 2, 3, 4, 6, Y and edges labeled a and b.]


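Renaming the subsets 0 to 6 in the order they appear in the table above gives:

I    a    b
0    1    2
1    3    2
2    1    5
3    3    4
4    6    5
5    6    5
6    3    4

[Figure: state diagram of the resulting DFA with states 0 to 6 and edges labeled a and b.]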


Exercise

[Figure: exercise automata over the alphabet {a, b}; state labels include 1 to 6 and A, B, C, and some edges are ε-transitions.]


Class Problem

[Figure: NFA with states 0 to 9 and edges labeled a, b, and ε.]

Convert this NFA to a DFA


State Minimization

The resulting DFA can be quite large.
» It may contain redundant or equivalent states.

[Figure: two DFAs, one with states 1 to 5 and one with states 1 to 3, with edges labeled a and b.]

Both DFAs accept b*ab*a.


Obtaining the minimal equivalent DFA

Initially there are two equivalence classes: accept states and non-accept states.

Search for an equivalence class C and an input letter a such that, with a as input, the states in C make transitions to states in k > 1 different equivalence classes.

Partition C into k classes accordingly.

Repeat until no class can be partitioned.
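
The refinement loop can be sketched in a few lines. The version below is a simplification that splits every class on every letter in each round (Moore-style) rather than picking one class and one letter at a time; both reach the same final partition. As input it uses the seven-state DFA from the subset construction example, with the subsets numbered 0 to 6 in table order and the subsets containing Y taken as accept states. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** DFA minimization by repeatedly splitting equivalence classes. */
final class MinimizeDfa {
    // NEXT[state][0] is the target on 'a', NEXT[state][1] the target on 'b'.
    // Row 2 follows the subset table: {5,4,1} on b goes to {5,2,4,1,6,Y}, the sixth subset.
    static final int[][] NEXT = {
        {1, 2}, {3, 2}, {1, 5}, {3, 4}, {6, 5}, {6, 5}, {3, 4}};
    // States 3 to 6 are the subsets that contain Y.
    static final boolean[] ACCEPT = {false, false, false, true, true, true, true};

    /** Returns classOf[state]: the equivalence class each state ends up in. */
    static int[] refine(int[][] next, boolean[] accept) {
        int n = next.length;
        int[] classOf = new int[n];
        for (int s = 0; s < n; s++) {
            classOf[s] = accept[s] ? 1 : 0; // initial split: accept vs. non-accept
        }
        int previousCount = 0;
        while (true) {
            // A state's signature is its own class plus the classes its transitions lead to.
            // States with different signatures must be separated.
            Map<List<Integer>, Integer> ids = new HashMap<>();
            int[] refined = new int[n];
            for (int s = 0; s < n; s++) {
                List<Integer> signature = new ArrayList<>();
                signature.add(classOf[s]);
                for (int target : next[s]) {
                    signature.add(classOf[target]);
                }
                Integer id = ids.get(signature);
                if (id == null) {
                    id = ids.size();
                    ids.put(signature, id);
                }
                refined[s] = id;
            }
            if (ids.size() == previousCount) {
                return refined; // no class was split in this round
            }
            previousCount = ids.size();
            classOf = refined;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(refine(NEXT, ACCEPT)));
    }
}
```

For this input the four accept states end up in a single class while states 0, 1, and 2 stay separate, so the minimal DFA has four states.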


Minimization Example

Split into two teams: ACCEPT vs. NONACCEPT.


Minimization Example

The 0-label doesn't split up any teams.


Minimization Example

The 1-label splits up the NONACCEPT team.


Minimization Example

No further splits. HALT!

The start team contains the original start state.


Minimization Example. End Result

The states of the minimal automaton are the remaining teams. Edges are consolidated across each team. Accept states are break-offs from the original ACCEPT team.


Minimization Example. Compare

100100101

10000


Class Exercise

[Figure: DFA with states a to h and edges labeled 0 and 1.]


Exercise

How can the following DFA be minimized?

[Figure: two DFAs, one with states 1 to 5 and one with states 1 to 3, with edges labeled a and b.]

Both DFAs accept b*ab*a.


Regular Expressions

R is a regular expression if R is
» "a" for some a in Σ,
» ε (the empty string),
» ∅ (the empty language),
» the union of two regular expressions,
» the concatenation of two regular expressions, or
» R1* (Kleene closure: zero or more repetitions of R1).


Examples of Regular Expressions

{0, 1}* 0             all strings that end in 0
{0, 1} 0*             strings that start with 1 or 0, followed by zero or more 0s
{0, 1}*               all strings
{0^n 1^n : n >= 0}    not a regular language, so no regular expression describes it
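
These examples can be checked with Java's built-in `String.matches`. In the sketch below the set {0, 1} is written as the character class `[01]`, which is Java's notation rather than the slides'. The last method is a plain counting check for 0^n 1^n, included to contrast with the three regular cases. All method names are illustrative.

```java
/** The example expressions, written in Java regex syntax. */
final class RegexExamples {
    /** {0, 1}* 0 : all strings that end in 0. */
    static boolean endsInZero(String s) {
        return s.matches("[01]*0");
    }

    /** {0, 1} 0* : one symbol followed by zero or more 0s. */
    static boolean symbolThenZeros(String s) {
        return s.matches("[01]0*");
    }

    /** {0, 1}* : all strings over the alphabet. */
    static boolean anyBinaryString(String s) {
        return s.matches("[01]*");
    }

    /** 0^n 1^n needs counting, so it is checked by hand instead of by a regular expression. */
    static boolean zerosThenSameNumberOfOnes(String s) {
        int n = s.length() / 2;
        return s.length() % 2 == 0 && s.equals("0".repeat(n) + "1".repeat(n));
    }

    public static void main(String[] args) {
        System.out.println(endsInZero("1010"));
        System.out.println(symbolThenZeros("1000"));
        System.out.println(anyBinaryString("0110"));
        System.out.println(zerosThenSameNumberOfOnes("0011"));
    }
}
```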


Regular Expressions in Java

Ex: pattern match. Is text in the set described by the pattern?

public class RE {
    public static void main(String[] args) {
        String pattern = args[0];
        String text = args[1];
        System.out.println(text.matches(pattern));
    }
}

% java RE "..oo..oo." bloodroot
true

% java RE "[$_A-Za-z][$_A-Za-z0-9]*" ident123
true

% java RE "[a-z]+@([a-z]+\.)+(edu|com)" [email protected]


Regular Expression Notation in Java

a           an ordinary letter
ε           the empty string
M | N       choosing from M or N
MN          concatenation of M and N
M*          zero or more times (Kleene star)
M+          one or more times
M?          zero or one occurrence
[a-zA-Z]    character set alternation (choice)
.           period stands for any single character except newline


Converting a regular expression to an NFA

Empty string

Single character

union operator

Concatenation

Kleene closure
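
One common way to realize these five cases is Thompson's construction, where each case builds a small NFA fragment with one start and one accept state and glues fragments together with ε-edges. The diagrams for this slide are not preserved, so the sketch below follows the textbook construction and may differ in detail from the slides. It builds an NFA for (1*01*0)*1* by hand (no regex parser) and simulates it. All names are illustrative.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Thompson-style NFA fragments; a fragment is an int pair {start, accept}. */
final class Thompson {
    // One edge list per state; each edge is {label, target}, and label 0 means epsilon.
    static final List<List<int[]>> EDGES = new ArrayList<>();

    static int newState() {
        EDGES.add(new ArrayList<>());
        return EDGES.size() - 1;
    }

    static void edge(int from, int label, int to) {
        EDGES.get(from).add(new int[] {label, to});
    }

    static int[] empty() {                 // empty string
        int s = newState(), f = newState();
        edge(s, 0, f);
        return new int[] {s, f};
    }

    static int[] symbol(char c) {          // single character
        int s = newState(), f = newState();
        edge(s, c, f);
        return new int[] {s, f};
    }

    static int[] union(int[] m, int[] n) { // M | N
        int s = newState(), f = newState();
        edge(s, 0, m[0]);
        edge(s, 0, n[0]);
        edge(m[1], 0, f);
        edge(n[1], 0, f);
        return new int[] {s, f};
    }

    static int[] concat(int[] m, int[] n) { // MN
        edge(m[1], 0, n[0]);
        return new int[] {m[0], n[1]};
    }

    static int[] star(int[] m) {            // M*
        int s = newState(), f = newState();
        edge(s, 0, m[0]);
        edge(s, 0, f);
        edge(m[1], 0, m[0]);
        edge(m[1], 0, f);
        return new int[] {s, f};
    }

    /** Builds the NFA for (1*01*0)*1*; every symbol() call creates fresh states. */
    static int[] evenZeros() {
        int[] group = concat(concat(concat(star(symbol('1')), symbol('0')),
                                    star(symbol('1'))), symbol('0'));
        return concat(star(group), star(symbol('1')));
    }

    static Set<Integer> closure(Set<Integer> states) {
        Set<Integer> result = new HashSet<>(states);
        Deque<Integer> work = new ArrayDeque<>(states);
        while (!work.isEmpty()) {
            for (int[] e : EDGES.get(work.pop())) {
                if (e[0] == 0 && result.add(e[1])) {
                    work.push(e[1]);
                }
            }
        }
        return result;
    }

    static boolean accepts(int[] nfa, String input) {
        Set<Integer> current = closure(Set.of(nfa[0]));
        for (char c : input.toCharArray()) {
            Set<Integer> next = new HashSet<>();
            for (int s : current) {
                for (int[] e : EDGES.get(s)) {
                    if (e[0] == c) {
                        next.add(e[1]);
                    }
                }
            }
            current = closure(next);
        }
        return current.contains(nfa[1]);
    }

    public static void main(String[] args) {
        int[] nfa = evenZeros();
        System.out.println(accepts(nfa, "1001"));
        System.out.println(accepts(nfa, "10"));
    }
}
```

The fragments carry many ε-edges, which is why the result is normally passed on to the NFA to DFA conversion and then minimized.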


Regular expression → NFA

Language: Strings of 0s and 1s in which the number of 0s is even

Regular expression: (1*01*0)*1*
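
A quick sanity check of this expression is to compare it with a direct count of 0s on every short binary string. The sketch below does that with `String.matches`; the class name, method names, and the length bound are arbitrary choices for illustration, and agreement on short strings is evidence rather than a proof.

```java
/** Compares the regular expression with a direct count of 0s. */
final class EvenZeros {
    static final String PATTERN = "(1*01*0)*1*";

    static boolean byRegex(String s) {
        return s.matches(PATTERN);
    }

    static boolean byCounting(String s) {
        long zeros = s.chars().filter(c -> c == '0').count();
        return zeros % 2 == 0;
    }

    /** True if both definitions agree on every binary string of length <= maxLen. */
    static boolean agreeUpTo(int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int bits = 0; bits < (1 << len); bits++) {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < len; i++) {
                    sb.append((bits >> i) & 1); // appends the digit 0 or 1
                }
                String s = sb.toString();
                if (byRegex(s) != byCounting(s)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeUpTo(10));
    }
}
```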


NFA → DFA

Initial classes: {A, B, E}, {C, D}

No class requires partitioning!

Hence a two-state DFA is obtained.


Minimize DFA


Regular language

A regular language is a formal language (a set of finite sequences of symbols from a finite alphabet) that can be generated by a regular grammar.


Regular Grammar

Later definitions build on earlier ones.
Nothing is defined in terms of itself (no recursion).

Regular grammar for numeric literals in Pascal:
digit → 0 | 1 | 2 | ... | 8 | 9
unsigned_integer → digit digit*
unsigned_number → unsigned_integer ((. unsigned_integer) | ε) ((e (+ | - | ε) unsigned_integer) | ε)
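
Because no rule refers to itself, each definition can be substituted into the next until a single regular expression remains. The sketch below does this with Java string constants, writing each "(X | ε)" as "(X)?". The class and constant names are illustrative, and only the lowercase exponent marker e from the grammar above is handled.

```java
import java.util.regex.Pattern;

/** The numeric-literal grammar, assembled bottom-up into one Java regex. */
final class PascalNumber {
    static final String DIGIT = "[0-9]";
    static final String UNSIGNED_INTEGER = DIGIT + DIGIT + "*";
    static final String UNSIGNED_NUMBER =
        UNSIGNED_INTEGER
        + "(\\." + UNSIGNED_INTEGER + ")?"      // optional fraction
        + "(e[+-]?" + UNSIGNED_INTEGER + ")?";  // optional exponent

    static final Pattern PATTERN = Pattern.compile(UNSIGNED_NUMBER);

    static boolean isUnsignedNumber(String s) {
        return PATTERN.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"42", "3.14", "6e-23", "1.", ".5"}) {
            System.out.println(s + " -> " + isUnsignedNumber(s));
        }
    }
}
```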


Important Theorems

A language is regular if and only if some regular expression describes it.

A language is regular if and only if some finite automaton recognizes it.

DFAs and NFAs are equally powerful.


Scanning

Accept the longest possible token in each invocation of the scanner.

Implementation: capture the finite automaton in code, using either
» case (switch) statements, or
» a table and a driver.
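
The table-and-driver style with the longest-match rule can be sketched as follows. The driver runs the DFA as far as it can, remembers the last position at which it was in an accepting state, and backs up to that position when it gets stuck. The token set (ID, INT, REAL, DOT) and the table are a small hypothetical example, not the Pascal scanner from these slides.

```java
import java.util.ArrayList;
import java.util.List;

/** Table-driven scanner that returns the longest possible token each time. */
final class LongestMatchScanner {
    // Character classes (columns of the table).
    static final int LETTER = 0, DIGIT = 1, DOT = 2, OTHER = 3;

    // TABLE[state][class] is the next state, or -1 for "no transition".
    static final int[][] TABLE = {
        /* 0 start       */ { 1,  2,  5, -1},
        /* 1 in ID       */ { 1,  1, -1, -1},
        /* 2 in INT      */ {-1,  2,  3, -1},
        /* 3 digits, dot */ {-1,  4, -1, -1},
        /* 4 in REAL     */ {-1,  4, -1, -1},
        /* 5 DOT         */ {-1, -1, -1, -1},
    };

    // Token kind accepted in each state; null marks a non-accepting state.
    static final String[] ACCEPTS = {null, "ID", "INT", null, "REAL", "DOT"};

    static int classOf(char c) {
        if (Character.isLetter(c)) return LETTER;
        if (Character.isDigit(c)) return DIGIT;
        return c == '.' ? DOT : OTHER;
    }

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) {
                pos++;
                continue;
            }
            int state = 0;
            int i = pos;
            int lastEnd = -1;       // end of the longest token seen so far
            String lastKind = null; // its kind
            while (i < input.length()) {
                state = TABLE[state][classOf(input.charAt(i))];
                if (state < 0) {
                    break; // stuck: fall back to the last accepting position
                }
                i++;
                if (ACCEPTS[state] != null) {
                    lastEnd = i;
                    lastKind = ACCEPTS[state];
                }
            }
            if (lastKind == null) {
                throw new IllegalArgumentException("no token at position " + pos);
            }
            tokens.add(lastKind + ":" + input.substring(pos, lastEnd));
            pos = lastEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("count1 42 3.14"));
        System.out.println(scan("3..5"));
    }
}
```

The input "3..5" shows why the backup matters: after reading "3." the scanner is in a non-accepting state, the second dot has no transition, and the driver returns to the end of "3" and emits an INT before scanning the dots separately.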


Scanner for Pascal


Scanner for Pascal (case Statements)


Scanner Generators

Start with a regular expression.
Construct an NFA from it.
Use the subset construction to obtain an equivalent DFA.
Construct the minimal equivalent DFA.