Surinder Kumar Jain, University of Sydney. Automaton DFA NFA Ε-NFA CFG as a DFA Equivalence ...
Automaton: DFA, NFA, ε-NFA, CFG as a DFA, Equivalence, Minimal DFA
Expressions: Definition, Conversion from/to Automaton
Regular Languages: Pumping Lemma (proving non-regularity), Closures, Equivalence
A system with many states
Can transition from one state to another
Transitions are usually caused by external input
Set of states is finite
System is in one state at any given time
Mathematical Definition of a DFA
A = (Q, Σ, δ, q0, F)
Q : States. The DFA is in exactly one of these finitely many states at any time.
Σ : Input symbols. The DFA changes from one state to another on consuming an input symbol.
δ : Transition function. Given a state and an input symbol, it gives the next DFA state. A function Q x Σ -> Q.
q0 : Initial DFA state.
F : Accepting states, a subset of Q. An input is accepted if the DFA is in one of these states once all input symbols have been consumed.
Q = { waiting, pending, rejected, accepted, paid }
Σ = { receive, reject, accept, pay }
δ : (waiting -> receive -> pending), (pending -> reject -> rejected), (pending -> accept -> accepted), (accepted -> pay -> paid)
q0 : waiting
F : { rejected, paid }
[State diagram: start -> Waiting -(receive)-> Pending; Pending -(reject)-> Rejected; Pending -(accept)-> Accepted -(pay)-> Paid]
Set of alphabets
Concatenation (joining)
Strings
A subset of strings is a language
A DFA defines a language
Alphabet set is the set of input symbols
Concatenation: one symbol follows another
Acceptance: the sequence of symbols takes the DFA from the start state to one of the accepting states
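The acceptance rule can be sketched in a few lines of Python, using the invoice example. This is an illustrative sketch, not part of the original slides: the names `DELTA`, `START`, `ACCEPTING` and `accepts` are made up here, the state name `accepted` follows the transition list, and treating an undefined transition as a rejection is an assumption.

```python
# Invoice-processing DFA from the example.
# DELTA maps (state, symbol) -> next state.
DELTA = {
    ("waiting", "receive"): "pending",
    ("pending", "reject"): "rejected",
    ("pending", "accept"): "accepted",
    ("accepted", "pay"): "paid",
}
START = "waiting"
ACCEPTING = {"rejected", "paid"}


def accepts(symbols, delta=DELTA, start=START, accepting=ACCEPTING):
    """Return True if the symbols drive the DFA from start to an accepting state."""
    state = start
    for symbol in symbols:
        key = (state, symbol)
        if key not in delta:
            # Assumption: an undefined transition rejects the input
            # (the same as moving to an implicit dead state).
            return False
        state = delta[key]
    return state in accepting


print(accepts(["receive", "accept", "pay"]))  # True
print(accepts(["receive", "reject"]))         # True
print(accepts(["receive", "accept"]))         # False
```

The language of this DFA is therefore just two strings: receive.reject and receive.accept.pay.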
Five-tuple like a DFA, (Q, Σ, δ, q0, F)
Transition function returns a set of states, not one state
Several outgoing arcs with the same symbol
Can be in several states at the same time
Language of an NFA
Any NFA language can be described by some DFA
Adding non-determinism does not add any expressive power
Why use NFAs then:
  Easier to construct for some languages
  May have fewer states and be less complex
Algorithm to convert NFA to DFA
For an n-state NFA, the DFA may have up to 2^n states
Can throw away inaccessible states
Observation: in practice the DFA often has about the same number of states as the NFA, though it often has more transitions
For an NFA N = (Q, Σ, δ, q0, F), construct the DFA D = (Qd, Σ, δd, {q0}, Fd)
Qd = powerset of Q
δd(S, a) = ⋃ { δ(p, a) | p in S } for every S in Qd
Fd = { S | S is a subset of Q and S contains an accepting state of the NFA }
A DFA operates on one state at a time; an NFA operates on sets of states.
Given a state, the NFA gives a set of new states
Make every possible set of NFA states a DFA state
Transit from one set of states to the set of all states reachable from it on the input symbol
Any set containing an accepting NFA state is an accepting state of the DFA
Full subset construction is O(2^n) (number of subsets of a set)
Efficient algorithm
  Do not construct the entire power set
  Start with the start state
  Only construct subsets that are reachable from the start state
The number of states in the DFA is then often much less than 2^n.
The DFA often has about the same number of states as the NFA, though it often has more transitions
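A minimal sketch of this lazy subset construction in Python follows. The function name `nfa_to_dfa`, the dictionary encoding of δ and the example NFA (strings over {0, 1} ending in 01) are assumptions made for illustration; they do not come from the slides.

```python
from collections import deque


def nfa_to_dfa(nfa_delta, start, accepting, alphabet):
    """Subset construction that only builds subsets reachable from {start}.

    nfa_delta maps (state, symbol) -> set of next states.
    Each DFA state is a frozenset of NFA states.
    """
    dfa_start = frozenset([start])
    dfa_states = {dfa_start}
    dfa_delta = {}
    queue = deque([dfa_start])
    while queue:
        subset = queue.popleft()
        for symbol in alphabet:
            # Union of the NFA moves of every state in the subset.
            target = frozenset(
                q for p in subset for q in nfa_delta.get((p, symbol), set())
            )
            dfa_delta[(subset, symbol)] = target
            if target not in dfa_states:
                dfa_states.add(target)
                queue.append(target)
    dfa_accepting = {s for s in dfa_states if s & set(accepting)}
    return dfa_states, dfa_delta, dfa_start, dfa_accepting


# Hypothetical NFA for "strings over {0, 1} ending in 01".
nfa = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
states, delta, s0, finals = nfa_to_dfa(nfa, "q0", {"q2"}, "01")
print(len(states))  # 3 reachable subsets, not 2**3 = 8
```

For this three-state NFA only three of the eight possible subsets are ever built, which matches the observation that the reachable DFA is often about the size of the NFA.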
Includes ε (the empty string, not in the alphabet set) as a transition label
ε is the identity for concatenation: a.ε = ε.a = a for all a
Spontaneous transition without an input
An ε-NFA language can be described by some NFA
Every NFA can be described by some DFA
Adding ε transitions does not add any expressive power
Why use ε-NFAs then:
  Easier to construct for some languages
  Useful in proving equivalence of languages
Conversion aims to remove ε transitions
Define a new set of states
  ε arcs are contained inside each set
  No ε arc leaves or enters the new set of states
Epsilon closure (eclose)
  For a state, the set of all states reachable spontaneously
  Follow the ε arcs recursively and include the reachable states in the epsilon closure
For an ε-NFA N = (Q, Σ, δ, q0, F), construct the DFA D = (Qd, Σ, δd, eclose(q0), Fd)
Qd = { S | S is a subset of Q and S = eclose(S) } (eclose of a set is the union of the eclose of its members)
δd(S, a) = eclose( ⋃ { δ(p, a) | p in S } ) for every S in Qd
Fd = { S | S is in Qd and S contains an accepting state of the ε-NFA }
A DFA operates on one state at a time; an ε-NFA operates on sets of states with no ε transition leaving the set
Make the eclose sets the DFA states
Transit from one set of states to the eclose of the set of states reached on the input symbol
Any set containing an accepting ε-NFA state is an accepting state of the DFA
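The epsilon closure and the conversion can be sketched together in Python. The function names `eclose` and `enfa_to_dfa`, the split of the transition relation into `delta` (symbol arcs) and `eps` (ε arcs), and the example ε-NFA for a*b* are assumptions chosen for illustration. Only the DFA states reachable from the start set are built.

```python
def eclose(states, eps):
    """Set of states reachable from `states` using only ε arcs.

    eps maps a state -> set of states reachable by one ε arc.
    """
    closure = set(states)
    stack = list(states)
    while stack:
        p = stack.pop()
        for q in eps.get(p, set()):
            if q not in closure:
                closure.add(q)
                stack.append(q)
    return frozenset(closure)


def enfa_to_dfa(delta, eps, start, accepting, alphabet):
    """Build the reachable part of the DFA whose states are ε-closed sets."""
    dfa_start = eclose({start}, eps)
    dfa_delta, seen, todo = {}, {dfa_start}, [dfa_start]
    while todo:
        subset = todo.pop()
        for a in alphabet:
            moved = {q for p in subset for q in delta.get((p, a), set())}
            target = eclose(moved, eps)  # close the result under ε
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    dfa_accepting = {s for s in seen if s & set(accepting)}
    return seen, dfa_delta, dfa_start, dfa_accepting


# Hypothetical ε-NFA for a*b*: state 0 loops on a, has an ε arc to
# state 1, and state 1 loops on b.
delta = {(0, "a"): {0}, (1, "b"): {1}}
eps = {0: {1}}
states, d, s0, finals = enfa_to_dfa(delta, eps, 0, {1}, "ab")
print(sorted(sorted(s) for s in states))  # [[], [0, 1], [1]]
```

The empty set appears as the dead state reached on an a after a b.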
An imperative program can be represented as a Control Flow Graph (CFG) with statements at nodes and predicates at edges
It can be converted into a CFG with both statements and predicates at edges by pushing node statements up the incoming edges
Such a CFG is a DFA
Program points are states
Statements are input symbols that change the program state from program point to program point
Algebraic expression to denote languages
Composed of the symbols "ε", "Ø", "+", "*", ".", "(", ")" and alphabet symbols
The language is generated using the rules:
L(ε) = { ε }
L(Ø) = empty set
L(a) = { a } for every alphabet symbol a
L(p+q) = L(p) U L(q)
L(p.q) = { p'.q' | p' in L(p) and q' in L(q) }
L(p*) = { q1.q2...qn | each qi in L(p) and n >= 0 } (n = 0 gives ε)
a+b.c
The language generated is: { a, b.c }
a.b.c*.d
The language generated is: { a.b.d, a.b.c.d, a.b.c.c.d, a.b.c.c.c.d, … }
A finite way to express an infinite language
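The generation rules can be tried out with a small Python sketch that lists the strings of L(r) up to a length bound. The nested-tuple encoding of expressions and the function name `language` are assumptions made for this illustration; concatenation is written without the "." in the resulting Python strings.

```python
def language(r, max_len):
    """Strings of length <= max_len in L(r), for r given as a nested tuple."""
    op = r[0]
    if op == "eps":
        return {""}
    if op == "empty":
        return set()
    if op == "sym":
        return {r[1]} if max_len >= 1 else set()
    if op == "+":
        return language(r[1], max_len) | language(r[2], max_len)
    if op == ".":
        left, right = language(r[1], max_len), language(r[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if op == "*":
        base = language(r[1], max_len) - {""}
        result, frontier = {""}, {""}
        while frontier:
            # Append one more string from L(p) to everything found so far.
            frontier = {
                x + y for x in frontier for y in base if len(x + y) <= max_len
            } - result
            result |= frontier
        return result
    raise ValueError(f"unknown operator: {op}")


a, b, c, d = (("sym", ch) for ch in "abcd")
r1 = ("+", a, (".", b, c))                  # a+b.c
r2 = (".", (".", (".", a, b), ("*", c)), d)  # a.b.c*.d
print(sorted(language(r1, 3)))           # ['a', 'bc']
print(sorted(language(r2, 5), key=len))  # ['abd', 'abcd', 'abccd']
```

The length bound is what keeps the output finite: the expression itself stays the same size while the listed language grows with `max_len`.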
DEFINITION
Two regular expressions (or automata) are EQUAL if they both generate the same language
Thus (a.b)* + (b.a)* + a.(b.a)* + b.(a.b)* = (ε + b).(a.b)*.(ε + a)
p + q = q + p
(p + q) + r = p + (q + r)
(p.q).r = p.(q.r)
Ø + p = p + Ø = p
ε.p = p.ε = p
Ø.p = p.Ø = Ø
p.(q + r) = p.q + p.r
(p + q).r = p.r + q.r
p + p = p
(p*)* = p*
Ø* = ε
ε* = ε
p.p* = p*.p
(p + q)* = (p*.q*)*
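Laws like these can be sanity-checked by brute force with Python's `re` module: compare the two expressions on every string up to some length. This is only a bounded check, not a proof, and the helper name `same_language_up_to` is made up for this sketch. The first check uses the identity for alternating strings of a and b, written as (a.b)* + (b.a)* + a.(b.a)* + b.(a.b)* = (ε + b).(a.b)*.(ε + a).

```python
import re
from itertools import product


def same_language_up_to(pattern_1, pattern_2, alphabet, max_len):
    """Compare two patterns on every string over `alphabet` up to max_len.

    Returns the first string on which they disagree, or None if there is none.
    """
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            in_1 = re.fullmatch(pattern_1, w) is not None
            in_2 = re.fullmatch(pattern_2, w) is not None
            if in_1 != in_2:
                return w
    return None


# '+' in the slides is '|' in Python's re syntax; 'b?' stands for (ε + b).
lhs = r"(ab)*|(ba)*|a(ba)*|b(ab)*"
rhs = r"b?(ab)*a?"
print(same_language_up_to(lhs, rhs, "ab", 8))               # None
# (p + q)* = (p*.q*)* with p = a, q = b
print(same_language_up_to(r"(a|b)*", r"(a*b*)*", "ab", 8))  # None
# A deliberately wrong "law", p.q = q.p, is caught at once.
print(same_language_up_to(r"ab", r"ba", "ab", 8))           # ab
```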
Every language defined by a finite automaton is also defined by some regular expression
Every language defined by a regular expression is also defined by some DFA
Hopcroft’s formula
R_{ij}^{(k)} = R_{ij}^{(k-1)} + R_{ik}^{(k-1)} . (R_{kk}^{(k-1)})* . R_{kj}^{(k-1)}
R_{ij}^{(n)} is the regular expression of all paths from i to j (n is the number of states)
States are sorted in some order and numbered 1 to n
R_{ij}^{(k)} is the regular expression of all paths from i to j passing only through intermediate nodes whose number is at most k
Computed for all i, j for k = 0, then k = 1, …, k = n
R_{s,f1}^{(n)} + … + R_{s,fk}^{(n)} is the regular expression of the DFA
s is the start state, f1, …, fk are the accepting states, and n is the number of states.
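The recurrence can be run directly. The Python sketch below builds the table for k = 0, 1, …, n and emits the result in Python `re` syntax so that it can be tested. The helper names (`union`, `concat`, `star`, `dfa_to_regex`), the use of `None` for Ø and `""` for ε, and the two-state example DFA (strings over {0, 1} containing at least one 0) are all assumptions for illustration; only trivial simplifications are applied.

```python
import re
from itertools import product


def union(p, q):
    """p + q, with None standing for Ø and "" for ε."""
    if p is None:
        return q
    if q is None or p == q:
        return p
    if p == "":
        return f"(?:{q})?"
    if q == "":
        return f"(?:{p})?"
    return f"(?:{p}|{q})"


def concat(p, q):
    """p.q, using Ø.p = p.Ø = Ø and ε.p = p.ε = p."""
    if p is None or q is None:
        return None
    return p + q


def star(p):
    """p*, using Ø* = ε* = ε."""
    if p is None or p == "":
        return ""
    return f"(?:{p})*"


def dfa_to_regex(n, delta, start, accepting):
    """States are numbered 1..n; delta maps (state, symbol) -> state."""
    # k = 0: direct arcs only, plus ε on the diagonal.
    R = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        R[i][i] = ""
    for (i, a), j in delta.items():
        R[i][j] = union(R[i][j], re.escape(a))
    for k in range(1, n + 1):
        prev = R
        R = [[None] * (n + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                via_k = concat(concat(prev[i][k], star(prev[k][k])), prev[k][j])
                R[i][j] = union(prev[i][j], via_k)
    result = None
    for f in accepting:
        result = union(result, R[start][f])
    return result


# Hypothetical DFA: state 1 = "no 0 seen yet" (start), state 2 = "seen a 0".
delta = {(1, "1"): 1, (1, "0"): 2, (2, "0"): 2, (2, "1"): 2}
regex = dfa_to_regex(2, delta, start=1, accepting=[2])
for n in range(6):
    for bits in product("01", repeat=n):
        w = "".join(bits)
        assert (re.fullmatch(regex, w) is not None) == ("0" in w)
print("regex agrees with the DFA on all strings up to length 5")
```

Even for two states the unsimplified expression is long, which is the growth the complexity discussion refers to.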
Hopcroft formula is O(n^3 4^n): n^3 to compute the table, and 4^n because the size of the regular expression can grow by a factor of 4 at every step.
In practice it is close to O(n^3)
  By simplifying the regular expression at every step and using a judicious algorithm that avoids recomputation of R_{kk}^{(k)}
  Most DFAs have almost n and not 2^n accessible states
A faster state elimination method, close to O(n^2), is also available
A regular expression is converted to an ε-NFA
The ε-NFA can then be converted to an NFA and to a DFA
RE to ε-NFA conversion rules:
  ε -> One-edge (two-state) automaton with an ε transition
  Ø -> Two-state automaton with no edges
  a -> Two-state automaton with an "a" transition
  + -> A new start/accept state joining the two arguments of + in parallel
  . -> Accept of the first is the start of the second
  * -> An ε edge joining the start/accept of the argument and a new start/accept state
Convert the resulting ε-NFA to a DFA
Augment the regular expression r to (r).#
Assign a position number to each occurrence of an alphabet
Compute for each node of the syntax tree
  nullable (ε is in the language)
  firstpos (set of possible first alphabets)
  lastpos (set of possible last alphabets)
Compute for each position
  followpos (set of possible next alphabets after this position)
Construct the DFA
Unix text search, searching for matching patterns (grep)
Lexical/Parser analysis
  Parse text against a regular expression
  Find the set of first tokens at this expression root
  Find the set of last tokens at this expression root
  Can the expression at this root be the null set
  Find the set of next tokens after an alphabet position in a regular expression
Efficient search of patterns in a very large repository (web text search)
DEFINITION
A language (a set of strings) is defined to be a regular language if it can be defined by a finite automaton
  by a DFA, or
  by an NFA, or
  by an ε-NFA, or
  by a regular expression
Four different ways to describe a regular language
If L is a regular language then there exists an integer n such that for every string w in L with |w| >= n we can break w into x, y, z such that w = x.y.z and
  y ≠ ε
  |x.y| <= n
  x.y^k.z is in L (for all k >= 0)
Proof is based on the fact that, for a DFA with n states, any string of length >= n must revisit a state
Used to prove that a language is not regular
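As a worked illustration (not from the slides), take the language { a^k b^k | k >= 0 }. For a candidate pumping length n, the string w = a^n b^n has length >= n, yet no split w = x.y.z with |x.y| <= n and non-empty y can be pumped, because y consists only of a's. The Python sketch below searches all such splits; the function names and the bound `max_k` on the pumping exponent are assumptions of the sketch.

```python
def in_language(w):
    """Membership test for { a^k b^k | k >= 0 }."""
    half = len(w) // 2
    return len(w) % 2 == 0 and w == "a" * half + "b" * half


def survives_pumping(w, n, max_k=3):
    """Return a split (x, y, z) of w with |x.y| <= n and y non-empty whose
    pumped strings x.y^k.z stay in the language for k = 0..max_k,
    or None if no such split exists."""
    for i in range(n + 1):             # i = |x|
        for j in range(i + 1, n + 1):  # j = |x.y|, so y is non-empty
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_language(x + y * k + z) for k in range(max_k + 1)):
                return (x, y, z)
    return None


for n in range(1, 8):
    w = "a" * n + "b" * n
    print(n, survives_pumping(w, n))  # None for every n
```

Since every candidate n fails, no pumping length exists and the language cannot be regular. A returned split would not prove regularity, because only finitely many values of k are tried.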
A language is a set of strings over a finite alphabet
Language operators:
  Union: L(A ∪ B) = L(A) ∪ L(B) (via regular expressions)
  Intersection
  Concatenation: L(A.B) = { a.b | a in A, b in B }
  Kleene closure: L(A*) = { a1.a2...an | each ai in A, n >= 0 } (n = 0 gives ε)
  Complement: L(A') = { a | a not in A } (with respect to some overall alphabet set) (via DFA)
  Difference: L(A-B) = L(A) - L(B) (via DFA)
  Reversal: L(A^R) = { ak.ak-1…a1 | a1…ak-1.ak in A } (switch q0 and F)
  Homomorphism: replace an alphabet with another regular expression
  Inverse homomorphism
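The DFA-based closures can be sketched in Python: complement swaps accepting and non-accepting states, and a product of two DFAs gives intersection, union or difference depending on which pairs are accepting. The tuple layout `(states, alphabet, delta, start, accepting)`, the function names and the two example DFAs are assumptions for illustration; both DFAs are assumed to be complete and to share one alphabet.

```python
def complement(dfa):
    """Swap accepting and non-accepting states of a complete DFA."""
    states, alphabet, delta, start, accepting = dfa
    return (states, alphabet, delta, start, states - accepting)


def product_dfa(dfa_1, dfa_2, accept_if):
    """Run two complete DFAs over the same alphabet in lockstep.

    accept_if(in_1, in_2) decides whether a pair of states is accepting:
    `x and y` gives intersection, `x or y` union, `x and not y` difference.
    """
    states_1, alphabet, delta_1, start_1, acc_1 = dfa_1
    states_2, _, delta_2, start_2, acc_2 = dfa_2
    states = {(p, q) for p in states_1 for q in states_2}
    delta = {
        ((p, q), a): (delta_1[(p, a)], delta_2[(q, a)])
        for (p, q) in states
        for a in alphabet
    }
    accepting = {(p, q) for (p, q) in states if accept_if(p in acc_1, q in acc_2)}
    return (states, alphabet, delta, (start_1, start_2), accepting)


def accepts(dfa, w):
    _, _, delta, state, accepting = dfa
    for a in w:
        state = delta[(state, a)]
    return state in accepting


# Hypothetical examples: even number of a's; strings ending in b.
even_a = ({0, 1}, "ab", {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}, 0, {0})
ends_b = ({0, 1}, "ab", {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}, 0, {1})

both = product_dfa(even_a, ends_b, lambda x, y: x and y)      # intersection
diff = product_dfa(even_a, ends_b, lambda x, y: x and not y)  # difference
print(accepts(both, "aab"), accepts(both, "ab"))  # True False
print(accepts(diff, "aa"), accepts(diff, "aab"))  # True False
print(accepts(complement(even_a), "a"))           # True
```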
Is the language described empty?
Is a particular string in the described language?
Do two different descriptions of languages actually describe the same language?
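For a language given as a DFA, the emptiness question reduces to graph reachability: the language is empty exactly when no accepting state can be reached from the start state. A minimal Python sketch follows, where the function name `is_empty` and the example transition table are assumptions for illustration.

```python
def is_empty(delta, start, accepting):
    """True if no accepting state is reachable from the start state.

    delta maps (state, symbol) -> next state (a DFA transition table).
    """
    seen, stack = {start}, [start]
    while stack:
        p = stack.pop()
        if p in accepting:
            return False
        for (source, _symbol), q in delta.items():
            if source == p and q not in seen:
                seen.add(q)
                stack.append(q)
    return True


# Hypothetical DFA: the accepting state "f" is only fed from "u",
# which cannot be reached from the start state "s".
delta = {("s", "a"): "t", ("t", "a"): "t", ("u", "a"): "f", ("f", "a"): "f"}
print(is_empty(delta, "s", {"f"}))  # True
print(is_empty(delta, "s", {"t"}))  # False
```

The membership question is answered by simply running the DFA on the string.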
Decision properties may require conversion between various forms.
Can the conversion be done in reasonable time?
Conversion Complexity
  Computing ε closures : O(n^3) Warshall's O(n)
  Subset construction : O(2^n)
  NFA to DFA : O(n^3 2^n) (in practice O(n^3 s), where s is the number of DFA states actually constructed)
  DFA to NFA conversion : O(n)
  NFA/DFA to Regular Expression : O(n^3 4^n) (worst case; actual is much less)
  Regular Expression to ε-NFA : O(n)
  Regular Expression to NFA : O(n^3)
  Regular Expression to DFA : O(n34n^32^n)
Equivalence of two states
States p and q in an automaton are defined to be equivalent if, for every input string applied at state p or q, p ends up in an accepting state if and only if q also ends up in an accepting state
The accepting state reached from p does not have to be the same accepting state as that reached from q
If two states p and q are equivalent
  we can combine them into a single state
  it won't affect the language accepted by the DFA
This process of combining states together is called Minimization
The table-filling algorithm can find whether two states are equivalent or not. Complexity O(n^2)
Non-equivalent pairs are distinguishable
Minimum DFA is unique
Eliminate all states not reachable from the start state
Determine which states are equivalent
Partition the states into blocks of equivalent states
Equivalence is transitive
Thus no state is in two blocks
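The table-filling idea and the partition into blocks can be sketched in Python. This is a simple fixpoint version that rescans the table until nothing changes, so it does not reach the O(n^2) bound of the worklist variant. The function names and the three-state example DFA (strings over {0, 1} ending in 0, with a redundant state C) are assumptions for illustration, and the DFA is assumed to be complete with unreachable states already removed.

```python
from itertools import combinations


def distinguishable_pairs(states, alphabet, delta, accepting):
    """Table-filling: return the set of distinguishable pairs {p, q}."""
    # Base case: one state accepting, the other not.
    marked = {
        frozenset((p, q))
        for p, q in combinations(states, 2)
        if (p in accepting) != (q in accepting)
    }
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in alphabet:
                succ = frozenset((delta[(p, a)], delta[(q, a)]))
                if succ in marked:  # successors distinguishable => p, q are too
                    marked.add(pair)
                    changed = True
                    break
    return marked


def equivalence_blocks(states, alphabet, delta, accepting):
    """Partition the states into blocks of mutually equivalent states."""
    marked = distinguishable_pairs(states, alphabet, delta, accepting)
    blocks = []
    for p in states:
        for block in blocks:
            if frozenset((p, next(iter(block)))) not in marked:
                block.add(p)
                break
        else:
            blocks.append({p})
    return blocks


# Hypothetical DFA for "ends in 0"; A and C behave identically.
states = ["A", "B", "C"]
delta = {
    ("A", "0"): "B", ("A", "1"): "C",
    ("B", "0"): "B", ("B", "1"): "C",
    ("C", "0"): "B", ("C", "1"): "A",
}
blocks = equivalence_blocks(states, "01", delta, {"B"})
print([sorted(b) for b in blocks])  # [['A', 'C'], ['B']]
```

Each block becomes one state of the minimum DFA; because equivalence is transitive, comparing a state with a single representative of a block is enough.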
Equivalence of two Regular Languages
Convert them into their minimum DFAs and check for isomorphism
Union method
  Run the minimization (table-filling) on the union of the two DFAs
  The start states of the two original DFAs are equivalent if and only if the DFAs are equivalent
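The start-state criterion can also be checked without building the full table. The Python sketch below swaps in a simpler technique: it explores the pairs of states that the two DFAs can be in together and reports a shortest string that exactly one of them accepts. The function name `find_difference`, the `(delta, start, accepting)` layout and the example DFAs are assumptions for illustration, and both DFAs are assumed to be complete.

```python
from collections import deque


def find_difference(dfa_1, dfa_2, alphabet):
    """Return a shortest string accepted by exactly one DFA, or None if the
    two start states are equivalent (the DFAs accept the same language).

    Each DFA is (delta, start, accepting) with a complete delta.
    """
    (d1, s1, f1), (d2, s2, f2) = dfa_1, dfa_2
    seen = {(s1, s2)}
    queue = deque([(s1, s2, "")])
    while queue:
        p, q, w = queue.popleft()
        if (p in f1) != (q in f2):
            return w
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt[0], nxt[1], w + a))
    return None


# Hypothetical DFAs over {0, 1}.
ends0 = ({("X", "0"): "Y", ("X", "1"): "X", ("Y", "0"): "Y", ("Y", "1"): "X"}, "X", {"Y"})
ends0_big = (
    {("A", "0"): "B", ("A", "1"): "C", ("B", "0"): "B", ("B", "1"): "C",
     ("C", "0"): "B", ("C", "1"): "A"},
    "A",
    {"B"},
)
has0 = ({("N", "0"): "Z", ("N", "1"): "N", ("Z", "0"): "Z", ("Z", "1"): "Z"}, "N", {"Z"})

print(find_difference(ends0, ends0_big, "01"))  # None
print(find_difference(ends0, has0, "01"))       # 01
```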