Deterministic Finite Automaton M = (K; ; ;s;A s2K start state...


Page 1: Deterministic Finite Automaton M = (K; ; ;s;A s2K start state Kdjmoon/automata/automata-notes/...Nondeterministic algorithms are sometimes easier to design than deterministic ones

Finite Automata - Deterministic Finite Automata

• Deterministic Finite Automaton (DFA) (or Finite State Machine) M = (K, Σ, δ, s, A), where

– K is a finite set of states

– Σ is an input alphabet

– s ∈ K is a distinguished state called the start state

– A ⊆ K is a set of accepting states

– δ is a transition function, mapping K × Σ → K

• State Diagram: Graphical representation of a DFA

• Transition table can be used to represent transitions

– Rows indexed by states

– Columns indexed by alphabet

– Contents are state transitioned to

– Example (rows are states, columns are input symbols):

      0  1
  0   5  1
  1   5  2
  2   3  3
  3   4  4
  4   5  5
  5   5  5

• Configuration of DFA M is element of K × Σ∗

– Represents current state and remaining input

– Initial configuration of M on input w denoted (s_M, w)
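The transition table and the notion of an initial configuration can be sketched in Python as follows. This is only an illustration: the notes do not say which state of the example is the start state or which states accept, so `START` is an assumption, and the names `DELTA`, `initial_configuration` and `is_total` are made up for this sketch.

```python
# Transition table from the example: rows are states 0-5, columns are symbols '0' and '1'.
DELTA = {
    (0, "0"): 5, (0, "1"): 1,
    (1, "0"): 5, (1, "1"): 2,
    (2, "0"): 3, (2, "1"): 3,
    (3, "0"): 4, (3, "1"): 4,
    (4, "0"): 5, (4, "1"): 5,
    (5, "0"): 5, (5, "1"): 5,
}
START = 0  # assumption: the notes do not name the start state of the example


def initial_configuration(w):
    """Return (s_M, w): the start state paired with the whole, unread input."""
    return (START, w)


def is_total(delta, states, alphabet):
    """Check that delta is defined on all of K x Sigma, as a DFA requires."""
    return all((q, c) in delta for q in states for c in alphabet)
```

A dictionary keyed by (state, symbol) gives constant-time lookup on average, which matches the usual assumption made when analysing DFA simulation.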


Finite Automata - Deterministic Finite Automata (2)

• Yields-in-one-step relation relates configurations to their immediate successors

– Denoted ⊢_M (for DFA M)

– Let c ∈ Σ, w ∈ Σ∗, qi, qj ∈ K. Then

(qi, cw) ⊢_M (qj, w) iff ((qi, c), qj) ∈ δ

– I.e., c is legal transition symbol from qi to qj

• Yields is reflexive, transitive closure of ⊢_M

– Denoted ⊢∗_M

– Given configurations Ci, Cj

Ci ⊢∗_M Cj indicates that M can transition from Ci to Cj in 0 or more steps

• Computation by M is finite sequence of configurations C0, C1, ..., Cn, n ≥ 0, where

C0 is initial configuration

Cn is of the form (q, ε), q ∈ K_M

C0 ⊢_M C1 ⊢_M ... ⊢_M Cn

• Given string w ∈ Σ∗

– M accepts w iff (s, w) ⊢∗_M (q, ε), q ∈ A_M

– (q, ε) called accepting configuration

• Given string w ∈ Σ∗

– M rejects w iff (s, w) ⊢∗_M (q, ε), q ∉ A_M

– (q, ε) called rejecting configuration

• Language accepted by M denoted L(M)

• Operation (summary)

1. DFA M begins operation in start state

2. Symbols of input string w read one-at-a-time

– Each causes a transition to some state in M

3. After all symbols have been consumed, w is accepted if M is in an accepting state; otherwise, w is rejected
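The yields-in-one-step relation and the notion of a computation can be traced in a few lines of Python. This is a sketch under assumptions: the DFA `EVEN_A` (even number of a's over {a, b}) is a hypothetical example not taken from the notes, and the function names are illustrative.

```python
def step(delta, config):
    """Apply yields-in-one-step once: consume the first remaining symbol."""
    state, rest = config
    return (delta[(state, rest[0])], rest[1:])


def computation(delta, start, w):
    """Return the configurations C0, C1, ..., Cn, ending at (q, '')."""
    configs = [(start, w)]
    while configs[-1][1]:          # remaining input is non-empty
        configs.append(step(delta, configs[-1]))
    return configs


def accepts(delta, start, accepting, w):
    """w is accepted iff the final configuration holds an accepting state."""
    final_state, _ = computation(delta, start, w)[-1]
    return final_state in accepting


# Hypothetical DFA over {a, b} accepting strings with an even number of a's.
EVEN_A = {("even", "a"): "odd", ("even", "b"): "even",
          ("odd", "a"): "even", ("odd", "b"): "odd"}
```

For example, `computation(EVEN_A, "even", "ab")` lists three configurations, one more than the length of the input, since each step consumes exactly one symbol.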


Finite Automata - Deterministic Finite Automata (3)

• Theorem 5.1

– Statement: Every DFA M halts after |w| steps given input w

– Proof: (See p 60)

• State diagram variations

1. If several symbols transition between the same pair of states, represent as a single arc labeled with a comma-separated list of the symbols

2. If the majority of alphabet symbols cause a transition, represent those that do using set difference

– E.g., Σ − {c1, c2, ..., cn}

3. Dead state

– Dead state is rejecting state with no transitions to other states

– Usually denoted d

– Can be eliminated from state diagrams

∗ If no labeled transition from a state for a given symbol, assume transition is to dead state
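The dead-state convention maps naturally onto a partial transition table whose missing entries default to d. The sketch below assumes a hypothetical diagram `AB_ONLY` that accepts exactly the string "ab"; none of the names come from the notes.

```python
DEAD = "d"  # conventional name for the dead state


def total_delta(partial_delta, state, symbol):
    """Look up a transition; any missing entry (including those of d) leads to d."""
    return partial_delta.get((state, symbol), DEAD)


def accepts(partial_delta, start, accepting, w):
    """Simulate the DFA, filling in the omitted dead-state transitions on the fly."""
    state = start
    for symbol in w:
        state = total_delta(partial_delta, state, symbol)
    return state in accepting


# Hypothetical diagram with the dead state omitted: accepts exactly "ab".
AB_ONLY = {("q0", "a"): "q1", ("q1", "b"): "q2"}
```

Because d never appears as a key, every lookup from d falls back to d again, which is exactly the "no transitions to other states" property.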


Finite Automata - Designing DFAs

• Need to identify 2 things about strings w to be input to a DFA:

1. Identify properties of prefixes of w that affect the result

– These properties translate into states

2. Identify categories of strings

– Many strings will drive DFA to a particular state, which then leads to a fixed result

– Goal is to identify these categories of strings

– Each category will generate a state, which can be named for the cluster of strings for readability


Finite Automata - Nondeterminism

• Given an algorithm A and a set of inputs

– A is deterministic if it always performs the same computation on these inputs

– Otherwise, A is nondeterministic

• A nondeterministic algorithm is one that can "guess" what to do next

• For a given problem, a nondeterministic algorithm may produce different solutions at different times

• Nondeterministic algorithms are sometimes easier to design than deterministic ones


Finite Automata - Nondeterministic FAs

• Nondeterministic Finite Automata (NFAs) are FAs in which transitions are "relaxed" (See below)

• NFA M = (K, Σ, ∆, s, A) where

– K is a finite set of states

– Σ is an input alphabet

– s ∈ K is a distinguished state called the start state

– A ⊆ K is a set of accepting states

– ∆ is a transition relation, a finite subset of (K × (Σ ∪ {ε})) × K

• M accepts w iff one of its computations accepts w

• M rejects w iff none of its computations accepts w

• Language accepted by M denoted L(M)

• NFAs differ from DFAs as follows:

– DFAs are so-called because each symbol results in exactly one transition from a given state

– In NFAs

1. Multiple transitions may occur from a given state for a single symbol

2. Transitions may consume no input symbol

∗ These are called ε-transitions

3. Both of the above correspond to guesses by M

4. There may be no transition from a given state for a symbol

∗ If input still remains, this results in rejection


Finite Automata - Nondeterministic FAs (2)

• Computation of an NFA can be considered from 2 perspectives:

1. As tree search

– Each node of tree represents a legal configuration

– Children represent configurations reachable in one step from the parent configuration

– Leaves represent configurations in which the input string has been consumed

∗ If any leaf represents an accepting state, the string is accepted

2. As parallel processing

– From a state, all transitions executed in parallel

– NFA moves between sets of states

∗ Given a set of states, the next set consists of all those states reachable from those in the given set, based on the current input symbol

– If the final set contains an accepting state, the input string is accepted

• NFAs useful in following situations:

1. Languages that require complex DFAs

– ε-transitions often enable a much simpler NFA

2. Unions of languages

– Build a DFA for each language

– NFA constructed from a start state with ε-transitions to the start states of each participating DFA

3. Pattern/substring matching

– Have a set of states that read the prefix

– Have an ε-transition to the pattern

– Have a set of states that read the remainder

4. Creating complex DFAs

– NFAs usually easier to create

– Then, convert to DFA (see later notes)
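The tree-search view of NFA computation described earlier on this page can be sketched as a depth-first search over (state, input position) pairs. The encoding is an assumption of this sketch: `delta` maps (state, symbol) to a set of successor states, and the empty string stands for ε. The `seen` set is an addition that keeps the search finite when ε-transitions form a loop.

```python
EPSILON = ""  # assumption: the empty string stands for epsilon in transition keys


def accepts_by_search(delta, start, accepting, w):
    """Depth-first search of the computation tree of an NFA."""
    seen = set()  # (state, position) pairs already explored; guards against eps-loops

    def explore(state, pos):
        if (state, pos) in seen:
            return False
        seen.add((state, pos))
        if pos == len(w) and state in accepting:
            return True
        # Guess 1: take an epsilon-transition without consuming input.
        for nxt in delta.get((state, EPSILON), ()):
            if explore(nxt, pos):
                return True
        # Guess 2: consume the next symbol, if any input remains.
        if pos < len(w):
            for nxt in delta.get((state, w[pos]), ()):
                if explore(nxt, pos + 1):
                    return True
        return False

    return explore(start, 0)


# Hypothetical NFA: strings over {a, b} that end in "ab", entered via an eps-move.
ENDS_IN_AB = {("s", EPSILON): {"q0"}, ("q0", "a"): {"q0", "q1"},
              ("q0", "b"): {"q0"}, ("q1", "b"): {"q2"}}
```

A branch with no applicable transition and input remaining simply returns False, which corresponds to that computation rejecting.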


Finite Automata - Analyzing NFAs

• Given an NFA M, want to identify L(M)

• Can do so using 2 approaches discussed previously:

1. Perform depth-first search of all paths through search tree

2. Trace all paths in parallel

• Following discussion deals with parallel trace

• Define function eps as mapping from K_M → P(K_M)

– I.e., eps(q) maps from state q to all states reachable from q via ε-transitions:

eps(q) = {p ∈ K : (q, w) ⊢∗_M (p, w)}

– To compute eps(q):

stateSet eps (state q, delta)
{
    stateSet result = {q};               //q is always reachable from itself
    stack = empty stack;                 //states whose ε-transitions are still unexplored
    push(q, stack);
    while (!empty(stack)) {
        p = pop(stack);
        for (each ((p, EPSILON), r) in delta)
            if (r !in result) {
                result = result + {r};   //'+' is union operator
                push(r, stack);
            }
    }
    return result;
}


Finite Automata - Analyzing NFAs (2)

• To simulate parallel computation of an NFA:

boolean nfaSimulate (NFA M, string w)
{
    stateSet currentState = eps(M.s, M.delta);
    while (length(w) > 0) {
        stateSet nextState = {};             //start from the empty set
        char c = getNextSymbol(w);           //removes and returns first symbol of w
        for (each state q in currentState)
            for (each ((q, c), p) in M.delta)
                nextState = nextState + eps(p, M.delta);
        currentState = nextState;            //advance all computations in parallel
    }
    for (each state q in currentState)
        if (q in M.A)
            return TRUE;
    return FALSE;
}
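For readers who want something executable, eps and the parallel simulation can be written in Python roughly as follows. The representation is an assumption of this sketch: `delta` is a dict from (state, symbol) to a set of states, with the empty string standing for ε, and `accepting` is a Python set.

```python
EPSILON = ""  # assumption: the empty string stands for epsilon


def eps(q, delta):
    """Return every state reachable from q using only epsilon-transitions."""
    result = {q}
    stack = [q]
    while stack:
        p = stack.pop()
        for r in delta.get((p, EPSILON), ()):
            if r not in result:
                result.add(r)
                stack.append(r)
    return result


def nfa_simulate(delta, start, accepting, w):
    """Track the set of states the NFA could be in after each input symbol."""
    current = eps(start, delta)
    for c in w:
        nxt = set()
        for q in current:
            for p in delta.get((q, c), ()):
                nxt |= eps(p, delta)
        current = nxt                  # all branches advance together
    return bool(current & accepting)
```

If `current` becomes empty, every branch has died and the remaining symbols cannot revive it, so the result is False.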


Finite Automata - Nondeterminism and Implementation

• 2 possible approaches to implementing nondeterminism:

1. choose (action1;;

action2;;

...

actionn;;)

– Each action produces a solution or FALSE

– Semantics:

∗ If any action produces a solution, choose will halt and return a solution

∗ Otherwise,

· If all actions halt, choose halts (returning FALSE)

· If any action fails to halt, choose will fail to halt

– Need a methodical way of selecting actions

2. choose (x from S: P(x))

– S is a set of values (finite or infinite)

– Semantics:

∗ If P(x) produces a solution for any x, choose will halt and return a solution

∗ Otherwise,

· If P(x) halts for all x, choose halts (returning FALSE)

· If any computation of P(x) fails to halt, choose will fail to halt

– Need a methodical way of selecting x ∈ S

• Can associate probabilities with choices

– Options with higher probabilities are more likely to be chosen than those with lower probabilities

– Useful when have a priori knowledge of problem domain
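One "methodical way of selecting x ∈ S" is simply to try candidates in order. The Python sketch below is a sequential stand-in for choose (x from S: P(x)), not a faithful implementation of the semantics above: it assumes S is finite (or that a solution appears after finitely many candidates) and that every call P(x) tried before a solution halts. A faithful version would have to interleave the computations. The function name and the divisor example are illustrative assumptions.

```python
def choose(candidates, p):
    """Try each x in order; return the first result of p(x) that is not False.

    Returns False if every call halts without producing a solution.
    Assumption: each p(x) that is tried halts; otherwise this loop never returns.
    """
    for x in candidates:
        result = p(x)
        if result is not False:
            return result
    return False


def divisor_of(n):
    """Build a predicate that returns d when d divides n, and False otherwise."""
    return lambda d: d if n % d == 0 else False
```

For instance, `choose(range(2, 10), divisor_of(91))` returns 7, while the same call with 97 returns False because no candidate succeeds.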


Finite Automata - Equivalence of DFAs and NFAs

• Theorem 5.2

– Statement: For every DFA that accepts L, there is an equivalent NFA that accepts L

– Proof: (See p 74)

• Theorem 5.3

– Statement: For every NFA that accepts L, there is an equivalent DFA that accepts L

– Proof: (By construction)

∗ Given: NFA M = (K, Σ, ∆, s, A)

∗ Create M′ = (K′, Σ, δ′, s′, A′), where

K′ contains 1 state for each element of P(K)

s′ = eps(s)

A′ = {Q ⊆ K : Q ∩ A ≠ ∅}

δ′(Q, c) = ⋃{eps(p) : ∃q ∈ Q (((q, c), p) ∈ ∆)}

∗ Note:

1. In most cases, only a small subset of K′ is actually needed

2. Accepting states of M′ are those that contain states from A


Finite Automata - Equivalence of DFAs and NFAs (2)

– Construction algorithm:

DFA nfaToDFA (NFA M)
{
    for (each state q in M.K)
        compute eps(q, M.delta);
    stateSet s' = eps(M.s, M.delta);
    //Compute delta'
    setOfStateSets activeStates = {s'};
    delta' = {};
    while (there is a Q in activeStates that has not been processed) {
        for (each symbol c in M.sigma) {
            stateSet newState = {};
            for (each state q in Q)
                for (each state p where ((q, c), p) is in M.delta)
                    newState = newState + eps(p, M.delta);
            //Record the transition only after every q in Q has contributed
            delta' = delta' + {((Q, c), newState)};
            if (newState !in activeStates)
                activeStates = activeStates + {newState};
        }
        mark Q as processed;
    }
    K' = activeStates;
    A' = {Q in K': Q % M.A <> {}};   //'%' is intersection operator
    return M' = (K', M.sigma, delta', s', A');
}
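A runnable Python sketch of the same subset construction is given below. It generates only the reachable state sets, using a worklist in place of the "not yet processed" test. The encoding is assumed rather than taken from the notes: the NFA relation is a dict from (state, symbol) to a set of states, ε is the empty string, and DFA states are frozensets.

```python
EPSILON = ""  # assumption: the empty string stands for epsilon


def eps(q, delta):
    """Epsilon-closure of state q."""
    result, stack = {q}, [q]
    while stack:
        p = stack.pop()
        for r in delta.get((p, EPSILON), ()):
            if r not in result:
                result.add(r)
                stack.append(r)
    return result


def nfa_to_dfa(delta, start, accepting, alphabet):
    """Return (states, dfa_delta, start_set, dfa_accepting) for the equivalent DFA."""
    start_set = frozenset(eps(start, delta))
    states = {start_set}
    worklist = [start_set]             # state sets not yet processed
    dfa_delta = {}
    while worklist:
        Q = worklist.pop()
        for c in alphabet:
            new_state = set()
            for q in Q:
                for p in delta.get((q, c), ()):
                    new_state |= eps(p, delta)
            new_state = frozenset(new_state)
            dfa_delta[(Q, c)] = new_state
            if new_state not in states:
                states.add(new_state)
                worklist.append(new_state)
    dfa_accepting = {Q for Q in states if Q & accepting}
    return states, dfa_delta, start_set, dfa_accepting
```

The empty frozenset, when it arises, plays the role of the dead state of the resulting DFA.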


Finite Automata - Implementation

• Can implement FAs in several ways

1. In hardware

2. Hardcoded

3. Simulation via an interpreter

– Most general approach, discussed below

– DFAs and NFAs treated separately

• Implementing DFAs

– Algorithm:

boolean dfaSim (DFA M, string w)
{
    symbol c;
    state st = M.s;
    while (length(w) > 0) {
        c = getNextSymbol(w);      //removes and returns first symbol of w
        st = M.delta(st, c);
    }
    if (st in M.A)
        return TRUE;
    else
        return FALSE;
}

– Run time ∈ O(|w|), assuming transition lookup ∈ O(1)

• Implementing NFAs

1. Convert NFA to DFA

– Then run above simulator on resulting DFA

– Conversion is most expensive operation: O(2^k), where k is number of states

2. Simulate parallel execution of NFA

– See earlier algorithm

– Only generate states as they are needed, rather than generating entire DFA


Finite Automata - Generating a Minimal DFA for a Regular Language

• From an implementational point of view, want the smallest DFA that can accept a given language

• Minimal DFA M is one such that no other DFA M′, where L(M) = L(M′), has fewer states than M

• Questions with important ramifications:

1. For a given language, can a minimal DFA be found?

2. Is a minimal DFA unique?

3. How can it be determined whether a given DFA is minimal?

4. Given a DFA, how is a minimal equivalent constructed?

• Construction based on concept of state clusters

• Indistinguishable strings: Given strings x, y

– x, y are indistinguishable wrt language L iff

∀z ∈ Σ∗ (either both xz and yz ∈ L, or neither xz nor yz ∈ L)

– Denote indistinguishability as x ≈L y

– Strings that are not indistinguishable are distinguishable

– ≈L is an equivalence relation:

1. Reflexive: x ≈L x

2. Symmetric: x ≈L y → y ≈L x

3. Transitive: x ≈L y, y ≈L z → x ≈L z

• Equivalence classes

– Equivalence classes denoted using square brackets:

∗ [n], where n represents a numbered class

∗ [s], where s represents a string in the class

∗ [logical expression], which describes the class

– Every string in Σ∗ belongs to exactly one equivalence class

– To identify equivalence classes

1. Generate strings from shortest to longest, starting with ε

2. For each newly generated string, ask whether it belongs to an existing EC, or whether a new EC must be created for it

3. Continue until a pattern emerges
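The definition of indistinguishability quantifies over all suffixes z, so a program can only probe it up to a bound. The sketch below searches for a distinguishing suffix of bounded length; the membership test `even_as` (even number of a's) and `ends_in_ab` are hypothetical languages chosen for illustration, and a result of None is evidence of indistinguishability rather than a proof.

```python
from itertools import product


def distinguishing_suffix(x, y, in_language, alphabet, max_len):
    """Return a z with |z| <= max_len such that exactly one of xz, yz is in L.

    Returns None if no such z exists up to the bound.
    """
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            z = "".join(letters)
            if in_language(x + z) != in_language(y + z):
                return z
    return None


def even_as(w):
    """Hypothetical language: strings with an even number of a's."""
    return w.count("a") % 2 == 0


def ends_in_ab(w):
    """Hypothetical language: strings that end in "ab"."""
    return w.endswith("ab")
```

Running this on strings generated from shortest to longest is one way to carry out step 2 above: a new string starts a new EC only if it is distinguishable from a representative of every existing EC.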


Finite Automata - Generating a Minimal DFA for a Regular Language (2)

– Note that

1. ε belongs to some EC, which corresponds to the start state of the minimal DFA

2. No EC can contain both strings ∈ L and strings ∉ L

3. More than 1 EC may contain strings that are in L

4. If some strings drive the DFA to the dead state, exactly 1 EC corresponds to the dead state

• Containment: State q of DFA M contains string w if M is in state q after reading w


Finite Automata - Generating a Minimal DFA for a Regular Language (3)

• Theorem 5.4

– Statement: ≈L imposes a lower bound on the minimal number of states of a DFA for L. Let L be a regular language and M be a DFA that accepts L. The number of states in M ≥ the number of ECs of ≈L.

– Proof: See p 86.

• Theorem 5.5

– Statement: There exists a unique minimal DFA for every regular language. Let L be a regular language over alphabet Σ. There is a DFA M that accepts L and has exactly n states, where n = the number of ECs of ≈L. Any other DFA that accepts L must either have more states than M, or n states that are equivalent to those of M.

– Proof: By construction. Create M = (K, Σ, δ, s, A) as follows.

1. Generate the n ECs of ≈L

2. Create one state for each EC

3. Set K to this set of states

4. s = [ε]

5. A = {[x] : x ∈ L}

6. δ([x], a) = [xa]

∗ Must prove the following:

1. K is finite

2. δ is a function

3. L = L(M)

4. M is minimal

5. No other DFA with n states accepts L (except for renaming of states)

∗ Proof of above: See pp 87 - 88

• Theorem 5.6: Myhill-Nerode Theorem

– Statement: A language is regular iff the number of ECs of ≈L is finite

– Proof: See p 90


Finite Automata - Minimizing an Existing DFA

• Previous discussion was concerned with creating a minimal DFA from scratch

• Here the concern is finding a minimal DFA equivalent to an existing DFA

• Two approaches:

1. Iteratively collapse redundant states until cannot collapse any further

2. Partition states into accepting and rejecting

– Iteratively subdivide states based on input characters until no more distinctions can be made

– This is approach described below

• Algorithm based on state equivalence

– States p and q are equivalent iff, for all strings w ∈ Σ∗, either w drives M to accepting states from both p and q, or it drives M to rejecting states from both p and q

– Denoted p ≡ q

– Series of equivalence relations denoted ≡_n, where n ≥ 0

∗ p ≡_n q iff p and q produce the same outcome for all strings of length at most n

∗ Formally,

· p ≡_0 q iff p and q are both accepting or both rejecting

· For n ≥ 1, p ≡_n q iff p ≡_{n−1} q and ∀a ∈ Σ (δ(p, a) ≡_{n−1} δ(q, a))

• Algorithm overview

1. Algorithm starts by constructing ≡_0

– This partitions K into 2 sets

2. Then create ≡_1, ≡_2, ...

– For each case, examine pairs of states in each class wrt every element in Σ

– For any symbol that drives states to 2 different results, partition states into 2 sets

3. Halt when no differences identified


Finite Automata - Minimizing an Existing DFA (2)

• Algorithm

DFA minDFA (DFA M)
{
    classes = {M.A, M.K - M.A};        //partition for ≡_0 (omit a block if it is empty)
    do {
        newClasses = {};
        for (each EC e in classes) {
            //Group the states of e by the classes they reach on each symbol
            groups = {};
            for (each state q in e) {
                for (each c in M.sigma)
                    determine which class in classes q transitions to on c;
                if (some group g in groups transitions to the same classes as q)
                    add q to g;
                else
                    create new group {q} and insert it into groups;
            }
            newClasses = newClasses + groups;   //just e itself if no split occurred
        }
        changed = (newClasses <> classes);
        classes = newClasses;
    } while (changed);
    for (each q in M.K)                //construct deltaMprime
        for (each c in M.sigma)
            if (M.delta(q, c) = p)
                deltaMprime([q], c) = [p];
    return (classes, M.sigma, deltaMprime, [M.s],
            {[q] : q in M.A});
}
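The partition-refinement idea can be sketched in Python as below. The sketch assumes a total transition function given as a dict and assumes every state is reachable from the start state (unreachable states are not removed). The function name and return shape are choices made for this illustration.

```python
def minimize(states, alphabet, delta, start, accepting):
    """Return (classes, new_delta, new_start, new_accepting) for a total DFA."""
    accepting = set(accepting)
    rejecting = set(states) - accepting
    # Initial partition: accepting vs. rejecting, dropping an empty block.
    classes = [frozenset(b) for b in (accepting, rejecting) if b]
    while True:
        index = {q: i for i, block in enumerate(classes) for q in block}
        new_classes = []
        for block in classes:
            groups = {}
            for q in block:
                # Signature: which current class q reaches on each symbol.
                signature = tuple(index[delta[(q, c)]] for c in alphabet)
                groups.setdefault(signature, set()).add(q)
            new_classes.extend(frozenset(g) for g in groups.values())
        if len(new_classes) == len(classes):   # refinement only splits blocks
            break
        classes = new_classes
    block_of = {q: block for block in classes for q in block}
    new_delta = {(block_of[q], c): block_of[delta[(q, c)]]
                 for q in states for c in alphabet}
    new_accepting = {block_of[q] for q in accepting}
    return set(classes), new_delta, block_of[start], new_accepting
```

Since a pass can only split blocks, an unchanged block count is enough to detect that no further distinctions can be made.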


Finite Automata - Minimizing an Existing DFA (3)

• Alternative Algorithm

– Every pair of states has 2 associated structures:

1. D[i, j]: Indicates whether qi is distinguishable (1) from qj or not (0)

2. S[i, j]: Holds a set of index pairs whose distinguishability depends on that of qi and qj

∗ Consider states qm and qn that transition on some symbol to qi and qj, respectively

∗ If qi and qj are known to be distinguishable when qm and qn are examined, then qm and qn are distinguishable

∗ If qi and qj are not known to be distinguishable when qm and qn are examined, then [m, n] is added to S[i, j] because if later qi and qj are found to be distinguishable, then so should qm and qn


Finite Automata - Minimizing an Existing DFA (4)

DFA minDFA2 (DFA M)
{
    for (every state pair qi, qj, i < j) {
        D[i, j] = 0;
        S[i, j] = {};
    }
    for (every i, j where i < j)       //accepting vs. rejecting states are distinguishable
        if (((qi in M.A) and (qj in M.K - M.A)) OR
            ((qj in M.A) and (qi in M.K - M.A)))
            D[i, j] = 1;
    for (every i, j where i < j)
        if (D[i, j] == 0) {
            if (there is a c in M.sigma where qm = M.delta(qi, c), qn = M.delta(qj, c)
                AND ((D[m, n] == 1) OR (D[n, m] == 1)))
                dist(i, j);
            else
                for (each c in M.sigma) {
                    qm = M.delta(qi, c);
                    qn = M.delta(qj, c);
                    //[i, j] depends on the successor pair, indexed smaller-first
                    if ((m < n) AND ([i, j] != [m, n]))
                        S[m, n] = S[m, n] + [i, j];
                    else if ((m > n) AND ([i, j] != [n, m]))
                        S[n, m] = S[n, m] + [i, j];
                }
        }
    //Pairs with D[i, j] == 0 are indistinguishable and can be merged
}

void dist(i, j)
{
    if (D[i, j] == 1)                  //already marked; avoids repeated work
        return;
    D[i, j] = 1;
    for (each [m, n] in S[i, j])
        dist(m, n);
}
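A simpler variant of the pair-marking idea is sketched below in Python. Note the swap: instead of the dependency lists S[i, j], it repeats passes over the unmarked pairs until nothing changes, which is easier to read but can do more work. It assumes a total transition function stored as a dict; the names are illustrative.

```python
from itertools import combinations


def distinguishable_pairs(states, alphabet, delta, accepting):
    """Return the set of frozenset({p, q}) pairs of distinguishable states."""
    states = list(states)
    # Base case: an accepting and a rejecting state are distinguished by epsilon.
    marked = {frozenset((p, q)) for p, q in combinations(states, 2)
              if (p in accepting) != (q in accepting)}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for c in alphabet:
                successors = frozenset((delta[(p, c)], delta[(q, c)]))
                if successors in marked:   # successors already told apart
                    marked.add(pair)
                    changed = True
                    break
    return marked
```

Pairs left unmarked at the end are equivalent states and are the candidates for merging.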


Finite Automata - Canonical Form

• Canonical form is a standard representation

– If 2 objects are equivalent, they will have the same canonical form

– Advantage of canonical forms is that they can be used to test 2 objects for equivalence

• Minimization algorithm can be used to create a canonical form for a DFA

– A minimal DFA for language L(M) is unique, except possibly for state names

– If state names are normalized, have a canonical form

• Algorithm

DFA createCF (FA M)
{
    M' = nfaToDFA (M);       //convert NFA to equivalent DFA
    M'' = minDFA(M');        //convert to equivalent minimal DFA
    rename M''.s as q0;
    named = {q0};
    push(q0, stateStack);
    k = 1;
    while(notEmpty(stateStack)) {
        q = pop(stateStack);
        for (each c in M''.sigma, taken in a fixed order) {
            p = M''.delta(q, c);
            if ((p not NULL) AND (p not in named)) {
                rename p as qk;
                named = named + {p};
                push(p, stateStack);
                k++;
            }
        }
    }
    return M'';
}
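The renaming step can be sketched in Python as follows. The sketch assumes its input is already a minimal DFA (the conversion and minimization steps are not repeated here) and that symbols can be sorted to fix the traversal order. States are renamed to 0, 1, 2, ... rather than q0, q1, q2, ..., which is a cosmetic choice of this illustration.

```python
def canonical_form(alphabet, delta, start, accepting):
    """Rename the reachable states of a DFA to 0, 1, 2, ... in a fixed order.

    Assumption: the DFA passed in is already minimal.
    """
    order = sorted(alphabet)           # fixed symbol order makes the numbering unique
    name = {start: 0}
    stack = [start]
    while stack:
        q = stack.pop()
        for c in order:
            p = delta.get((q, c))
            if p is not None and p not in name:
                name[p] = len(name)    # next unused number
                stack.append(p)
    new_delta = {(name[q], c): name[p]
                 for (q, c), p in delta.items() if q in name}
    new_accepting = {name[q] for q in accepting if q in name}
    return new_delta, 0, new_accepting


def equivalent(dfa1, dfa2):
    """Two minimal DFAs accept the same language iff their canonical forms match."""
    return canonical_form(*dfa1) == canonical_form(*dfa2)
```

Comparing canonical forms with `==` is the equivalence test mentioned above: two minimal DFAs that differ only in state names produce identical results.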
