Deterministic Finite Automaton M = (K, Σ, δ, s, A), s ∈ K start state...
Finite Automata - Deterministic Finite Automata
• Deterministic Finite Automaton (DFA) (or Finite State Machine) M = (K, Σ, δ, s, A), where
– K is a finite set of states
– Σ is an input alphabet
– s ∈ K is a distinguished state called the start state
– A ⊆ K is a set of accepting states
– δ is a transition function, mapping K × Σ → K
• State Diagram: Graphical representation of a DFA
• Transition table can be used to represent transitions
– Rows indexed by states
– Columns indexed by alphabet
– Contents are state transitioned to
– Example:
        0   1
    0   5   1
    1   5   2
    2   3   3
    3   4   4
    4   5   5
    5   5   5
• Configuration of DFA M is element of K × Σ∗
– Represents current state and remaining input
– Initial configuration of M on input w denoted (sM , w)
Finite Automata - Deterministic Finite Automata (2)
• Yields-in-one-step relation relates configurations to their immediate successors
– Denoted ⊢M (for DFA M)
– Let c ∈ Σ, w ∈ Σ∗, qi, qj ∈ K. Then
(qi, cw) ⊢M (qj, w) iff ((qi, c), qj) ∈ δ
– I.e., c is legal transition symbol from qi to qj
• Yields is reflexive, transitive closure of ⊢M
– Denoted ⊢∗M
– Given configurations Ci, Cj
Ci ⊢∗M Cj indicates that M can transition between Ci and Cj in 0 or more steps
• Computation by M is finite sequence of configurations C0, C1, ..., Cn, n ≥ 0, where
C0 is initial configuration
Cn is of the form (q, ε), q ∈ KM
C0 ⊢M C1 ⊢M ... ⊢M Cn
• Given string w ∈ Σ∗
– M accepts w iff (s, w) ⊢∗M (q, ε), q ∈ AM
– (q, ε) called accepting configuration
• Given string w ∈ Σ∗
– M rejects w iff (s, w) ⊢∗M (q, ε), q ∉ AM
– (q, ε) called rejecting configuration
• Language accepted by M denoted L(M)
• Operation (summary)
1. DFA M begins operation in start state
2. Symbols of input string w read one-at-a-time
– Each causes a transition to some state in M
3. After all symbols have been consumed, w is accepted if M is in an accepting state; otherwise, w is rejected
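The operation summary can be traced as a sequence of configurations. The following is a minimal Python sketch, assuming a hypothetical two-state DFA (not taken from these notes) that accepts binary strings with an even number of 1s. The names `DELTA`, `START`, `ACCEPTING`, `computation`, and `accepts` are illustrative only.

```python
# Hypothetical DFA (an assumption for illustration):
# accepts binary strings containing an even number of 1s.
DELTA = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
START = "even"
ACCEPTING = {"even"}

def computation(w):
    """Return the configurations (state, remaining input) M passes through on w."""
    state = START
    configs = [(state, w)]            # initial configuration (s, w)
    while w:
        state = DELTA[(state, w[0])]  # one yields-in-one-step move
        w = w[1:]                     # the symbol just read is consumed
        configs.append((state, w))
    return configs

def accepts(w):
    # w is accepted iff the final configuration is (q, "") with q accepting.
    final_state, _ = computation(w)[-1]
    return final_state in ACCEPTING
```

Each element of the returned list is one configuration, so a string of length n always yields n + 1 configurations, which is consistent with the DFA halting after |w| steps.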
Finite Automata - Deterministic Finite Automata (3)
• Theorem 5.1
– Statement: Every DFA M halts after |w| steps given input w
– Proof: (See p 60)
• State diagram variations
1. If several symbols transition between the same pair of states, represent as a single arc labeled with a comma-separated list of the symbols
2. If the majority of alphabet symbols cause a transition, represent those that do using set difference
– E.g., Σ − {c1, c2, ..., cn}
3. Dead state
– Dead state is rejecting state with no transitions to other states
– Usually denoted d
– Can be eliminated from state diagrams
∗ If no labeled transition from a state for a given symbol, assume transition is to dead state
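The dead-state convention maps naturally onto a partial transition table. A minimal Python sketch, assuming a hypothetical DFA for the language "one a followed by zero or more b's"; the names `PARTIAL_DELTA`, `DEAD`, `step`, and `accepts` are illustrative and not from the notes.

```python
DEAD = "d"  # implicit dead state: rejecting, and never left once entered

# Hypothetical partial table: only the arcs that would be drawn in the diagram.
PARTIAL_DELTA = {("q0", "a"): "q1", ("q1", "b"): "q1"}
ACCEPTING = {"q1"}

def step(state, symbol):
    # A missing entry means the diagram omitted the arc, so go to the dead state.
    # The dead state itself has no entries, so it always maps back to itself.
    return PARTIAL_DELTA.get((state, symbol), DEAD)

def accepts(w, start="q0"):
    state = start
    for symbol in w:
        state = step(state, symbol)
    return state in ACCEPTING
```

Using a default lookup keeps the table as small as the simplified diagram while the simulated transition function remains total.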
Finite Automata - Designing DFAs
• Need to id 2 things about strings w to be input to a DFA:
1. Id properties of prefixes of w that affect the result
– These properties translate into states
2. Id categories of strings
– Many strings will drive DFA to a particular state, which then leads to a fixed result
– Goal is to id these categories of strings
– They will generate a state, which can be named for the cluster for readability
Finite Automata - Nondeterminism
• Given an algorithm A and a set of inputs
– A is deterministic if it always performs the same computation on these inputs
– Otherwise, A is nondeterministic
• A nondeterministic algorithm is one that can "guess" what to do next
• For a given problem, a nondeterministic algorithm may produce different solutions at different times
• Nondeterministic algorithms are sometimes easier to design than deterministic ones
Finite Automata - Nondeterministic FAs
• Nondeterministic Finite Automata (NFAs) are FAs in which transitions are "relaxed" (see below)
• NFA M = (K, Σ, ∆, s, A) where
– K is a finite set of states
– Σ is an input alphabet
– s ∈ K is a distinguished state called the start state
– A ⊆ K is a set of accepting states
– ∆ is a transition relation, a finite subset of (K × (Σ ∪ {ε})) × K
• M accepts w iff one of its computations accepts w
• M rejects w iff none of its computations accepts w
• Language accepted by M denoted L(M)
• NFAs differ from DFAs as follows:
– DFAs are so-called because each symbol results in exactly one transition from a given state
– In NFAs
1. Multiple transitions may occur from a given state for a single symbol
2. Transitions may consume no input symbol
∗ These called ε-transitions
3. They correspond to guesses by M
4. There may be no transition from a given state for a symbol
∗ If input still remains, this results in rejection
Finite Automata - Nondeterministic FAs (2)
• Computation of an NFA can be considered from 2 perspectives:
1. As tree search
– Each node of tree represents a legal configuration
– Children represent configurations reachable in one step from the parent configuration
– Leaves represent configurations in which the input string has been consumed
∗ If any leaf represents an accepting state, the string is accepted
2. As parallel processing
– From a state, all transitions executed in parallel
– NFA moves between sets of states
∗ Given a set of states, the next set consists of all those states reachable from those in the current set, based on the current input symbol
– If the final set contains an accepting state, the input string is accepted
• NFAs useful in following situations:
1. Languages that require complex DFAs
– ε-transitions often enable a much simpler NFA
2. Unions of languages
– Build a DFA for each language
– NFA constructed from a start state with ε-transitions to the start states of each participating DFA
3. Pattern/substring matching
– Have a set of states that read the prefix
– Have an ε-transition to the pattern
– Have a set of states that read the remainder
4. Creating complex DFAs
– NFAs usually easier to create
– Then, convert to DFA (see later notes)
Finite Automata - Analyzing NFAs
• Given an NFA M, want to identify L(M)
• Can do so using 2 approaches discussed previously:
1. Perform depth-first search of all paths through search tree
2. Trace all paths in parallel
• Following discussion deals with parallel trace
• Define function eps as a mapping KM → P(KM), where P(KM) is the power set of KM
– I.e., eps(q) maps from state q to all states reachable from q via ε-transitions:
eps(q) = {p ∈ K : (q, w) ⊢∗M (p, w)}
– To compute eps(q):
stateSet eps (state q, delta)
{
  stateSet result = {q};
  stack = empty stack;
  push(q, stack);
  while (!empty(stack)) {
    p = pop(stack);
    for (each ((p, EPSILON), r) in delta)
      if (r !in result) {
        result = result + {r}; //'+' is union operator
        push(r, stack);
      }
  }
  return result;
}
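The eps computation can also be written as runnable code. A minimal Python sketch, assuming the transition relation is stored as a set of `((state, symbol), state)` pairs and that the empty string stands for ε; both representation choices are assumptions made for illustration.

```python
EPSILON = ""  # assumption: the empty string labels epsilon-transitions

def eps(q, delta):
    """Return the set of states reachable from q using only epsilon-transitions.

    delta is a set of ((state, symbol), state) pairs.
    """
    result = {q}          # q always reaches itself in zero steps
    stack = [q]
    while stack:
        p = stack.pop()
        for (src, symbol), r in delta:
            if src == p and symbol == EPSILON and r not in result:
                result.add(r)
                stack.append(r)  # r's own epsilon-successors still need a visit
    return result
```

Because a state is pushed only when it is first added to `result`, the loop terminates even when the ε-transitions form a cycle.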
Finite Automata - Analyzing NFAs (2)
• To simulate parallel computation of an NFA:
boolean nfaSimulate (NFA M, string w)
{
  stateSet currentState = eps(M.s, M.delta);
  while (length(w) > 0) {
    stateSet nextState = {}; //empty set
    char c = getNextSymbol(w);
    for (each state q in currentState)
      for (each ((q, c), p) in M.delta)
        nextState = nextState + eps(p, M.delta);
    currentState = nextState; //advance to the next set of states
  }
  for (each state q in currentState)
    if (q in M.A)
      return TRUE;
  return FALSE;
}
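The parallel simulation can be sketched in Python as follows. This is an illustration under assumptions: the relation is a set of `((state, symbol), state)` pairs, the empty string stands for ε, and the example NFA (strings over {a, b} ending in "ab") is hypothetical rather than taken from the notes.

```python
EPSILON = ""  # assumption: the empty string labels epsilon-transitions

def eps(q, delta):
    """States reachable from q via epsilon-transitions only."""
    result, stack = {q}, [q]
    while stack:
        p = stack.pop()
        for (src, symbol), r in delta:
            if src == p and symbol == EPSILON and r not in result:
                result.add(r)
                stack.append(r)
    return result

def nfa_simulate(delta, start, accepting, w):
    """Track the set of states the NFA could be in after each input symbol."""
    current = eps(start, delta)
    for c in w:
        nxt = set()
        for (src, symbol), p in delta:
            if src in current and symbol == c:
                nxt |= eps(p, delta)
        current = nxt  # advance: the next set becomes the current set
    return bool(current & accepting)

# Hypothetical NFA: guesses when the final "ab" begins.
DELTA = {
    (("q0", "a"), "q0"), (("q0", "b"), "q0"),
    (("q0", "a"), "q1"), (("q1", "b"), "q2"),
}
```

If the current set ever becomes empty, every later set is empty too, so the string is rejected without any special case.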
Finite Automata - Nondeterminism and Implementation
• 2 possible approaches to implementing nondeterminism:
1. choose (action1;;
action2;;
...
actionn;;)
– Each action produces a solution or FALSE
– Semantics:
∗ If any action produces a solution, choose will halt and return a solution
∗ Otherwise,
· If all actions halt, choose halts (returning FALSE)
· If any action fails to halt, choose will fail to halt
– Need a methodical way of selecting actions
2. choose (x from S: P(x))
– S is a set of values (finite or infinite)
– Semantics:
∗ If P(x) produces a solution for any x, choose will halt and return a solution
∗ Otherwise,
· If P(x) halts for all x, choose halts (returning FALSE)
· If any computation of P(x) fails to halt, choose will fail to halt
– Need a methodical way of selecting x ∈ S
• Can associate probabilities with choices
– Options with higher probabilities are more likely to be chosen than those with lower probabilities
– Useful when have a priori knowledge of problem domain
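One methodical way of selecting actions is to interleave them round-robin, so that an action that never halts cannot block one that does. The following Python sketch is only one possible reading of the first form of choose. It assumes each action is written as a generator function that yields while still working and finishes by returning a solution or `False`; `choose`, `search_up`, and `fail` are hypothetical names.

```python
from collections import deque

def choose(actions):
    """Run generator-based actions one step at a time, in rotation."""
    running = deque(action() for action in actions)
    while running:
        gen = running.popleft()
        try:
            next(gen)            # give this action one step
            running.append(gen)  # not finished yet, so requeue it
        except StopIteration as done:
            if done.value is not False:
                return done.value  # some action produced a solution
    return False  # every action halted without a solution

def search_up(target, start):
    """Hypothetical action: count upward from start looking for target."""
    def action():
        n = start
        while True:              # never halts when target < start
            if n == target:
                return n
            n += 1
            yield
    return action

def fail():
    """Hypothetical action that halts immediately without a solution."""
    return False
    yield  # unreachable, but makes this a generator function
```

With this scheme `choose([search_up(5, 10), search_up(5, 0)])` returns even though the first action alone would run forever. If no action produces a solution and at least one never halts, `choose` itself never halts, matching the stated semantics.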
Finite Automata - Equivalence of DFAs and NFAs
• Theorem 5.2
– Statement: For every DFA that accepts L, there is an equivalent NFA that accepts L
– Proof: (See p 74)
• Theorem 5.3
– Statement: For every NFA that accepts L, there is an equivalent DFA that accepts L
– Proof: (By construction)
∗ Given: NFA M = (K,Σ,∆, s, A)
∗ Create M′ = (K′, Σ, δ′, s′, A′), where
K′ contains 1 state for each element of P(K)
s′ = eps(s)
A′ = {Q ∈ K′ : Q ∩ A ≠ ∅}
δ′(Q, c) = ⋃{eps(p) : ∃q ∈ Q (((q, c), p) ∈ ∆)}
∗ Note:
1. In most cases, only a small subset of K′ is actually needed
2. Accepting states of M′ (the elements of A′) are those that contain states from A
Finite Automata - Equivalence of DFAs and NFAs (2)
– Construction algorithm:
DFA nfaToDFA (NFA M)
{
  for (each state q in M.K)
    compute eps(q, M.delta);
  stateSet s' = eps(M.s, M.delta);
  //Compute delta'
  setOfStateSets activeStates = {s'};
  delta' = NULL;
  while (there is a Q in activeStates that has not been processed)
    for (each symbol c in M.sigma) {
      stateSet newState = NULL;
      for (each state q in Q)
        for (each state p where ((q, c), p) is in M.delta)
          newState = newState + eps(p, M.delta);
      //record the transition only after every q in Q has been examined
      delta' = delta' + {((Q, c), newState)};
      if (newState !in activeStates)
        activeStates = activeStates + {newState};
    }
  K' = activeStates;
  A' = {Q in K': Q % M.A <> NULL}; //'%' is intersection operator
  return M' = (K', M.sigma, delta', s', A');
}
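The construction can be sketched in Python, generating only the reachable subsets. This is an illustration, assuming the same set-of-pairs representation of ∆ used for eps and the empty string for ε; DFA states are represented as `frozenset` objects so they can serve as dictionary keys, and `nfa_to_dfa` is a hypothetical name.

```python
EPSILON = ""  # assumption: the empty string labels epsilon-transitions

def eps(q, delta):
    """States reachable from q via epsilon-transitions only."""
    result, stack = {q}, [q]
    while stack:
        p = stack.pop()
        for (src, symbol), r in delta:
            if src == p and symbol == EPSILON and r not in result:
                result.add(r)
                stack.append(r)
    return result

def nfa_to_dfa(sigma, delta, start, accepting):
    """Subset construction; returns (K', delta', s', A')."""
    start_set = frozenset(eps(start, delta))
    active = [start_set]      # discovered DFA states, in discovery order
    dfa_delta = {}
    i = 0
    while i < len(active):    # each discovered state is processed exactly once
        Q = active[i]
        i += 1
        for c in sorted(sigma):
            new_state = set()
            for (src, symbol), p in delta:
                if src in Q and symbol == c:
                    new_state |= eps(p, delta)
            new_state = frozenset(new_state)
            dfa_delta[(Q, c)] = new_state
            if new_state not in active:
                active.append(new_state)
    dfa_accepting = {Q for Q in active if Q & accepting}
    return set(active), dfa_delta, start_set, dfa_accepting
```

The empty frozenset, when it is reached, plays the role of the dead state in the resulting DFA.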
Finite Automata - Implementation
• Can implement FAs in several ways
1. In hardware
2. Hardcoded
3. Simulation via an interpreter
– Most general approach, discussed below
– DFAs and NFAs treated separately
• Implementing DFAs
– Algorithm:
boolean dfaSim (DFA M, string w)
{
  symbol c;
  state st = M.s;
  while (length(w) > 0) {
    c = getNextSymbol(w);
    st = M.delta(st, c);
  }
  if (st in M.A)
    return TRUE;
  else
    return FALSE;
}
– Run time ∈ O(|w|), assuming transition lookup ∈ O(1)
• Implementing NFAs
1. Convert NFA to DFA
– Then run above simulator on resulting DFA
– Conversion is most expensive operation: O(2^k), where k is number of states of the NFA
2. Simulate parallel execution of NFA
– See earlier algorithm
– Only generate states as they are needed, rather than generating entire DFA
Finite Automata - Generating a Minimal DFA for a Regular Language
• From an implementational point of view, want the smallest DFA that can accept a given language
• Minimal DFA M is one such that no other DFA M′, where L(M′) = L(M), has fewer states than M
• Questions with important ramifications:
1. For a given language, can a minimal DFA be found?
2. Is a minimal DFA unique?
3. How can it be determined whether a given DFA is minimal?
4. Given a DFA, how is a minimal equivalent constructed?
• Construction based on concept of state clusters
• Indistinguishable strings: Given strings x, y
– x, y are indistinguishable wrt language L iff
∀z ∈ Σ∗ (either both xz and yz ∈ L, or neither xz nor yz ∈ L)
– Denote indistinguishability as x ≈L y
– Strings that are not indistinguishable are distinguishable
– ≈L is an equivalence relation:
1. Reflexive: x ≈L x
2. Symmetric: x ≈L y → y ≈L x
3. Transitive: x ≈L y, y ≈L z → x ≈L z
• Equivalence classes
– Equivalence classes denoted using square brackets:
∗ [n], where n represents a numbered class
∗ [s], where s represents a string in the class
∗ [logical expression], which describes the class
– Every string in Σ∗ belongs to exactly one equivalence class
– To id equivalence classes
1. Generate strings from shortest to longest, starting with ε
2. For each newly generated string, ask whether it belongs to an existing EC, or whether a new EC must be created for it
3. Continue until a pattern emerges
Finite Automata - Generating a Minimal DFA for a Regular Language (2)
– Note that
1. ε belongs to some EC, which corresponds to the start state of the minimal DFA
2. No EC can contain both strings ∈ L and strings ∉ L
3. More than 1 EC may contain strings that are in L
4. If some strings drive the DFA to a dead state, exactly 1 EC corresponds to the dead state
• Containment: State q of DFA M contains string w if M is in state q after reading w
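The procedure for identifying equivalence classes can be approximated mechanically. The Python sketch below generates strings from shortest to longest and groups those whose membership in L agrees for every suffix z up to a bounded length. The bound is an assumption of the sketch: two strings that only a longer suffix would distinguish are wrongly merged, so the result is a lower estimate of the true classes of ≈L. The function names are hypothetical.

```python
from itertools import product

def strings_up_to(sigma, n):
    """All strings over sigma of length 0..n, shortest first."""
    for length in range(n + 1):
        for letters in product(sorted(sigma), repeat=length):
            yield "".join(letters)

def approximate_classes(in_language, sigma, max_len=4, max_suffix=4):
    """Group strings x that agree on membership of xz for every tested suffix z."""
    suffixes = list(strings_up_to(sigma, max_suffix))
    classes = {}
    for x in strings_up_to(sigma, max_len):
        # The signature records, for each suffix z, whether xz is in L.
        signature = tuple(in_language(x + z) for z in suffixes)
        classes.setdefault(signature, []).append(x)
    return list(classes.values())

# Hypothetical example language: strings over {a, b} with an even number of a's.
def even_as(w):
    return w.count("a") % 2 == 0
```

For `even_as` the sketch finds two classes, and the first one contains ε, corresponding to the start state of the minimal DFA.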
Finite Automata - Generating a Minimal DFA for a Regular Language (3)
• Theorem 5.4
– Statement: ≈L imposes a lower bound on the minimal number of states of a DFA for L. Let L be a regular language and M be a DFA that accepts L. The number of states in M ≥ the number of ECs of ≈L.
– Proof: See p 86.
• Theorem 5.5
– Statement: There exists a unique minimal DFA for every regular language. Let L be a regular language over alphabet Σ. There is a DFA M that accepts L and has exactly n states, where n = the number of ECs of ≈L. Any other DFA that accepts L must either have more states than M, or n states that are equivalent to those of M.
– Proof: By construction. Create M = (K, Σ, δ, s, A) as follows.
1. Generate the n ECs of ≈L
2. Create one state for each EC
3. Set K to this set of states
4. s = [ε]
5. A = {[x] : x ∈ L}
6. δ([x], a) = [xa]
∗ Must prove the following:
1. K is finite
2. δ is a function
3. L = L(M)
4. M is minimal
5. No other DFA with n states accepts L
∗ Proof of above: See pp 87 - 88
• Theorem 5.6: Myhill-Nerode Theorem
– Statement: A language is regular iff the number of ECs of ≈L is finite
– Proof: See p 90
Finite Automata - Minimizing an Existing DFA
• Previous discussion was concerned with creating a minimal DFA from scratch
• Here the concern is finding a minimal DFA equivalent to an existing DFA
• Two approaches:
1. Iteratively collapse redundant states until cannot collapse any further
2. Partition states into accepting and rejecting
– Iteratively subdivide states based on input characters until no more distinctions can be made
– This is approach described below
• Algorithm based on state equivalence
– States p and q are equivalent iff, for all strings w ∈ Σ∗, either w drives M to accepting states from both p and q, or it drives M to rejecting states from both p and q
– Denoted p ≡ q
– Series of equivalence relations denoted ≡n, where n ≥ 0
∗ p ≡n q iff p and q produce the same outcome for all strings of length n or less
∗ Formally,
· p ≡0 q iff p and q are both accepting or both rejecting
· For n ≥ 1, p ≡n q iff p ≡n−1 q and ∀a ∈ Σ (δ(p, a) ≡n−1 δ(q, a))
• Algorithm overview
1. Algorithm starts by constructing ≡0
– This partitions K into 2 sets
2. Then create ≡1, ≡2, ...
– For each case, examine pairs of states in each class wrt every element in Σ
– For any symbol that drives states to 2 different results, partition states into 2 sets
3. Halt when no differences id’d
Finite Automata - Minimizing an Existing DFA (2)
• Algorithm
DFA minDFA (DFA M)
{
  classes = {M.A, M.K - M.A}; //initial partition (omit a set if it is empty)
  do {
    newClasses = NULL;
    for (each EC e in classes)
      for (each state q in e) {
        for (each c in M.sigma)
          determine which class in classes q transitions to on c;
        if (a class e' already created from e on this pass holds states
            that transition to the same classes as q on every c)
          add q to e';
        else {
          create new class e' = {q};
          insert e' into newClasses;
        }
      }
    changed = (newClasses has more classes than classes);
    classes = newClasses;
  } while (changed); //halt when no class was split
  for (each q in M.K) //construct deltaMprime
    for (each c in M.sigma)
      if (M.delta(q, c) = p)
        deltaMprime([q], c) = [p];
  return (classes, M.sigma, deltaMprime, [M.s],
          {[q] : q in M.A});
}
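The partition-refinement idea can be sketched in runnable form. The Python below is an illustration, assuming a total transition function stored as a dictionary keyed by `(state, symbol)` and a DFA whose unreachable states have already been removed; `minimize` and its return layout are hypothetical.

```python
def minimize(states, sigma, delta, start, accepting):
    """Refine {accepting, non-accepting} until no class splits further."""
    symbols = sorted(sigma)
    classes = [c for c in (set(accepting), set(states) - set(accepting)) if c]
    while True:
        # index[q] is the number of the class that currently contains q.
        index = {q: i for i, cls in enumerate(classes) for q in cls}
        new_classes = []
        for cls in classes:
            groups = {}
            for q in cls:
                # States stay together only if they agree on every symbol.
                key = tuple(index[delta[(q, c)]] for c in symbols)
                groups.setdefault(key, set()).add(q)
            new_classes.extend(groups.values())
        if len(new_classes) == len(classes):
            break  # no class was split on this pass
        classes = new_classes
    blocks = [frozenset(cls) for cls in classes]
    block_of = {q: b for b in blocks for q in b}
    min_delta = {(block_of[q], c): block_of[delta[(q, c)]]
                 for q in states for c in symbols}
    min_accepting = {block_of[q] for q in accepting}
    return set(blocks), min_delta, block_of[start], min_accepting
```

Refinement only ever splits classes, so comparing the number of classes before and after a pass is enough to detect that nothing changed.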
Finite Automata - Minimizing an Existing DFA (3)
• Alternative Algorithm
– Every pair of states has 2 associated structures:
1. D[i, j]: Indicates whether qi is distinguishable (1) from qj or not (0)
2. S[i, j]: Holds a set of index pairs whose distinguishability depends on that of qi and qj
∗ Consider states qm and qn that transition to qi and qj, respectively, on some symbol
∗ If qi and qj are known to be distinguishable when qm and qn are examined, then qm and qn are distinguishable
∗ If qi and qj are not distinguishable when qm and qn are examined, then qm and qn are added to S[i, j] because if later qi and qj are found to be distinguishable, then so should qm and qn
Finite Automata - Minimizing an Existing DFA (4)
DFA minDFA2 (DFA M)
{
  for (every state pair qi, qj, i < j) {
    D[i, j] = 0;
    S[i, j] = NULL;
  }
  for (every i, j where i < j)
    if (((qi in M.A) AND (qj in M.K - M.A)) OR
        ((qj in M.A) AND (qi in M.K - M.A)))
      D[i, j] = 1;
  for (every i, j where i < j)
    if (D[i, j] == 0) {
      if (there is a c in M.sigma where qm = M.delta(qi, c),
          qn = M.delta(qj, c), AND ((D[m, n] == 1) OR (D[n, m] == 1)))
        dist(i, j);
      else
        for (each c in M.sigma) {
          qm = M.delta(qi, c);
          qn = M.delta(qj, c);
          if ((m < n) AND ([i, j] != [m, n]))
            S[m, n] = S[m, n] + [i, j];
          else if ((m > n) AND ([i, j] != [n, m]))
            S[n, m] = S[n, m] + [i, j];
        }
    }
  //pairs with D[i, j] == 0 are indistinguishable
  merge each group of indistinguishable states into a single state;
  return the resulting DFA;
}
void dist(i, j)
{
  D[i, j] = 1;
  for (each [m, n] in S[i, j])
    if (D[m, n] == 0) //skip pairs already marked, so cyclic dependencies terminate
      dist(m, n);
}
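The marking phase of this alternative algorithm can be sketched in Python. The version below is an illustration under assumptions: state pairs are represented as two-element `frozenset` objects rather than index pairs, the transition function is a total dictionary, and the recursive dist is replaced by an explicit stack. It returns only the set of distinguishable pairs and leaves the merging step out.

```python
from itertools import combinations

def distinguishable_pairs(states, sigma, delta, accepting):
    """Return the set of state pairs marked distinguishable."""
    pairs = [frozenset(p) for p in combinations(sorted(states), 2)]
    D = {p: False for p in pairs}   # D[p]: is the pair known distinguishable?
    S = {p: set() for p in pairs}   # S[p]: pairs that depend on p

    def dist(pair):
        # Mark pair and, transitively, every pair waiting on it.
        stack = [pair]
        while stack:
            p = stack.pop()
            if not D[p]:
                D[p] = True
                stack.extend(S[p])

    for p in pairs:
        qi, qj = tuple(p)
        if (qi in accepting) != (qj in accepting):
            D[p] = True
    for p in pairs:
        if D[p]:
            continue
        qi, qj = tuple(p)
        succ = [frozenset((delta[(qi, c)], delta[(qj, c)])) for c in sorted(sigma)]
        succ = [t for t in succ if len(t) == 2]  # identical successors decide nothing
        if any(D[t] for t in succ):
            dist(p)
        else:
            for t in succ:
                if t != p:
                    S[t].add(p)  # revisit p if t is marked later
    return {p for p in pairs if D[p]}
```

Pairs absent from the returned set are indistinguishable and can be merged to obtain the minimal DFA.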
Finite Automata - Canonical Form
• Canonical form is a standard representation
– If 2 objects are equivalent, they will have the same canonical form
– Advantage of canonical forms is that they can be used to test 2 objects for equivalence
• Minimization algorithm can be used to create a canonical form for a DFA
– A minimal DFA for language L(M) is unique, except possibly for state names
– If normalize state names, have a canonical form
• Algorithm
DFA createCF (FA M)
{
  M' = nfaToDFA (M); //convert NFA to equivalent DFA
  M'' = minDFA(M'); //convert to equivalent minimal DFA
  rename M''.s as q0;
  named = {q0};
  push(q0, stateStack);
  k = 1;
  while (notEmpty(stateStack)) {
    q = pop(stateStack);
    for (each c in M''.sigma, taken in a fixed order) {
      p = M''.delta(q, c);
      if ((p not NULL) AND (p not in named)) {
        rename p as qk;
        named = named + {p};
        push(p, stateStack);
        k++;
      }
    }
  }
  return M'';
}
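The renaming step can be sketched in Python and used to compare two DFAs. This illustration assumes its input is already a minimal DFA with a dictionary transition function; it performs only the state renaming, not the conversion and minimization steps. The name `canonical_form` and the returned tuple layout are hypothetical.

```python
def canonical_form(sigma, delta, start, accepting):
    """Rename states q0, q1, ... in the order a stack-based traversal finds them."""
    symbols = sorted(sigma)          # a fixed symbol order makes the naming repeatable
    name = {start: "q0"}
    stack = [start]
    while stack:
        q = stack.pop()
        for c in symbols:
            p = delta.get((q, c))
            if p is not None and p not in name:
                name[p] = f"q{len(name)}"
                stack.append(p)
    new_delta = {(name[q], c): name[p]
                 for (q, c), p in delta.items() if q in name}
    new_accepting = {name[q] for q in accepting if q in name}
    return set(name.values()), new_delta, "q0", new_accepting

# Two hypothetical minimal DFAs for "even number of 1s" with different state names.
D1 = {("even", "0"): "even", ("even", "1"): "odd",
      ("odd", "0"): "odd", ("odd", "1"): "even"}
D2 = {("X", "0"): "X", ("X", "1"): "Y", ("Y", "0"): "Y", ("Y", "1"): "X"}
```

Comparing `canonical_form({"0", "1"}, D1, "even", {"even"})` with `canonical_form({"0", "1"}, D2, "X", {"X"})` by plain equality then tests whether the two machines accept the same language, provided both inputs are minimal.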