Automata and Grammars: Regular Expressions, Kleene Theorem, Subst., Homom. 4, August 9, 2019, slides 75-95
### Transcript of Notes to Closure Properties

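Simplification of the automata design:

$L.\emptyset = \emptyset.L = \emptyset$
$\{\lambda\}.L = L.\{\lambda\} = L$
$(L^*)^* = L^*$
$(L_1 \cup L_2)^* = L_1^*(L_2.L_1^*)^* = L_2^*(L_1.L_2^*)^*$
$(L_1.L_2)^R = L_2^R.L_1^R$
$\partial_w(L_1 \cup L_2) = \partial_w(L_1) \cup \partial_w(L_2)$
$\partial_w(\Sigma^* - L) = \Sigma^* - \partial_w L$

Proof of non-regularity: $L = \{w \mid w \in \{0, 1\}^*, |w|_0 = |w|_1\}$ is not regular. If it were, then $L \cap \{0^i 1^j \mid i, j \ge 0\} = \{0^i 1^i \mid i \ge 0\}$ would be regular too, because regular languages are closed under intersection, and it is not (pumping lemma).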
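The pumping-lemma step is only cited above. A sketch of the standard argument follows; the names $L'$ and $n$ (the pumping constant) are introduced here for illustration and do not appear in the slides.

```latex
\begin{align*}
&\text{Assume } L' = \{0^i 1^i \mid i \ge 0\} \text{ is regular, with pumping constant } n. \\
&\text{Take } w = 0^n 1^n \in L'. \text{ For every split } w = xyz \text{ with } |xy| \le n,\ |y| = k \ge 1
  \text{ we get } y = 0^k. \\
&\text{Pumping down gives } xz = 0^{n-k} 1^n \notin L' \text{ because } n - k \ne n, \text{ a contradiction.}
\end{align*}
```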
Regular Expressions (RegE)
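Definition 4.1 (Regular expression (RegE), value of a RegE $L(\alpha)$). Regular expressions $\alpha, \beta \in RegE(\Sigma)$ over a finite non-empty alphabet $\Sigma = \{x_1, x_2, \ldots, x_n\}$ and their values $L(\alpha)$ are defined by induction.

Basis:

| expression $\alpha$ | for | value $L(\alpha) \equiv [\alpha]$ |
|---|---|---|
| $\emptyset$ | empty expression | $L(\emptyset) = \{\} \equiv \emptyset$ |
| $\lambda$ | empty string | $L(\lambda) = \{\lambda\}$ |
| $a$ | $a \in \Sigma$ | $L(a) = \{a\}$ |

Induction:

| expression | value | remark |
|---|---|---|
| $\alpha + \beta$ | $L(\alpha + \beta) = L(\alpha) \cup L(\beta)$ | |
| $\alpha\beta$ | $L(\alpha\beta) = L(\alpha)L(\beta)$ | the dot $.$ may be used |
| $\alpha^*$ | $L(\alpha^*) = L(\alpha)^*$ | |
| $(\alpha)$ | $L((\alpha)) = L(\alpha)$ | brackets do not change the value |

The class of regular expressions over $\Sigma$, $RegE(\Sigma)$, is the smallest class closed under the operations above.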
Examples, Precedence
Example 4.1 (Regular expressions). The language of alternating 0's and 1's may be written either as $(01)^* + (10)^* + 1(01)^* + 0(10)^*$ or as $(\lambda + 1)(01)^*(\lambda + 0)$.
The language $L((0^*10^*10^*1)^*0^*) = \{w \mid w \in \{0, 1\}^*, |w|_1 = 3k, k \ge 0\}$.
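The claims of Example 4.1 can be checked by brute force on short strings. The sketch below is an illustration, not part of the slides: it translates the three expressions into Python's `re` syntax (`+` becomes `|`, and $\lambda + a$ becomes `a?`) and compares them on all binary strings up to length 10. The helper names `words` and `alternating` are made up for this sketch.

```python
import re
from itertools import product

# Textbook '+' is written '|' here, and (λ + 1) is written as the optional symbol '1?'.
FOUR_CASES = re.compile(r"(?:01)*|(?:10)*|1(?:01)*|0(?:10)*")
COMPACT = re.compile(r"1?(?:01)*0?")
ONES_MULTIPLE_OF_3 = re.compile(r"(?:0*10*10*1)*0*")


def words(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def alternating(w):
    """True if no two neighbouring symbols of w are equal."""
    return all(a != b for a, b in zip(w, w[1:]))


for w in words("01", 10):
    # Both expressions describe exactly the alternating strings.
    assert bool(FOUR_CASES.fullmatch(w)) == alternating(w)
    assert bool(COMPACT.fullmatch(w)) == alternating(w)
    # The third expression accepts exactly the strings whose number of 1's is 3k.
    assert bool(ONES_MULTIPLE_OF_3.fullmatch(w)) == (w.count("1") % 3 == 0)
```

A bounded check like this is evidence, not a proof; it only rules out counterexamples up to the chosen length.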
Definition 4.2 (Precedence). The star $*$ is the operator with the highest precedence, followed by concatenation $.$; the union $+$ has the lowest precedence.
Theorem 4.1 (RegE and DFA, a variant of the Kleene theorem). Any language recognizable by a DFA can be expressed by a regular expression. Any language of a regular expression can be recognized by a λ-NFA (and therefore also by a DFA).
Example
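| expression | construction | simplified |
|---|---|---|
| $R^{(0)}_{11}$ | $\lambda + 1$ | |
| $R^{(1)}_{11}$ | $\lambda + 1 + (\lambda + 1)(\lambda + 1)^*(\lambda + 1)$ | $1^*$ |
| $R^{(1)}_{12}$ | $0 + (\lambda + 1)(\lambda + 1)^*0$ | $1^*0$ |
| $R^{(1)}_{21}$ | $\emptyset + \emptyset(\lambda + 1)^*(\lambda + 1)$ | $\emptyset$ |
| $R^{(1)}_{22}$ | $\lambda + 0 + 1 + \emptyset(\lambda + 1)^*0$ | $\lambda + 0 + 1$ |
| $R^{(2)}_{11}$ | $1^* + 1^*0(\lambda + 0 + 1)^*\emptyset$ | $1^*$ |
| $R^{(2)}_{12}$ | $1^*0 + 1^*0(\lambda + 0 + 1)^*(\lambda + 0 + 1)$ | $1^*0(0 + 1)^*$ |
| $R^{(2)}_{21}$ | $\emptyset + (\lambda + 0 + 1)(\lambda + 0 + 1)^*\emptyset$ | $\emptyset$ |
| $R^{(2)}_{22}$ | $\lambda + 0 + 1 + (\lambda + 0 + 1)(\lambda + 0 + 1)^*(\lambda + 0 + 1)$ | $(0 + 1)^*$ |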
From a DFA to RegE
Let $A$ be a DFA with $n$ states, $Q_A = \{1, \ldots, n\}$. Let $R^{(k)}_{ij}$ be a regular expression with $L(R^{(k)}_{ij}) = \{w \mid \delta^*_{\le k}(i, w) = j\}$, the set of words that take state $i$ to state $j$ in $A$ along a path on which no intermediate state has an index higher than $k$. We construct $R^{(k)}_{ij}$ iteratively for $k = 0, \ldots, n$.
$k = 0$, $i \ne j$: $R^{(0)}_{ij} = a_1 + a_2 + \ldots + a_m$, where $a_1, a_2, \ldots, a_m$ are the symbols on edges from $i$ to $j$ (so $R^{(0)}_{ij} = \emptyset$ for $m = 0$ and $R^{(0)}_{ij} = a$ for $m = 1$).
$k = 0$, $i = j$: loops, $R^{(0)}_{ii} = \lambda + a_1 + a_2 + \ldots + a_m$, where $a_1, a_2, \ldots, a_m$ are the symbols on loops at $i$.
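Induction. We have $R^{(k)}_{ij}$ for all $i, j \in Q$. We construct

$$R^{(k+1)}_{ij} = R^{(k)}_{ij} + R^{(k)}_{i(k+1)} \left(R^{(k)}_{(k+1)(k+1)}\right)^* R^{(k)}_{(k+1)j}.$$

Paths from $i$ to $j$ that do not pass through state $(k + 1)$ are already in $R^{(k)}_{ij}$.
Paths from $i$ to $j$ through $(k + 1)$, with possible loops, are expressed by $R^{(k)}_{i(k+1)} (R^{(k)}_{(k+1)(k+1)})^* R^{(k)}_{(k+1)j}$.
Finally, $RegE = \bigoplus_{j \in F_A} R^{(n)}_{1j}$, the union over all accepting states $j$ (state 1 being the initial state).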
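The iteration over $k$ can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lecture's implementation: expressions are kept as Python `re` pattern strings, `None` stands for $\emptyset$, the empty string stands for $\lambda$, and the helper names (`union`, `concat`, `star`, `dfa_to_regex`) are invented here. Only trivial simplifications involving $\emptyset$ are applied, so the output is long. The two-state DFA used for the check is assumed to be the one behind the $R^{(k)}_{ij}$ table of the example (state 1 loops on 1 and moves to state 2 on 0; state 2 loops on 0 and 1 and is accepting).

```python
import re
from itertools import product

EMPTY = None  # the expression ∅; the empty Python string "" plays the role of λ


def union(a, b):
    if a is EMPTY:
        return b
    if b is EMPTY:
        return a
    return a if a == b else f"(?:{a}|{b})"


def concat(a, b):
    return EMPTY if a is EMPTY or b is EMPTY else a + b


def star(a):
    return "" if a is EMPTY or a == "" else f"(?:{a})*"


def dfa_to_regex(n, delta, finals, start=1):
    """States are 1..n; delta maps (state, symbol) -> state. Returns a pattern string."""
    # k = 0: λ on the diagonal plus the symbols written on direct edges.
    R = [["" if i == j else EMPTY for j in range(n + 1)] for i in range(n + 1)]
    for (p, a), q in delta.items():
        R[p][q] = union(R[p][q], re.escape(a))
    # Step k: additionally allow state k as an intermediate state.
    for k in range(1, n + 1):
        R = [[union(R[i][j], concat(concat(R[i][k], star(R[k][k])), R[k][j]))
              for j in range(n + 1)] for i in range(n + 1)]
    result = EMPTY
    for f in sorted(finals):
        result = union(result, R[start][f])
    return result


example_delta = {(1, "1"): 1, (1, "0"): 2, (2, "0"): 2, (2, "1"): 2}
pattern = dfa_to_regex(2, example_delta, finals={2})

# The DFA accepts exactly the strings containing a 0; compare on all short strings.
for n in range(7):
    for letters in product("01", repeat=n):
        w = "".join(letters)
        assert (re.fullmatch(pattern, w) is not None) == ("0" in w)
```

The unsimplified result is equivalent to the table entry $R^{(2)}_{12} = 1^*0(0 + 1)^*$, just much longer, which illustrates why the size can grow quickly with $n$.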
DFA to RegE by Successive State Elimination
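The previous method may generate expressions with up to $4^n$ symbols. The following algorithm sometimes avoids the duplication. We allow regular expressions as edge labels of the graph (a transformation of the automaton).
[Figure: a state $s$ selected for elimination, with loop $S$, predecessors $q_1, \ldots, q_k$ (edges $Q_i$ into $s$), successors $p_1, \ldots, p_m$ (edges $P_j$ out of $s$) and direct edges $R_{ij}$. After the elimination the edge from $q_i$ to $p_j$ is labelled $R_{ij} + Q_i S^* P_j$, for example $R_{2m} + Q_2 S^* P_m$, $R_{k1} + Q_k S^* P_1$, $R_{km} + Q_k S^* P_m$.]
A RegE from a DFA
For every accepting state $q \in F$ we eliminate all states $p \in Q \setminus \{q, q_0\}$.
For $q \ne q_0$ we take $RegE(q) = (R + SU^*T)^*SU^*$. [Figure: two states $q_0$ and $q$ with loop $R$ on $q_0$, edge $S$ from $q_0$ to $q$, loop $U$ on $q$ and edge $T$ from $q$ to $q_0$.]
For $q = q_0$ we take $RegE(q) = R^*$. [Figure: the single state $q_0$ with loop $R$.]
And take the union (addition) over all accepting states: $RegE(DFA) = \bigoplus_{q \in F} RegE(q)$.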
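Example
Example 4.2. An automaton that accepts strings with 1 at the second-last or the third-last position.
[Figures: the automaton on states A, B, C, D with edge labels 0+1, 1, 0+1, 0+1; the reduced automata on states A, C, D and on states A, D, the latter with labels 0+1 and 1(0+1)(0+1).]
We replace the edge labels by regular expressions.
We eliminate C.
We get the RegE $(0 + 1)^*1(0 + 1) + (0 + 1)^*1(0 + 1)(0 + 1)$.
Elimination order: we start with nodes that are neither accepting nor initial, $q \notin F$, $q \ne q_0$.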
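The resulting expression can be compared with the informal description of the language. The following sketch, an illustration only, writes the RegE in Python's `re` syntax and collects every binary string up to length 8 on which the pattern and the property "1 in the second-last or third-last position" disagree; the helper name `one_near_end` is made up here.

```python
import re
from itertools import product

# (0+1)*1(0+1) + (0+1)*1(0+1)(0+1) in Python's re syntax
PATTERN = re.compile(r"(?:0|1)*1(?:0|1)|(?:0|1)*1(?:0|1)(?:0|1)")


def one_near_end(w):
    """True if w has a 1 in the second-last or the third-last position."""
    # The slices are empty strings when w is too short, so no index checks are needed.
    return w[-2:-1] == "1" or w[-3:-2] == "1"


mismatches = []
for n in range(9):
    for letters in product("01", repeat=n):
        w = "".join(letters)
        if (PATTERN.fullmatch(w) is not None) != one_near_end(w):
            mismatches.append(w)

assert mismatches == []
```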
From a RegE to λ–NFA
By induction on the structure of $R$. In each step we construct a λ-NFA $E$ that recognizes the same language, $L(R) = L(E)$, with three additional properties:
1. Exactly one accepting state.
2. No edges into the initial state.
3. No edges from the accepting state.
Basis: [figure not recovered: the λ-NFA for the basis expressions]
Induction: [figure not recovered: the λ-NFA for the inductive cases]
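Since the figures did not survive, here is a minimal sketch of one way to carry out such an inductive construction. It is an assumption about the details, not a reproduction of the slide: expressions are nested tuples (`("empty",)`, `("lam",)`, `("sym", a)`, `("union", r, s)`, `("cat", r, s)`, `("star", r)`), every case allocates a fresh initial and accepting state, and edges are triples `(p, label, q)` with `label=None` for λ. All names are invented for this sketch.

```python
from itertools import count


def build(expr, fresh, edges):
    """Append the edges of a λ-NFA for expr to `edges`; return (initial, accepting)."""
    kind = expr[0]
    s, f = next(fresh), next(fresh)   # fresh states: nothing enters s, nothing leaves f
    if kind == "empty":
        pass                          # no edges at all, so f is unreachable
    elif kind == "lam":
        edges.append((s, None, f))
    elif kind == "sym":
        edges.append((s, expr[1], f))
    elif kind == "union":
        for sub in expr[1:]:
            s1, f1 = build(sub, fresh, edges)
            edges += [(s, None, s1), (f1, None, f)]
    elif kind == "cat":
        s1, f1 = build(expr[1], fresh, edges)
        s2, f2 = build(expr[2], fresh, edges)
        edges += [(s, None, s1), (f1, None, s2), (f2, None, f)]
    elif kind == "star":
        s1, f1 = build(expr[1], fresh, edges)
        edges += [(s, None, s1), (f1, None, s1), (f1, None, f), (s, None, f)]
    else:
        raise ValueError(f"unknown expression kind: {kind}")
    return s, f


def closure(states, edges):
    """λ-closure of a set of states."""
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for a, label, b in edges:
            if a == p and label is None and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def accepts(expr, word):
    edges = []
    start, accept = build(expr, count(), edges)
    current = closure({start}, edges)
    for ch in word:
        moved = {q for p, label, q in edges if p in current and label == ch}
        current = closure(moved, edges)
    return accept in current


# (λ + 1)(01)*(λ + 0), the alternating strings of Example 4.1
ALTERNATING = ("cat", ("union", ("lam",), ("sym", "1")),
               ("cat", ("star", ("cat", ("sym", "0"), ("sym", "1"))),
                ("union", ("lam",), ("sym", "0"))))
assert accepts(ALTERNATING, "10101") and not accepts(ALTERNATING, "1001")
```

Each case adds a constant number of states and edges, so the automaton stays linear in the size of the expression; the three properties hold because the initial and accepting states of every result are freshly created.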
Pattern search in the text
Static text: we create indexes rather than using RegE. RegE are useful for dynamic text (like news).
Example 4.3 (Search for streets in addresses on the web).
Street identifier: 'Street|St\.|Avenue|Ave\.|Road|Rd\.'
The name before it: '[A-Z][a-z]*( [A-Z][a-z]*)*'
House number: '[0-9]+[A-Z]?'
All together: '[0-9]+[A-Z]? [A-Z][a-z]*( [A-Z][a-z]*)* (Street|St\.|Avenue|Ave\.|Road|Rd\.)'
The brackets around the street identifiers are needed because the union has the lowest precedence.
We are missing:
Boulevard, Place, Way.
Streets without any identifier (almost all Czech streets).
Street names with numbers.
. . .
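As a quick illustration, the combined pattern can be tried with Python's `re` module. The sample sentence is invented for this sketch, and non-capturing groups `(?:...)` are used so that `findall` returns whole matches.

```python
import re

STREET = re.compile(
    r"[0-9]+[A-Z]? "                           # house number, e.g. 7 or 123A
    r"[A-Z][a-z]*(?: [A-Z][a-z]*)* "           # capitalised words of the street name
    r"(?:Street|St\.|Avenue|Ave\.|Road|Rd\.)"  # street identifier
)

text = "Send it to 123A Main Street or to 7 Old Mill Rd. by Friday."
found = STREET.findall(text)
print(found)  # ['123A Main Street', '7 Old Mill Rd.']
```

Note that the name part first swallows the word "Street" greedily and only gives it back by backtracking, which is one reason such patterns are usually compiled to automata for large-scale search.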
Converting Among Representations
Converting NFA to DFA:
Computing the λ-closure takes $O(n^3)$: for each of the $n$ states we search up to $n^2$ arcs for λ-transitions.
The subset construction gives a DFA with possibly $2^n$ states. For each state, it takes $O(n^3)$ time to compute the transitions, so $O(n^3 2^n)$ in total.
[Figure: diagram of the conversions among the representations; surviving labels: λ-NFA, NFA, $O(n^3 4^n)$, $O(n)$, $O(n)$.]
Converting DFA to NFA: just modify the transition table by putting set brackets around the states and, in the case of a λ-NFA, adding a column for λ.
Automaton to regular expression conversion: $O(n^3 4^n)$.
RegE to automaton conversion: a λ-NFA in time $O(n)$.
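The two steps of the NFA to DFA conversion, λ-closure and subset construction, can be sketched as follows. The data layout is an assumption made for this illustration: `lam` maps a state to the set of its λ-successors, `delta` maps `(state, symbol)` to a set of states, and DFA states are frozensets. The sample NFA is the four-state automaton for "1 at the second-last or third-last position", written here without λ-arcs.

```python
def lambda_closure(states, lam):
    """All states reachable from `states` along λ-arcs."""
    seen, stack = set(states), list(states)
    while stack:
        for q in lam.get(stack.pop(), ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return frozenset(seen)


def subset_construction(alphabet, delta, lam, start):
    """Return (dfa_start, dfa_delta, dfa_states); only reachable subsets are built."""
    dfa_start = lambda_closure({start}, lam)
    dfa_delta, seen, todo = {}, {dfa_start}, [dfa_start]
    while todo:
        subset = todo.pop()
        for a in alphabet:
            moved = set()
            for q in subset:
                moved |= set(delta.get((q, a), ()))
            target = lambda_closure(moved, lam)
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return dfa_start, dfa_delta, seen


nfa_delta = {
    ("A", "0"): {"A"}, ("A", "1"): {"A", "B"},
    ("B", "0"): {"C"}, ("B", "1"): {"C"},
    ("C", "0"): {"D"}, ("C", "1"): {"D"},
}
dfa_start, dfa_delta, dfa_states = subset_construction("01", nfa_delta, {}, "A")
print(len(dfa_states))  # 8
```

For this NFA every reachable subset contains A together with any combination of B, C, D, which is why the DFA has 8 states for a 4-state NFA.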
String Substitution, (String) Homomorphism
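Definition 4.3 (String substitution, (string) homomorphism). Let $\Sigma$ be a finite alphabet. For each $x \in \Sigma$ let $\sigma(x)$ be a language over an alphabet $Y_x$. Further, we define:
$\sigma(\lambda) = \{\lambda\}$
$\sigma(u.v) = \sigma(u).\sigma(v)$
The mapping $\sigma : \Sigma^* \to P(Y^*)$, where $Y = \bigcup_{x \in \Sigma} Y_x$, is a substitution.
$\sigma(L) = \bigcup_{w \in L} \sigma(w)$
An e-free (λ-free) substitution is a substitution where no $\sigma(x)$ contains $\lambda$.
For $w = a_1 \ldots a_n \in \Sigma^n$ we have $\sigma(w) = \sigma(a_1) \ldots \sigma(a_n)$.
Example 4.4 (substitution). $\sigma(0) = \{a^i b^j \mid i, j \ge 0\}$, $\sigma(1) = \{cd\}$, $\sigma(010) = \{a^i b^j c d a^k b^l \mid i, j, k, l \ge 0\}$.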
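Example 4.4 can be imitated on finite sets. In the sketch below, which is only an illustration, the infinite language $\sigma(0)$ is truncated to exponents $0..2$ (the constant `BOUND` is an assumption of this sketch), and $\sigma(w)$ is computed as the concatenation of one choice from each $\sigma(a_i)$.

```python
from itertools import product


def substitute(word, sigma):
    """σ(w) = σ(a1)...σ(an): concatenate one choice from each σ(ai)."""
    pieces = [sigma[x] for x in word]
    return {"".join(choice) for choice in product(*pieces)}


BOUND = 2  # assumption: keep only a^i b^j with i, j <= BOUND so that σ(0) is finite
sigma = {
    "0": {"a" * i + "b" * j for i in range(BOUND + 1) for j in range(BOUND + 1)},
    "1": {"cd"},
}

image = substitute("010", sigma)
assert "abcdab" in image and "cd" in image
assert substitute("", sigma) == {""}  # σ(λ) = {λ}
```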
(String) Homomorphism
Definition 4.4 ((String) homomorphism). A homomorphism $h$ is a special case of a substitution where every image is a single word, $h(x) = w_x$ for all $x \in \Sigma$. If $w_x \ne \lambda$ for all $x$, it is an e-free (λ-free) homomorphism. The inverse homomorphism is $h^{-1}(L) = \{w \mid h(w) \in L\}$.
Example 4.5 (homomorphism). The function $h$ defined by $h(0) = ab$ and $h(1) = \lambda$ is a homomorphism. For example, $h(0011) = abab$. For $L = 10^*1$ we have $h(L) = (ab)^*$.
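Example 4.5 in executable form, as a small sketch: the homomorphism is a dictionary from symbols to words and is extended to strings symbol by symbol. The function name `apply_h` is chosen for this illustration.

```python
h = {"0": "ab", "1": ""}  # h(0) = ab, h(1) = λ


def apply_h(word):
    """Extend h from symbols to words: h(a1...an) = h(a1)...h(an)."""
    return "".join(h[x] for x in word)


assert apply_h("0011") == "abab"

# The words 1 0^k 1 of L = 10*1 are mapped to (ab)^k.
images = {apply_h("1" + "0" * k + "1") for k in range(6)}
assert images == {"ab" * k for k in range(6)}
```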
Theorem (Closure under homomorphism). If the language $L$ and the languages $\sigma(x)$ for all $x \in \Sigma$ are regular, then so are $\sigma(L)$, $h(L)$ and $h^{-1}(L)$.
Homomorphism preserves regularity
Theorem 4.2. If $L$ is a regular language over alphabet $\Sigma$, and $h$ is a homomorphism on $\Sigma$, then $h(L)$ is also regular.
Proof. Let $L = L(R)$ for some regular expression $R$. For an expression $E$, let $h(E)$ be the expression obtained from $E$ by replacing every symbol $a$ with the string $h(a)$. The proof is done by structural induction on the sub-expressions $E$ of $R$: we claim $L(h(E)) = h(L(E))$.
Basis: $h(\{\lambda\}) = \{\lambda\}$ and $h(\emptyset) = \emptyset$. If $E = a$ then $L(E) = \{a\}$, so $h(L(E)) = \{h(a)\}$. The expression $h(E)$ is the string $h(a)$, thus $L(h(E)) = \{h(a)\}$ as well.
Induction:
Union: $L(h(F + G)) = L(h(F) + h(G)) = L(h(F)) \cup L(h(G))$ and $h(L(F + G)) = h(L(F) \cup L(G)) = h(L(F)) \cup h(L(G))$. The right-hand sides are equal by the inductive hypothesis, therefore the left-hand sides are equal too.
The proofs for concatenation and closure are similar.
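The claim $L(h(E)) = h(L(E))$ can be observed on a small instance. The sketch below is a bounded check under simplifying assumptions, not a proof: expressions are Python `re` patterns over single-character symbols, and the hypothetical helper `h_expr` rewrites a pattern by replacing each alphabet symbol with a group containing its image.

```python
import re
from itertools import product


def h_expr(pattern, h):
    """h(E): replace every alphabet symbol of the pattern by a group with its image."""
    return "".join(f"(?:{re.escape(h[c])})" if c in h else c for c in pattern)


def words(alphabet, max_len):
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


h = {"0": "ab", "1": ""}
E = "10*1"
hE = h_expr(E, h)  # "(?:)(?:ab)*(?:)", i.e. λ(ab)*λ

# h(L(E)) for words of length <= 6 versus L(h(E)) for words of length <= 8.
image = {"".join(h[c] for c in w) for w in words("01", 6) if re.fullmatch(E, w)}
direct = {v for v in words("ab", 8) if re.fullmatch(hE, v)}
assert image == direct
```

The two length bounds are matched to each other: words $10^k1$ with $k \le 4$ have images $(ab)^k$ of length at most 8.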
Inverse Homomorphism
Definition 4.5 (Inverse homomorphism). Suppose $h$ is a homomorphism from some alphabet $\Sigma$ to strings over another alphabet $T$. Then $h^{-1}(L)$, 'h inverse of L', is the set of strings $w$ in $\Sigma^*$ such that $h(w)$ is in $L$.
Example 4.6. Let $L = (00 + 1)^*$, $h(a) = 01$ and $h(b) = 10$. We claim $h^{-1}(L) = (ba)^*$.
Proof: $h((ba)^*) \subseteq L$ is easy to see. Any other $w$ generates an isolated 0 (4 cases to consider).
[Figure: a homomorphism applied in the forward and inverse direction.]
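The claim of Example 4.6 can be tested exhaustively on short words. In this sketch, an illustration with invented names, a word $w$ over $\{a, b\}$ belongs to $h^{-1}(L)$ exactly when $h(w)$ matches $(00 + 1)^*$, and the result is compared with membership in $(ba)^*$.

```python
import re
from itertools import product

h = {"a": "01", "b": "10"}
L = re.compile(r"(?:00|1)*")
CLAIM = re.compile(r"(?:ba)*")


def in_inverse_image(w):
    """w is in h^{-1}(L) iff h(w) is in L."""
    return L.fullmatch("".join(h[x] for x in w)) is not None


counterexamples = []
for n in range(11):
    for letters in product("ab", repeat=n):
        w = "".join(letters)
        if in_inverse_image(w) != (CLAIM.fullmatch(w) is not None):
            counterexamples.append(w)

assert counterexamples == []
```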
Inverse Homomorphism DFA
Theorem 4.3. If $h$ is a homomorphism from alphabet $\Sigma$ to alphabet $T$, and $L$ is a regular language over $T$, then $h^{-1}(L)$ is also a regular language.
Proof. The proof starts with a DFA $A$ for $L$. We construct a DFA for $h^{-1}(L)$.
For $A = (Q, T, \delta, q_0, F)$ we define $B = (Q, \Sigma, \delta_B, q_0, F)$ where $\delta_B(q, a) = \delta^*(q, h(a))$ ($\delta^*$ operates on strings). By induction on $|w|$, $\delta^*_B(q_0, w) = \delta^*(q_0, h(w))$. Therefore, $B$ accepts exactly those strings $w$ that are in $h^{-1}(L)$.
[Figure: the DFA for $h^{-1}(L)$ applies $h$ to its input $w$ and then simulates the DFA $A$ for $L$ (start, accept/reject).]
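The construction of $B$ is short enough to write down directly. The sketch below uses an assumed three-state DFA for $(00 + 1)^*$ (state names `even`, `odd`, `dead` are invented here) together with the homomorphism $h(a) = 01$, $h(b) = 10$ of Example 4.6, and builds $\delta_B(q, a) = \delta^*(q, h(a))$.

```python
def run(delta, state, word):
    """δ*: follow the transitions of a DFA along a whole word."""
    for symbol in word:
        state = delta[(state, symbol)]
    return state


def inverse_homomorphism_dfa(states, delta, h):
    """δ_B(q, a) = δ*(q, h(a)) for every state q and every symbol a of Σ."""
    return {(q, a): run(delta, q, image) for q in states for a, image in h.items()}


# Assumed DFA A for (00+1)*: 'even' is initial and accepting, 'odd' waits for the
# second 0 of a pair, 'dead' is the sink reached after an isolated 0.
states = ["even", "odd", "dead"]
delta = {
    ("even", "0"): "odd", ("even", "1"): "even",
    ("odd", "0"): "even", ("odd", "1"): "dead",
    ("dead", "0"): "dead", ("dead", "1"): "dead",
}
h = {"a": "01", "b": "10"}
delta_B = inverse_homomorphism_dfa(states, delta, h)


def accepts_B(word):
    return run(delta_B, "even", word) == "even"


assert accepts_B("baba") and not accepts_B("ab")
```

The automaton $B$ keeps the states, the initial state and the accepting states of $A$; only the transition table changes, and in this instance it accepts exactly $(ba)^*$.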
Visit every state example
Example 4.7. Suppose $A = (Q, \Sigma, \delta, q_0, F)$ is a DFA. Let $L$ be the language of all strings $w$ in $\Sigma^*$ such that $\delta^*(q_0, w)$ is in $F$ and for every state $q \in Q$ there is some prefix $x_q$ of $w$ such that $\delta^*(q_0, x_q) = q$. This language is regular.

$M$: $M = L(A)$, the language accepted by the DFA $A$ in the usual way.
$T$: We define a new alphabet $T$ of triples $\{[paq] ;\ p, q \in Q,\ a \in \Sigma,\ \delta(p, a) = q\}$.
$h$: We define the homomorphism $h([paq]) = a$ for all $p, q, a$.
$L_1$: The language $L_1 = h^{-1}(M)$ is regular since $M$ is regular (DFA and inverse homomorphism). $h^{-1}(101)$ includes $2^3 = 8$ strings, like $[p1p][q0q][p1p] \in \{[p1p], [q1q]\}\{[p0q], [q0q]\}\{[p1p], [q1q]\}$. We construct $L$ from $L_1$.
[Figure: an automaton with states p and q.]
$L_2$: Enforce start at $q_0$. Define $E_1 = \bigcup_{a \in \Sigma, q \in Q} \{[q_0 a q]\}$ and $L_2 = L_1 \cap L(E_1.T^*)$.
$L_3$: Adjacent states must be equal. Define the non-matching pairs $E_2 = \bigcup_{q \ne r;\ p, q, r, s \in Q;\ a, b \in \Sigma} \{[paq][rbs]\}$ and $L_3 = L_2 - L(T^*.E_2.T^*)$. Every string of $L_3$ ends in an accepting state, since we started from $M$, the language of accepting computations of the DFA $A$.
$L_4$: All states. For each state $q \in Q$, define $E_q$ to be the regular expression that is the sum of all symbols in $T$ in which $q$ appears in neither the first nor the last position. We subtract $L(E_q^*)$ from $L_3$: $L_4 = L_3 - \bigcup_{q \in Q} L(E_q^*)$.
$L$: Remove the states, leave the symbols: $L = h(L_4)$. We conclude that $L$ is regular.

In brief:

| operation | result | property added |
|---|---|---|
| | $M = L(A)$ | |
| Inverse homomorphism | $L_1 = h^{-1}(M)$ | |
| Intersection with a RL | $L_2$ | + start at $q_0$ |
| Difference with a RL | $L_3$ | + adjacent states equal |
| Difference with a RL | $L_4$ | + all states on the path |
| Homomorphism $h([qap]) = a$ | $L$ | |
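The filtering steps can be traced on the single word 101. The following sketch assumes a concrete two-state DFA that matches the triples quoted in the example ($\delta(p, 1) = p$, $\delta(p, 0) = q$, $\delta(q, 0) = \delta(q, 1) = q$); taking $p$ as initial and $q$ as accepting is an assumption of this illustration, as are all function names.

```python
from itertools import product

# Assumed DFA: p is initial, q is accepting (so 101 is in M = L(A)).
Q, SIGMA, START = ["p", "q"], ["0", "1"], "p"
delta = {("p", "0"): "q", ("p", "1"): "p", ("q", "0"): "q", ("q", "1"): "q"}

# The alphabet T of triples [paq] with δ(p, a) = q.
T = [(p, a, delta[(p, a)]) for p in Q for a in SIGMA]


def preimages(word):
    """h^{-1}(word): all strings of triples whose middle components spell `word`."""
    choices = [[t for t in T if t[1] == a] for a in word]
    return list(product(*choices))


def is_computation(triples):
    """Start at the initial state (L2) and make adjacent states match (L3)."""
    starts_right = not triples or triples[0][0] == START
    return starts_right and all(x[2] == y[0] for x, y in zip(triples, triples[1:]))


def visits_all(triples):
    """Every state occurs as a first or last component of some triple (L4)."""
    return {t[0] for t in triples} | {t[2] for t in triples} == set(Q)


candidates = preimages("101")
survivors = [ts for ts in candidates if is_computation(ts) and visits_all(ts)]
print(len(candidates), survivors)
```

Of the 8 candidate strings only the real computation $[p1p][p0q][q1q]$ survives, and applying $h$ to it gives back 101, which is therefore in $L$.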
Decision Properties of Regular Languages
Lemma (Testing emptiness of regular languages). For finite automata, emptiness is a question of graph reachability of any final state from the initial one. Reachability is $O(n^2)$.
Lemma. A regular expression can be converted to a λ-NFA in $O(n)$ time, and then we check reachability.
It can also be done by direct inspection:
Basis: $\emptyset$ denotes the empty language; $\lambda$ and $a$ are not empty.
Induction:
$R = R_1 + R_2$ is empty iff both $L(R_1)$ and $L(R_2)$ are empty.
$R = R_1R_2$ is empty iff either $L(R_1)$ or $L(R_2)$ is empty.
$R = R_1^*$ is never empty; it includes $\lambda$.
$R = (R_1)$ is empty iff $R_1$ is empty, since they denote the same language.
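The direct inspection translates into a short recursive function. The tuple encoding of expressions used here (`("empty",)`, `("lam",)`, `("sym", a)`, `("union", r, s)`, `("cat", r, s)`, `("star", r)`) is an assumption of this sketch; brackets need no case of their own because they are implicit in the nesting.

```python
def is_empty(expr):
    """Decide whether L(expr) is the empty language by structural recursion."""
    kind = expr[0]
    if kind == "empty":
        return True
    if kind in ("lam", "sym"):
        return False
    if kind == "union":
        return is_empty(expr[1]) and is_empty(expr[2])
    if kind == "cat":
        return is_empty(expr[1]) or is_empty(expr[2])
    if kind == "star":
        return False  # R* always contains λ
    raise ValueError(f"unknown expression kind: {kind}")


# (∅ + 0)∅ is empty, while (∅1)* is not: it contains λ.
assert is_empty(("cat", ("union", ("empty",), ("sym", "0")), ("empty",)))
assert not is_empty(("star", ("cat", ("empty",), ("sym", "1"))))
```

Each sub-expression is visited once, so the test is linear in the size of the expression.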
Testing Membership in a Regular Language
Given a string $w$ with $|w| = n$ and a regular language $L$, is $w \in L$?
DFA: run the automaton; with a suitable representation and constant-time transitions, it is $O(n)$.
NFA with $s$ states: running time $O(ns^2)$. Each input symbol can be processed by taking the previous set of states, which has at most $s$ states, and following the transitions of each of them.
λ-NFA: first compute the λ-closure. Then process each symbol and compute the λ-closure of the result.
Regular expression of size $s$: convert it to a λ-NFA with at most $2s$ states and then simulate, taking $O(ns^2)$.
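The DFA and NFA cases can be sketched as two small simulators. The dictionary-based representation and both sample automata are assumptions made for this illustration: the DFA accepts strings containing a 0, and the NFA accepts strings with 1 in the second-last or third-last position.

```python
def dfa_accepts(delta, start, finals, word):
    """One table lookup per input symbol, so O(n) for |word| = n."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals


def nfa_accepts(delta, start, finals, word):
    """Track the set of current states: at most s states, each with at most s
    successors, so O(s^2) work per symbol and O(n s^2) in total."""
    current = {start}
    for symbol in word:
        current = {r for q in current for r in delta.get((q, symbol), ())}
    return bool(current & set(finals))


dfa_delta = {(1, "1"): 1, (1, "0"): 2, (2, "0"): 2, (2, "1"): 2}
nfa_delta = {
    ("A", "0"): {"A"}, ("A", "1"): {"A", "B"},
    ("B", "0"): {"C"}, ("B", "1"): {"C"},
    ("C", "0"): {"D"}, ("C", "1"): {"D"},
}

assert dfa_accepts(dfa_delta, 1, {2}, "1101")
assert nfa_accepts(nfa_delta, "A", {"C", "D"}, "100")
assert not nfa_accepts(nfa_delta, "A", {"C", "D"}, "1000")
```

A λ-NFA would need one extra step per symbol, the λ-closure of the current set, which does not change the $O(ns^2)$ bound once the closures are precomputed.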