Page 1: Notes to Closure Properties

Notes to Closure Properties

Simplification of the automata design

L.∅ = ∅.L = ∅
{λ}.L = L.{λ} = L
(L∗)∗ = L∗
(L1 ∪ L2)∗ = L1∗(L2.L1∗)∗ = L2∗(L1.L2∗)∗
(L1.L2)^R = L2^R.L1^R
∂w(L1 ∪ L2) = ∂w(L1) ∪ ∂w(L2)
∂w(Σ∗ − L) = Σ∗ − ∂w(L)

Proof of non-regularity

L = {w | w ∈ {0, 1}∗, |w|_0 = |w|_1} is not regular, since L ∩ {0^i 1^j | i, j ≥ 0} = {0^i 1^i | i ≥ 0} is not regular (pumping lemma) and regular languages are closed under intersection.
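The intersection used in the argument can be checked by brute force on short strings. The sketch below reads L as the set of strings with as many 0's as 1's; the function names and the length bound are illustrative assumptions, and the check is an experiment, not a proof.

```python
import re
from itertools import product

def balanced(w):
    """Membership in L: as many 0's as 1's."""
    return w.count("0") == w.count("1")

def intersection_is_0i1i(max_len):
    """Compare L ∩ 0*1* with {0^i 1^i} on all strings up to max_len."""
    block = re.compile(r"0*1*")
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            w = "".join(letters)
            in_intersection = balanced(w) and block.fullmatch(w) is not None
            half = n // 2
            in_target = w == "0" * half + "1" * half
            if in_intersection != in_target:
                return False
    return True
```

If the intersection with the regular language 0∗1∗ were regular, the pumping lemma for {0^i 1^i} would be contradicted, which is the whole argument.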

Automata and Grammars Regular Expressions,Kleene Theorem, Subst., Homom. 4 August 9, 2019 75 / 75 - 95

Page 2: Notes to Closure Properties

Regular Expressions (RegE)

Definition 4.1 (Regular Expression (RegE), value of a RegE L(α))
Regular expressions α, β ∈ RegE(Σ) over a finite non-empty alphabet Σ = {x1, x2, . . . , xn} and their values L(α) are defined by induction:

Basis:

expression α | for              | value L(α) ≡ [α]
∅            | empty expression | L(∅) = {} ≡ ∅
λ            | empty string     | L(λ) = {λ}
a            | a ∈ Σ            | L(a) = {a}

Induction:

expression | value                  | remark
α + β      | L(α + β) = L(α) ∪ L(β) |
αβ         | L(αβ) = L(α)L(β)       | . may be used
α∗         | L(α∗) = L(α)∗          |
(α)        | L((α)) = L(α)          | brackets do not change the value

The class of regular expressions over Σ, RegE(Σ), is the smallest class closed under the operations above.
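The inductive definition of L(α) translates directly into a recursive evaluator. The following is a minimal sketch that computes only the words of L(α) up to a length bound n, since L(α) may be infinite. The tuple encoding of expressions and the name `value` are assumptions made for illustration; brackets need no case because the nesting of tuples already plays their role.

```python
# Hypothetical encoding of a RegE as nested tuples:
#   ("empty",)  ("lam",)  ("sym", a)  ("+", r, s)  (".", r, s)  ("*", r)

def value(r, n):
    """Return the set of all words of L(r) with length at most n."""
    kind = r[0]
    if kind == "empty":
        return set()
    if kind == "lam":
        return {""}
    if kind == "sym":
        return {r[1]} if n >= 1 else set()
    if kind == "+":
        return value(r[1], n) | value(r[2], n)
    if kind == ".":
        left, right = value(r[1], n), value(r[2], n)
        return {u + v for u in left for v in right if len(u + v) <= n}
    if kind == "*":
        base = value(r[1], n) - {""}      # lambda adds nothing new
        result, frontier = {""}, {""}
        while frontier:                   # words grow, so this terminates
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= n} - result
            result |= frontier
        return result
    raise ValueError(kind)

zero_one_star = ("*", (".", ("sym", "0"), ("sym", "1")))
```

For example, `value(zero_one_star, 4)` yields the three words λ, 01 and 0101 of L((01)∗).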


Page 3: Notes to Closure Properties

Examples, Precedence

Example 4.1 (Regular Expressions)
The language of alternating 0's and 1's may be written:

either (01)∗ + (10)∗ + 1(01)∗ + 0(10)∗

or (λ + 1)(01)∗(λ + 0).

The language L((0∗10∗10∗1)∗0∗) = {w | w ∈ {0, 1}∗, |w|_1 = 3k, k ≥ 0}.
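Both claims of the example can be tested on all short strings with Python's `re` module. This is a sketch under two notational assumptions: `re` writes union as `|`, and it has no symbol for λ, so an empty alternative such as `(|1)` stands for (λ + 1). The function names and the bound of 10 are illustrative.

```python
import re
from itertools import product

first = re.compile(r"(01)*|(10)*|1(01)*|0(10)*")
second = re.compile(r"(|1)(01)*(|0)")
thirds = re.compile(r"(0*10*10*1)*0*")

def words(max_len):
    """All strings over {0, 1} of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            yield "".join(letters)

def same_language(max_len=10):
    """Do the two expressions for alternating strings agree?"""
    return all((first.fullmatch(w) is None) == (second.fullmatch(w) is None)
               for w in words(max_len))

def ones_divisible_by_three(max_len=10):
    """Does the third expression match exactly when |w|_1 = 3k?"""
    return all((thirds.fullmatch(w) is not None) == (w.count("1") % 3 == 0)
               for w in words(max_len))
```

`fullmatch` is used rather than `search` because the value of a RegE is a set of whole words, not of substrings.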

Definition 4.2 (Precedence)
The star ∗ is the operator with the highest precedence, then concatenation ., and the union + has the lowest precedence.

Theorem 4.1 (RegE and DFA, Kleene theorem (a variant))
Any language recognizable by a DFA can be expressed by a regular expression.
Any language of a regular expression can be recognized by a λ-NFA (therefore also by a DFA).


Page 4: Notes to Closure Properties

Example

[Figure: DFA with states 1 and 2; loop 1 on state 1, edge 0 from state 1 to state 2, loop 0,1 on state 2.]

Result: R^(2)_12 = 1∗0(0 + 1)∗

R^(0)_11 = λ + 1
R^(0)_12 = 0
R^(0)_21 = ∅
R^(0)_22 = λ + 0 + 1

R^(1)_11 = λ + 1 + (λ + 1)(λ + 1)∗(λ + 1) = 1∗
R^(1)_12 = 0 + (λ + 1)(λ + 1)∗0 = 1∗0
R^(1)_21 = ∅ + ∅(λ + 1)∗(λ + 1) = ∅
R^(1)_22 = λ + 0 + 1 + ∅(λ + 1)∗0 = λ + 0 + 1

R^(2)_11 = 1∗ + 1∗0(λ + 0 + 1)∗∅ = 1∗
R^(2)_12 = 1∗0 + 1∗0(λ + 0 + 1)∗(λ + 0 + 1) = 1∗0(0 + 1)∗
R^(2)_21 = ∅ + (λ + 0 + 1)(λ + 0 + 1)∗∅ = ∅
R^(2)_22 = λ + 0 + 1 + (λ + 0 + 1)(λ + 0 + 1)∗(λ + 0 + 1) = (0 + 1)∗


Page 5: Notes to Closure Properties

From a DFA to RegE

Let us have a DFA A with n states, Q_A = {1, . . . , n}.
Let R^(k)_ij be a regular expression with L(R^(k)_ij) = {w | δ∗_≤k(i, w) = j}, the set of words transferring the state i into j in A where no intermediate state on the path has an index higher than k.
We iteratively construct R^(k)_ij for k = 0, . . . , n.

k = 0, i ≠ j: R^(0)_ij = a1 + a2 + . . . + am where a1, a2, . . . , am are the symbols on edges from i into j (R^(0)_ij = ∅ for m = 0, R^(0)_ij = a for m = 1).

k = 0, i = j: loops, R^(0)_ii = λ + a1 + a2 + . . . + am where a1, a2, . . . , am are the symbols on loops in i.


Page 6: Notes to Closure Properties

Induction. We have R^(k)_ij for all i, j ∈ Q. We construct R^(k+1)_ij.

[Figure: paths from i to j, either directly (R^(k)_ij) or through the state k+1 via R^(k)_i(k+1), the loop R^(k)_(k+1)(k+1) and R^(k)_(k+1)j.]

R^(k+1)_ij = R^(k)_ij + R^(k)_i(k+1) (R^(k)_(k+1)(k+1))∗ R^(k)_(k+1)j

Paths from i into j not meeting (k + 1) are already in R^(k)_ij.

Paths from i into j through (k + 1) with possible loops can be expressed by R^(k)_i(k+1) (R^(k)_(k+1)(k+1))∗ R^(k)_(k+1)j.

Finally RegE = ⊕_{j∈F_A} R^(n)_1j, the union over all accepting states j.
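The basis and the induction step can be sketched in a few lines of Python. The sketch assumes that expressions are written as Python `re` strings, that `None` encodes ∅ and the empty string encodes λ, and that state 1 is the initial state; all function names are illustrative. No simplification beyond dropping ∅ is attempted, so the output is longer than the hand-simplified expressions of the example.

```python
import re
from itertools import product

def union(r, s):
    if r is None:
        return s
    if s is None:
        return r
    return r if r == s else f"(?:{r}|{s})"

def concat(r, s):
    return None if r is None or s is None else r + s

def star(r):
    return "" if r in (None, "") else f"(?:{r})*"

def dfa_to_regex(n, delta, accepting, start=1):
    """delta maps (state, symbol) -> state; the states are 1..n."""
    # Basis k = 0: lambda on the diagonal plus the symbols on direct edges.
    R = {(i, j): ("" if i == j else None)
         for i in range(1, n + 1) for j in range(1, n + 1)}
    for (i, a), j in delta.items():
        R[i, j] = union(R[i, j], re.escape(a))
    # Induction: additionally allow state k as an intermediate state.
    for k in range(1, n + 1):
        R = {(i, j): union(R[i, j],
                           concat(concat(R[i, k], star(R[k, k])), R[k, j]))
             for i in range(1, n + 1) for j in range(1, n + 1)}
    result = None
    for j in sorted(accepting):
        result = union(result, R[start, j])
    return result

def run_dfa(delta, start, w):
    q = start
    for a in w:
        q = delta[q, a]
    return q

# The two-state DFA of the example: loop 1 on 1, edge 0 to 2, loop 0,1 on 2.
delta = {(1, "1"): 1, (1, "0"): 2, (2, "0"): 2, (2, "1"): 2}
regex = re.compile(dfa_to_regex(2, delta, {2}))
agree = all((regex.fullmatch(w) is not None) == (run_dfa(delta, 1, w) == 2)
            for n in range(9) for w in map("".join, product("01", repeat=n)))
```

The flag `agree` compares the generated expression with the DFA on every string of length at most 8.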


Page 7: Notes to Closure Properties

DFA to RegE by Successive State Elimination

The previous method may generate expressions with up to 4^n symbols.
The following algorithm sometimes avoids the duplication.
We allow regular expressions to annotate the edges of the graph (a transformation of the automaton).

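State s selected for elimination

[Figure: state s with a loop labeled S; predecessors q1, q2, . . . , qk with edges Q1, Q2, . . . , Qk into s; successors p1, . . . , pm with edges P1, . . . , Pm from s; direct edges Rij from qi to pj.]

After s is eliminated

[Figure: the state s is removed; the edge from qi to pj is labeled Rij + Qi S∗ Pj for i = 1, . . . , k and j = 1, . . . , m, e.g. R11 + Q1S∗P1, R1m + Q1S∗Pm, Rk1 + QkS∗P1, Rkm + QkS∗Pm.]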


Page 8: Notes to Closure Properties

A RegE from a DFA

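For every accepting state q ∈ F we eliminate all states p ∈ Q \ {q, q0}.

For q ≠ q0 we take RegE(q) = (R + SU∗T)∗SU∗.

[Figure: two states q0 and q; loop R on q0, edge S from q0 to q, loop U on q, edge T from q back to q0.]

For q = q0 we take RegE(q) = R∗.

[Figure: a single state q0 with a loop R.]

And take the union (addition) over all accepting states: RegE(DFA) = ⊕_{q∈F} RegE(q).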
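State elimination can be sketched as a small graph rewrite. The sketch assumes that edge labels are Python `re` strings (a symbol set 0 + 1 is written `[01]`), that `None` stands for a missing edge, and that labels are either atomic or already grouped; the function names are illustrative.

```python
import re
from itertools import product

def union(r, s):
    if r is None:
        return s
    if s is None:
        return r
    return f"(?:{r}|{s})"

def star(r):
    return "" if r in (None, "") else f"(?:{r})*"

def eliminate(edges, s):
    """Remove state s; reroute every q_i -> s -> p_j as Q_i S* P_j."""
    loop = star(edges.get((s, s)))
    ins = [(q, lab) for (q, t), lab in edges.items() if t == s and q != s]
    outs = [(p, lab) for (t, p), lab in edges.items() if t == s and p != s]
    rest = {e: lab for e, lab in edges.items() if s not in e}
    for q, Q in ins:
        for p, P in outs:
            rest[q, p] = union(rest.get((q, p)), Q + loop + P)
    return rest

def regex_for(edges, states, q0, q):
    """RegE(q): eliminate everything except q0 and q, then read it off."""
    for s in states:
        if s not in (q0, q):
            edges = eliminate(edges, s)
    R, U = edges.get((q0, q0)), edges.get((q, q))
    if q == q0:
        return star(R)
    S, T = edges.get((q0, q)), edges.get((q, q0))
    if S is None:
        return None                      # q is not reachable from q0
    back = None if T is None else S + star(U) + T
    return star(union(R, back)) + S + star(U)

def automaton_to_regex(edges, states, q0, accepting):
    result = None
    for q in sorted(accepting):
        result = union(result, regex_for(edges, states, q0, q))
    return result

# Chain A -> B -> C -> D with a loop on A; C and D are accepting.
edges = {("A", "A"): "[01]", ("A", "B"): "1",
         ("B", "C"): "[01]", ("C", "D"): "[01]"}
regex = re.compile(automaton_to_regex(edges, "ABCD", "A", {"C", "D"}))
agree = all((regex.fullmatch(w) is not None)
            == (w[-2:-1] == "1" or w[-3:-2] == "1")
            for n in range(9) for w in map("".join, product("01", repeat=n)))
```

The sample automaton accepts the strings with a 1 in the second-last or third-last position, and `agree` checks the generated expression against that description on all strings of length at most 8.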


Page 9: Notes to Closure Properties

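Example

Example 4.2
An automaton (in fact an NFA) that accepts strings with 1 at the second-last or the third-last position.

[Figure: the original automaton, states A, B, C, D; loop 0,1 on A, edge 1 from A to B, edge 0,1 from B to C, edge 0,1 from C to D.]

[Figure: we replace the edge labels by RegE; loop 0+1 on A, edge 1 from A to B, edge 0+1 from B to C, edge 0+1 from C to D.]

[Figure: we eliminate B; loop 0+1 on A, edge 1(0+1) from A to C, edge 0+1 from C to D.]

[Figure: we eliminate C; loop 0+1 on A, edge 1(0+1)(0+1) from A to D.]

We get RegE: (0 + 1)∗1(0 + 1) + (0 + 1)∗1(0 + 1)(0 + 1).

[Elimination Order] We start with the states that are neither accepting nor initial, q ∉ F, q ≠ q0.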


Page 10: Notes to Closure Properties

From a RegE to λ–NFA

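By induction on the structure of R. In each step we construct a λ-NFA E that recognizes the same language, L(R) = L(E), with three additional properties:

1. Exactly one accepting state.
2. No edges into the initial state.
3. No edges from the accepting state.

Basis:

Empty string λ: [Figure: a λ-edge from the initial state to the accepting state.]
Empty set ∅: [Figure: an initial and an accepting state with no edge between them.]
A single string a: [Figure: an a-edge from the initial state to the accepting state.]

Induction:

Addition R + S: [Figure: a new initial state with λ-edges to the initial states of the automata for R and S; λ-edges from their accepting states to a new accepting state.]
Concatenation RS: [Figure: a λ-edge from the accepting state of the automaton for R to the initial state of the automaton for S.]
Iteration R∗: [Figure: a new initial state with λ-edges to the initial state of the automaton for R and to a new accepting state; λ-edges from the accepting state of R back to its initial state and to the new accepting state.]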
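The inductive construction can be sketched as a recursive function that returns the initial and the accepting state of the λ-NFA it builds. The tuple encoding of expressions, the use of `None` as the label of a λ-edge and the function names are assumptions of this sketch; the wiring of the λ-edges follows the standard textbook construction that the three properties describe.

```python
from collections import defaultdict
from itertools import count

# Hypothetical encoding of a RegE as nested tuples:
#   ("empty",)  ("lam",)  ("sym", a)  ("+", r, s)  (".", r, s)  ("*", r)

def build(r, trans, fresh):
    """Add a lambda-NFA for r to trans; return (initial, accepting)."""
    kind = r[0]
    if kind == ".":
        ri, rf = build(r[1], trans, fresh)
        si, sf = build(r[2], trans, fresh)
        trans[rf, None].add(si)           # lambda-edge joins R and S
        return ri, sf
    i, f = next(fresh), next(fresh)       # new initial and accepting state
    if kind == "lam":
        trans[i, None].add(f)
    elif kind == "sym":
        trans[i, r[1]].add(f)
    elif kind == "+":
        for part in r[1:]:
            pi, pf = build(part, trans, fresh)
            trans[i, None].add(pi)
            trans[pf, None].add(f)
    elif kind == "*":
        ri, rf = build(r[1], trans, fresh)
        trans[i, None] |= {ri, f}         # enter R or skip it
        trans[rf, None] |= {ri, f}        # repeat R or leave
    elif kind != "empty":                 # "empty" gets no edge at all
        raise ValueError(kind)
    return i, f

def closure(states, trans):
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for p in trans.get((q, None), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def accepts(r, w):
    trans = defaultdict(set)
    start, final = build(r, trans, count())
    current = closure({start}, trans)
    for a in w:
        step = {p for q in current for p in trans.get((q, a), ())}
        current = closure(step, trans)
    return final in current

# (lambda + 1)(01)*(lambda + 0), the alternating strings
alternating = (".", (".", ("+", ("lam",), ("sym", "1")),
                     ("*", (".", ("sym", "0"), ("sym", "1")))),
               ("+", ("lam",), ("sym", "0")))
```

Each operator adds at most two states, which is why the size of the resulting λ-NFA stays linear in the size of the expression.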

Page 11: Notes to Closure Properties

Pattern search in the text

Static text: we create indexes rather than using RegE.
RegE are useful for dynamic text (like news).

Example 4.3 (Search for streets in addresses on the web)
Street identification: Street|St\.|Avenue|Ave\.|Road|Rd\.
The name before it: [A-Z][a-z]*( [A-Z][a-z]*)*
House number: [0-9]+[A-Z]?

All together: '[0-9]+[A-Z]? [A-Z][a-z]*( [A-Z][a-z]*)* (Street|St\.|Avenue|Ave\.|Road|Rd\.)'

We are missing:
Boulevard, Place, Way
Streets without any identifier (almost all Czech streets)
Street names with numbers
. . .
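The combined pattern can be tried with Python's `re` module. This is a minimal sketch, assuming the street identifiers are meant to be grouped in parentheses so that the alternation applies only to them; the name `find_address` and the sample texts are made up for illustration.

```python
import re

STREET = re.compile(
    r"[0-9]+[A-Z]? "                         # house number, e.g. 123A
    r"[A-Z][a-z]*( [A-Z][a-z]*)* "           # capitalized name words
    r"(Street|St\.|Avenue|Ave\.|Road|Rd\.)"  # street identification
)

def find_address(text):
    """Return the first address-like substring of text, or None."""
    match = STREET.search(text)
    return match.group(0) if match else None
```

With these definitions, `find_address("Visit 10 Downing Street, London")` returns the substring `10 Downing Street`, while an address such as `742 Evergreen Terrace` is not found because its identifier is not in the list.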


Page 12: Notes to Closure Properties

Converting Among Representations

Converting NFA to DFA
λ-closure in O(n^3): search n states, multiplied by n^2 arcs for λ-transitions.
Subset construction: a DFA with possibly 2^n states. For each state, O(n^3) time to compute the transitions.
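The two steps, λ-closure and subset construction, can be sketched as follows. The sketch builds only the subsets reachable from the initial one; the dictionary encoding of the transition function, the use of `None` for λ and the function names are assumptions made for illustration.

```python
def lam_closure(states, delta):
    """All states reachable from `states` by lambda-edges (symbol None)."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, None), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return frozenset(seen)

def subset_construction(delta, start, accepting, alphabet):
    """delta maps (state, symbol) -> set of states."""
    start_set = lam_closure({start}, delta)
    dfa, seen, todo = {}, {start_set}, [start_set]
    while todo:
        S = todo.pop()
        for a in alphabet:
            step = {p for q in S for p in delta.get((q, a), ())}
            T = lam_closure(step, delta)
            dfa[S, a] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    finals = {S for S in seen if S & accepting}
    return dfa, start_set, finals

# NFA for "1 at the second-last or third-last position" (Example 4.2)
nfa = {("A", "0"): {"A"}, ("A", "1"): {"A", "B"},
       ("B", "0"): {"C"}, ("B", "1"): {"C"},
       ("C", "0"): {"D"}, ("C", "1"): {"D"}}
dfa, start, finals = subset_construction(nfa, "A", {"C", "D"}, "01")

def dfa_accepts(w):
    S = start
    for a in w:
        S = dfa[S, a]
    return S in finals
```

For this four-state NFA the construction reaches 8 subsets, each containing A together with one combination of B, C and D.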

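[Figure: conversion diagram among λ-NFA, NFA, DFA and RegE; the edges are labeled with the conversion costs O(n^3 2^n), O(n^3 2^n), O(n), O(n^3 4^n), O(n), O(n).]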

Converting DFA to NFA
Just modify the transition table by putting set-brackets around states and adding a column for λ in the case of a λ-NFA.

Automaton to Regular Expression Conversion
O(n^3 4^n)

RegE to Automaton Conversion
λ-NFA in time O(n).


Page 13: Notes to Closure Properties

String Substitution, (String) Homomorphism

Definition 4.3 (String Substitution, (String) Homomorphism)
We have a finite alphabet Σ. For each x ∈ Σ we have σ(x), a language over the alphabet Y_x. Further, we define:

σ(λ) = {λ}
σ(u.v) = σ(u).σ(v)

The mapping σ : Σ∗ → P(Y∗), where Y = ⋃_{x∈Σ} Y_x, is a substitution.
σ(L) = ⋃_{w∈L} σ(w)

An e-free (λ-free) substitution is a substitution where no σ(x) contains λ.
For w = a1 . . . an ∈ Σ^n we get σ(w) = σ(a1) . . . σ(an).

Example 4.4 (substitution)
σ(0) = {a^i b^j | i, j ≥ 0}, σ(1) = {cd}
σ(010) = {a^i b^j c d a^k b^l | i, j, k, l ≥ 0}


Page 14: Notes to Closure Properties

(String) Homomorphism

Definition 4.4 ((String) Homomorphism)
A homomorphism h is a special case of a substitution where each h(x) is a single string, h(x) = w_x for all x ∈ Σ. If w_x ≠ λ for all x, it is an e-free (λ-free) homomorphism.
Inverse homomorphism: h−1(L) = {w | h(w) ∈ L}.

Example 4.5 (homomorphism)
The function h defined by h(0) = ab and h(1) = λ is a homomorphism. For example, h(0011) = abab.
For L = 10∗1 we get h(L) = (ab)∗.
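Applying a homomorphism to a string is a symbol-by-symbol replacement. A minimal sketch, with the dictionary encoding of h and the name `apply_h` chosen for illustration:

```python
def apply_h(h, w):
    """Extend h from symbols to strings: h(a1...an) = h(a1)...h(an)."""
    return "".join(h[a] for a in w)

h = {"0": "ab", "1": ""}       # h(0) = ab, h(1) = lambda

image = apply_h(h, "0011")     # "abab"
# Words 1 0^k 1 of L = 10*1 are mapped onto (ab)^k.
images_of_L = [apply_h(h, "1" + "0" * k + "1") for k in range(4)]
```

Here `images_of_L` holds the images of the first four words of 10∗1, namely λ, ab, abab and ababab.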

Theorem (Closure under homomorphism)
If the language L and the languages σ(x) for all x ∈ Σ are regular, then σ(L), h(L) and h−1(L) are regular as well.


Page 15: Notes to Closure Properties

Homomorphism preserves regularity

Theorem 4.2
If L is a regular language over alphabet Σ, and h is a homomorphism on Σ, then h(L) is also regular.

Proof.
Let L = L(R) for some regular expression R. Let h(E) denote the expression obtained from E by replacing each symbol a with the string h(a). The proof is done by structural induction on sub-expressions E of R: we claim L(h(E)) = h(L(E)).

Basis: If E is λ or ∅, then h(E) = E, and h({λ}) = {λ}, h(∅) = ∅. If E = a then L(E) = {a}, so h(L(E)) = {h(a)}. Also h(E) is the expression h(a), thus L(h(E)) = {h(a)}.

Induction:
Union: L(h(F + G)) = L(h(F) + h(G)) = L(h(F)) ∪ L(h(G)) and h(L(F + G)) = h(L(F) ∪ L(G)) = h(L(F)) ∪ h(L(G)). The right sides are equal by the inductive hypothesis, therefore the left sides are equal as well.
Concatenation, closure: the proofs are similar.


Page 16: Notes to Closure Properties

Inverse Homomorphism

Definition 4.5 (Inverse homomorphism)
Suppose h is a homomorphism from some alphabet Σ to strings in another alphabet T. Then h−1(L), 'h inverse of L', is the set of strings w in Σ∗ such that h(w) is in L.

Example 4.6
Let L = (00 + 1)∗, h(a) = 01 and h(b) = 10.
We claim h−1(L) = (ba)∗.

Proof: h((ba)∗) ⊆ L is easy to see, since h(ba) = 1001.
Any other w generates an isolated 0 in h(w). There are 4 cases to consider: w begins with a, w ends with b, w contains aa, or w contains bb.
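The claim of the example can be checked by brute force on all short strings over {a, b}. A sketch using Python's `re`, where union is written `|`; the function name and the length bound are illustrative, and the check complements the proof rather than replacing it.

```python
import re
from itertools import product

h = {"a": "01", "b": "10"}
L = re.compile(r"(00|1)*")        # L = (00 + 1)*
claimed = re.compile(r"(ba)*")    # claimed value of the inverse image

def inverse_image_matches_claim(max_len=8):
    """Is w in h^-1(L) exactly when w is in (ba)*, for |w| <= max_len?"""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            image = "".join(h[c] for c in w)
            in_inverse = L.fullmatch(image) is not None
            if in_inverse != (claimed.fullmatch(w) is not None):
                return False
    return True
```

Membership in h−1(L) is tested directly from the definition: compute h(w) and ask whether it lies in L.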

[Figure: A homomorphism applied in the forward direction (from L to h(L)) and in the inverse direction (from L to h−1(L)).]


Page 17: Notes to Closure Properties

Inverse Homomorphism DFA

Theorem 4.3
If h is a homomorphism from alphabet Σ to strings over alphabet T, and L is a regular language over T, then h−1(L) is also a regular language.

Proof.
The proof starts with a DFA A for L. We construct a DFA for h−1(L).

For A = (Q, T, δ, q0, F) we define B = (Q, Σ, δ_B, q0, F) where δ_B(q, a) = δ∗(q, h(a)) (δ∗ operates on strings).
By induction on |w|, δ_B∗(q0, w) = δ∗(q0, h(w)).
Therefore, B accepts exactly those strings w that are in h−1(L).

[Figure: The DFA for h−1(L) applies h to its input w, and then simulates the DFA A for L on h(w) to accept or reject.]
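The construction of B amounts to precomputing δ∗(q, h(a)) for every state and symbol. The sketch below applies it to a hand-made DFA for (00 + 1)∗ with h(a) = 01 and h(b) = 10; the state names `e`, `o`, `d` and the function names are assumptions made for illustration.

```python
def run(delta, q, w):
    """Extended transition function: follow w from state q."""
    for a in w:
        q = delta[q, a]
    return q

def inverse_hom_dfa(delta, h):
    """Transition function of B: delta_B(q, a) = delta*(q, h(a))."""
    states = {q for q, _ in delta}
    return {(q, a): run(delta, q, h[a]) for q in states for a in h}

# DFA A for (00 + 1)*: e = start and accepting, o = after a single 0,
# d = dead state.
delta_A = {("e", "0"): "o", ("e", "1"): "e",
           ("o", "0"): "e", ("o", "1"): "d",
           ("d", "0"): "d", ("d", "1"): "d"}
h = {"a": "01", "b": "10"}
delta_B = inverse_hom_dfa(delta_A, h)
```

In the resulting automaton the only way to stay out of the dead state is to read b from `e` and then a from `o`, so B accepts (ba)∗ when `e` is both initial and accepting.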


Page 18: Notes to Closure Properties

Visit every state example

Example 4.7
Suppose A = (Q, Σ, δ, q0, F) is a DFA. Let L be the language of all strings w in Σ∗ such that δ∗(q0, w) is in F and for every state q ∈ Q there is some prefix x_q of w such that δ∗(q0, x_q) = q.
This language is regular.

M: M = L(A), the language accepted by the DFA A in the usual way.
T: We define a new alphabet T of triples {[paq] | p, q ∈ Q, a ∈ Σ, δ(p, a) = q}.
h: We define the homomorphism h([paq]) = a for all p, q, a.
L1: The language L1 = h−1(M) is regular since M is regular (DFA and inverse homomorphism).
For the automaton in the figure, h−1(101) includes 2^3 = 8 strings, like [p1p][q0q][p1p] ∈ {[p1p], [q1q]}{[p0q], [q0q]}{[p1p], [q1q]}.
We construct L from L1 (next slide).

[Figure: DFA with states p and q; loop 1 on p, edge 0 from p to q, loop 0,1 on q.]


Page 19: Notes to Closure Properties

L2: Enforce start at q0. Define E1 = ⋃_{a∈Σ, q∈Q} {[q0 a q]} = {[q0 a1 q0], [q0 a2 q1], . . . , [q0 am qn]}. Then L2 = L1 ∩ L(E1.T∗).

L3: Adjacent states must be equal. Define the non-matching pairs E2 = ⋃_{q≠r; p,q,r,s∈Q; a,b∈Σ} {[paq][rbs]}. Define L3 = L2 − L(T∗.E2.T∗).
The path ends in an accepting state since we started from M, the language of accepting computations on the DFA A.

L4: All states. For each state q ∈ Q, define E_q to be the regular expression that is the sum of all the symbols in T such that q appears in neither its first nor its last position. We subtract L(E_q∗) from L3: L4 = L3 − ⋃_{q∈Q} L(E_q∗).

L: Remove states, leave symbols. L = h(L4). We conclude L is regular.

In brief:

M = L(A)
L1 = h−1(M) ⊆ {[qap]}∗ (inverse homomorphism)
L2: + start in q0 (intersection with a regular language)
L3: + adjacent states equal (difference with a regular language)
L4: + all states on the path (difference with a regular language)
L = h(L4), h([qap]) = a (homomorphism)


Page 20: Notes to Closure Properties

Decision Properties of Regular Languages

Lemma (Testing Emptiness of Regular Languages)
For finite automata, it is a question of graph reachability of any final state from the initial one. Reachability is O(n^2).

Lemma
For a regular expression, we can convert it to a λ-NFA in O(n) time and then check reachability.

It can also be done by direct inspection:
Basis: ∅ denotes the empty language; λ and a are not empty.
Induction:
R = R1 + R2 is empty iff both L(R1) and L(R2) are empty.
R = R1R2 is empty iff either L(R1) or L(R2) is empty.
R = R1∗ is never empty, it includes λ.
R = (R1) is empty iff R1 is empty, since they are the same language.
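The direct inspection is a one-pass recursion over the expression. A minimal sketch, assuming a tuple encoding of expressions chosen for illustration; brackets need no case of their own because the nesting of tuples replaces them.

```python
# Hypothetical encoding of a RegE as nested tuples:
#   ("empty",)  ("lam",)  ("sym", a)  ("+", r, s)  (".", r, s)  ("*", r)

def is_empty(r):
    """Decide whether L(r) is the empty language."""
    kind = r[0]
    if kind == "empty":
        return True
    if kind in ("lam", "sym"):
        return False
    if kind == "+":
        return is_empty(r[1]) and is_empty(r[2])
    if kind == ".":
        return is_empty(r[1]) or is_empty(r[2])
    if kind == "*":
        return False                 # the star always contains lambda
    raise ValueError(kind)

a_then_nothing = (".", ("sym", "a"), ("empty",))   # a.∅
```

Each node of the expression is visited once, so the test is linear in the size of the expression.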


Page 21: Notes to Closure Properties

Testing Membership in a Regular Language

Given a string w, |w| = n, and a regular language L, is w ∈ L?

DFA: Run the automaton on w. With a suitable representation and constant-time transitions, it takes O(n).
NFA with s states: running time O(ns^2). Each input symbol can be processed by taking the previous set of states, which contains at most s states, and following the transitions from each of them.
λ-NFA: first compute the λ-closure. Then process each symbol and compute the λ-closure of the result.
For a regular expression of size s we convert it to a λ-NFA with at most 2s states and then simulate it, taking O(ns^2).
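The NFA case can be sketched as a loop that keeps the current set of states. The dictionary encoding of the transition function and the name `nfa_accepts` are assumptions of this sketch; the sample automaton is the NFA for a 1 in the second-last or third-last position (Example 4.2).

```python
def nfa_accepts(delta, start, accepting, w):
    """Simulate an NFA without lambda-edges on the word w."""
    current = {start}
    for a in w:
        # At most s states, each with at most s successors: O(s^2) per symbol.
        current = {p for q in current for p in delta.get((q, a), ())}
    return bool(current & accepting)

nfa = {("A", "0"): {"A"}, ("A", "1"): {"A", "B"},
       ("B", "0"): {"C"}, ("B", "1"): {"C"},
       ("C", "0"): {"D"}, ("C", "1"): {"D"}}

second_last = nfa_accepts(nfa, "A", {"C", "D"}, "0110")   # 1 at second-last
no_late_one = nfa_accepts(nfa, "A", {"C", "D"}, "1000")   # only a leading 1
```

Unlike the subset construction, the simulation never builds the DFA, so it avoids the possible 2^s blow-up at the price of O(s^2) work per input symbol.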
