Regular Grammars


Formal definition of a regular expression. Languages associated with regular expressions. Introduction to regular grammars. Regular languages and homomorphisms. The Chomsky hierarchy.


Regular Expression

The regular expressions over a set I are defined recursively by:
• the symbol ∅ is a regular expression;
• the symbol λ is a regular expression;
• the symbol x is a regular expression whenever x ∈ I;
• the symbols (AB), (A ∪ B), and A* are regular expressions whenever A and B are regular expressions.

Each regular expression represents a set of strings, as follows:
• ∅ represents the empty set, that is, the set with no strings;
• λ represents the set {λ}, which contains only the empty string;
• x represents the set {x} containing the string with one symbol x;
• (AB) represents the concatenation of the sets represented by A and by B;
• (A ∪ B) represents the union of the sets represented by A and by B;
• A* represents the Kleene closure of the set represented by A.


Example

What are the strings in the regular sets specified by the regular expressions 10*, (10)*, 0 ∪ 01, 0(0 ∪ 1)*, and (0*1)*?
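One way to explore these regular sets is to list their shortest members. The sketch below is only an illustration: it translates each expression into Python's `re` syntax (assuming ∪ is written as `|`) and lists every bit string of length at most 4 that matches in full. The helper name `members` and the length bound are assumptions of this example, not part of the slides.

```python
import re
from itertools import product

def members(pattern, max_len=4, alphabet="01"):
    """List all strings over `alphabet` of length <= max_len that fully match `pattern`."""
    found = []
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if re.fullmatch(pattern, s):
                found.append(s)
    return found

# Slide notation -> Python re syntax (union is written "|").
expressions = {
    "10*": "10*",
    "(10)*": "(10)*",
    "0 ∪ 01": "0|01",
    "0(0 ∪ 1)*": "0(0|1)*",
    "(0*1)*": "(0*1)*",
}

for name, pattern in expressions.items():
    print(name, "->", members(pattern))
```

The listing suggests the pattern behind each set, for example that 10* gives a 1 followed by any number of 0s, and that (0*1)* gives the empty string together with every bit string ending in 1.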


Example

Find a regular expression that specifies each of these sets:

(a) the set of bit strings with even length

(b) the set of bit strings ending with a 0 and not containing 11

a) The set of strings of two bits is specified by the regular expression (00 ∪ 01 ∪ 10 ∪ 11). Consequently, the set of strings with even length is specified by (00 ∪ 01 ∪ 10 ∪ 11)*.

b) A bit string that ends with a 0 and does not contain 11 must be the concatenation of one or more strings, each of which is either 0 or 10. It follows that the regular expression (0 ∪ 10)*(0 ∪ 10) specifies the set of bit strings that do not contain 11 and end with a 0.
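Both answers can be sanity-checked by brute force. The sketch below assumes Python's `re` syntax (∪ written as `|`) and compares each expression with a direct test of the stated property on every bit string up to a chosen length. The names `bit_strings` and `agrees` and the bound of 10 are assumptions of this example. Agreement on short strings is evidence, not a proof.

```python
import re
from itertools import product

EVEN_LENGTH = "(00|01|10|11)*"    # answer (a)
ENDS_0_NO_11 = "(0|10)*(0|10)"    # answer (b)

def bit_strings(max_len):
    """Yield every bit string of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            yield "".join(chars)

def agrees(pattern, predicate, max_len=10):
    """True when the pattern and the predicate accept exactly the same short strings."""
    return all(
        bool(re.fullmatch(pattern, s)) == predicate(s)
        for s in bit_strings(max_len)
    )

print(agrees(EVEN_LENGTH, lambda s: len(s) % 2 == 0))
print(agrees(ENDS_0_NO_11, lambda s: s.endswith("0") and "11" not in s))
```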


a) symbol ∅;
b) symbol λ;
c) symbol a whenever a ∈ I;


Construct a nondeterministic finite-state automaton that recognizes the regular set 1* ∪ 01.
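One possible automaton for this set can be written down and simulated directly. The sketch below is an assumption of this example and not necessarily the machine drawn on the original slide: state s0 is the start state, s1 loops on 1 to cover 1*, and the path s0 → s2 → s3 reads 01. The state names and the function `accepts` are hypothetical.

```python
START = "s0"
ACCEPTING = {"s0", "s1", "s3"}   # s0 accepts the empty string, which is in 1*

# (state, input symbol) -> set of possible next states
TRANSITIONS = {
    ("s0", "1"): {"s1"},
    ("s1", "1"): {"s1"},
    ("s0", "0"): {"s2"},
    ("s2", "1"): {"s3"},
}

def accepts(string):
    """Simulate the NFA by tracking the set of states it could be in."""
    current = {START}
    for symbol in string:
        current = {
            nxt
            for state in current
            for nxt in TRANSITIONS.get((state, symbol), set())
        }
    return bool(current & ACCEPTING)

for s in ["", "1", "111", "01", "0", "011", "10"]:
    print(repr(s), accepts(s))
```

A missing entry in the transition table means the automaton has no move, so the set of current states becomes empty and the string is rejected.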


Languages associated with regular expressions

Definition: The language L(r) denoted by any regular expression r is defined by the following rules.
1) ∅ is a regular expression denoting the empty set,
2) λ is a regular expression denoting {λ},
3) for every a ∈ Σ, a is a regular expression denoting {a}.

If r1 and r2 are regular expressions, then
1) L(r1 + r2) = L(r1) ∪ L(r2),
2) L(r1.r2) = L(r1)L(r2),
3) L((r1)) = L(r1),
4) L(r1*) = (L(r1))*.

Here + plays the role that ∪ played in the earlier notation, and the dot denotes concatenation.
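The rules above translate almost line by line into a recursive function. The following is a minimal sketch under stated assumptions: regular expressions are encoded as nested tuples (an encoding invented for this example), and because L(r) may be infinite, only strings up to a length bound are returned. The function name `language` is hypothetical. Rule 3 needs no case of its own, since parentheses only group and the tuple nesting already records the grouping.

```python
def language(r, max_len):
    """Return the strings of L(r) that have length <= max_len.

    r is a nested tuple (an encoding assumed for this sketch):
      ("empty",), ("lambda",), ("sym", "a"),
      ("union", r1, r2), ("concat", r1, r2), ("star", r1)
    """
    kind = r[0]
    if kind == "empty":                      # the empty set
        return set()
    if kind == "lambda":                     # {λ}
        return {""}
    if kind == "sym":                        # {a}
        return {r[1]} if max_len >= 1 else set()
    if kind == "union":                      # L(r1 + r2) = L(r1) ∪ L(r2)
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "concat":                     # L(r1.r2) = L(r1)L(r2)
        left, right = language(r[1], max_len), language(r[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if kind == "star":                       # L(r1*) = (L(r1))*
        base = language(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:                      # append base strings until nothing new fits
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node: {kind!r}")

a, b = ("sym", "a"), ("sym", "b")
r = ("concat", ("star", a), ("union", a, b))   # a*.(a + b)
print(sorted(language(r, 3), key=lambda s: (len(s), s)))
# ['a', 'b', 'aa', 'ab', 'aaa', 'aab']
```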


Example: Exhibit the language L(a*.(a + b)) in set notation.

Solution :

L(a*.(a + b)) = L(a*)L(a + b) (from L(r1.r2) = L(r1)L(r2))

= (L(a))* L(a + b) (from L(r1*) = (L(r1))*)

= (L(a))* (L(a) ∪ L(b)) (from L(r1 + r2) = L(r1) ∪ L(r2))

But (L(a))* = {λ, a, aa, aaa, …},

L(a) = {a} and L(b) = {b}, so L(a) ∪ L(b) = {a, b}.

L(a*.(a + b)) = {λ, a, aa, aaa, …}{a, b}

= {a, b, aa, ab, aaa, aab, …}.


Example: For Σ = {a, b}, the expression r = (a + b)*(a + bb) is a regular expression. Write its language.

Solution: (It is easy to check that r is a regular expression.)

L(r) = L((a + b)*(a + bb))
= L((a + b)*) L(a + bb)
= (L(a + b))* (L(a) ∪ L(bb))
= (L(a) ∪ L(b))* (L(a) ∪ L(bb))
= {a, b}* {a, bb}

But {a, b}* = {λ, a, b, aa, ab, ba, bb, aaa, …} is the set of all strings on {a, b}. Note that this is larger than {a}* ∪ {b}*, since it also contains mixed strings such as ab. Also, L(a) ∪ L(bb) = {a, bb}.

So L((a + b)*(a + bb)) = {λ, a, b, aa, ab, ba, bb, …}{a, bb} = {a, bb, aa, abb, ba, bbb, …}.

In other words, L(r) is the set of all strings on {a, b} terminated by either a or bb.


Example: Write the language for the expression r = (aa)*(bb)*b.

Solution:
L(r) = L((aa)*(bb)*b)
= L((aa)*) L((bb)*) L(b)
= (L(aa))* (L(bb))* L(b)
= {aa}*{bb}*{b}
= {λ, aa, aaaa, aaaaaa, …}{λ, bb, bbbb, bbbbbb, …}{b}
= {a^(2n) : n ≥ 0}{b^(2m) : m ≥ 0}{b}
= {a^(2n) b^(2m+1) : n ≥ 0, m ≥ 0}
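The closed form {a^(2n) b^(2m+1) : n ≥ 0, m ≥ 0} can be compared with the expression itself on short strings. The sketch below assumes Python's `re` syntax and a length bound of 7, and uses the hypothetical helper names `closed_form` and `matched`. It builds the set once from the formula and once by matching, then checks that the two agree.

```python
import re
from itertools import product

PATTERN = "(aa)*(bb)*b"
MAX_LEN = 7

def closed_form(max_len):
    """Strings a^(2n) b^(2m+1) with n, m >= 0 and total length <= max_len."""
    return {
        "a" * (2 * n) + "b" * (2 * m + 1)
        for n in range(max_len)
        for m in range(max_len)
        if 2 * n + 2 * m + 1 <= max_len
    }

def matched(pattern, max_len):
    """Strings over {a, b} of length <= max_len that fully match the pattern."""
    return {
        "".join(chars)
        for k in range(max_len + 1)
        for chars in product("ab", repeat=k)
        if re.fullmatch(pattern, "".join(chars))
    }

print(closed_form(MAX_LEN) == matched(PATTERN, MAX_LEN))
print(sorted(closed_form(3), key=lambda s: (len(s), s)))
```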


Regular Grammars

Regular grammars are of two types:

1) Right-linear grammar: A grammar G = (V, T, S, P) is said to be right-linear if all productions are of the form
A → xB, A → x,
where A, B ∈ V and x ∈ T*.

2) Left-linear grammar: A grammar G = (V, T, S, P) is said to be left-linear if all productions are of the form
A → Bx, A → x,
where A, B ∈ V and x ∈ T*.

Here:
V: finite set of non-terminals (upper case)
T: finite set of terminals (lower case)
S: start symbol
P: finite set of rewriting rules (productions); in the right-linear case they have the form A → xB or A → x, where A and B stand for non-terminals and x stands for a string of terminals


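Example:
1) The grammar G1 = ({S}, {a, b}, S, P1), with P1 given as S → abS | a, is right-linear.
2) The grammar G2 = ({S, S1, S2}, {a, b}, S, P2), with productions S → S1ab, S1 → S1ab | S2, S2 → a, is left-linear.
Both G1 and G2 are regular grammars.

Example: Write the regular expressions for the languages generated by these grammars.
1) S ⇒ abS ⇒ ababS ⇒ ababa is a derivation in G1, and L(G1) is denoted by r = (ab)*a.
2) S ⇒ S1ab ⇒ S1abab ⇒ S2abab ⇒ aabab is a derivation in G2. Because S → S1ab must be applied before S1 can appear, every generated string is an a followed by at least one ab, so L(G2) is denoted by r = aab(ab)*.

Example: The grammar G = ({S, A, B}, {a, b}, S, P), with productions S → A, A → aB | λ, B → Ab. Is it a regular grammar?
Solution: It is not a regular grammar, because it is neither right-linear nor left-linear: A → aB is in right-linear form while B → Ab is in left-linear form, and a regular grammar must use one form throughout. (It is an example of a linear grammar, and the language it generates, {a^n b^n : n ≥ 0}, is in fact not regular.)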
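The right-linear and left-linear conditions are purely syntactic, so they can be checked mechanically. The sketch below is one way to do it, under an encoding assumed for this example: a production is a pair (left side, tuple of right-side symbols), with the empty tuple standing for λ. The function names are hypothetical.

```python
def is_right_linear(productions, nonterminals):
    """Forms A -> xB or A -> x: a non-terminal may appear only as the last symbol."""
    return all(
        all(sym not in nonterminals for sym in rhs[:-1])
        for _, rhs in productions
    )

def is_left_linear(productions, nonterminals):
    """Forms A -> Bx or A -> x: a non-terminal may appear only as the first symbol."""
    return all(
        all(sym not in nonterminals for sym in rhs[1:])
        for _, rhs in productions
    )

def is_regular_grammar(productions, nonterminals):
    return (is_right_linear(productions, nonterminals)
            or is_left_linear(productions, nonterminals))

# G1: S -> abS | a
G1 = [("S", ("a", "b", "S")), ("S", ("a",))]
# G2: S -> S1ab, S1 -> S1ab | S2, S2 -> a
G2 = [("S", ("S1", "a", "b")), ("S1", ("S1", "a", "b")),
      ("S1", ("S2",)), ("S2", ("a",))]
# G: S -> A, A -> aB | λ, B -> Ab
G = [("S", ("A",)), ("A", ("a", "B")), ("A", ()), ("B", ("A", "b"))]

print(is_right_linear(G1, {"S"}))               # True
print(is_left_linear(G2, {"S", "S1", "S2"}))    # True
print(is_regular_grammar(G, {"S", "A", "B"}))   # False
```

Each production of G passes one of the two tests on its own, but no single test passes for all of them, which is why the grammar as a whole is not regular.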


Homomorphism: Suppose Σ and T are alphabets. Then a function h : Σ → T* is called a homomorphism. In words, a homomorphism is a substitution in which a single letter is replaced with a string. The domain of the function h is extended to strings in the obvious fashion: if w = a1a2a3…an, then h(w) = h(a1)h(a2)h(a3)…h(an).

Remark: If L is a language on Σ, then its homomorphic image is defined as h(L) = {h(w) : w ∈ L}.


Example: Let Σ = {a, b} and T = {a, b, c}, and define h by h(a) = ab, h(b) = bbc. Find the homomorphic image h(L) of L = {aa, aba}.
Solution:

• h(aa) = abab,
• h(aba) = abbbcab.

The homomorphic image of L = {aa, aba} is the language h(L) = {abab, abbbcab}.

Example: Let Σ = {a, b} and T = {b, c, d}, and define h by h(a) = dbcc, h(b) = bdc. If L is the regular language denoted by r = (a + b*)(aa)*, find the regular language h(L).
Solution: Replace each symbol of r by its image under h. Since r = (a + b*)(aa)*, the expression r' = (dbcc + (bdc)*)(dbccdbcc)* denotes the regular language h(L).
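Both examples can be replayed in a few lines. In the sketch below a homomorphism is stored as a dictionary from letters to strings, an encoding assumed for this example, and the names `apply_h` and `image` are hypothetical. The second part checks only one direction, namely that the image of each short string of L(r) matches r', using Python's `re` syntax with + written as `|`.

```python
import re
from itertools import product

def apply_h(h, w):
    """Extend h from letters to strings: h(a1...an) = h(a1)...h(an)."""
    return "".join(h[symbol] for symbol in w)

def image(h, language):
    """Homomorphic image h(L) = {h(w) : w in L} of a finite language."""
    return {apply_h(h, w) for w in language}

# First example: h(a) = ab, h(b) = bbc, L = {aa, aba}
h1 = {"a": "ab", "b": "bbc"}
print(sorted(image(h1, {"aa", "aba"})))   # ['abab', 'abbbcab']

# Second example: h(a) = dbcc, h(b) = bdc
h2 = {"a": "dbcc", "b": "bdc"}
r = "(a|b*)(aa)*"
r_image = "(dbcc|(bdc)*)(dbccdbcc)*"

# All strings of L(r) with length at most 6
samples = [
    "".join(chars)
    for n in range(7)
    for chars in product("ab", repeat=n)
    if re.fullmatch(r, "".join(chars))
]
print(all(re.fullmatch(r_image, apply_h(h2, w)) for w in samples))
```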


The Chomsky Hierarchy

Noam Chomsky, a founder of formal language theory, provided an initial classification of languages into four types, type 0 to type 3:

Type 0: the languages generated by unrestricted grammars, that is, the recursively enumerable languages. Denoted L_RE.
Type 1: the context-sensitive languages. Denoted L_CS.
Type 2: the context-free languages. Denoted L_CF.
Type 3: the regular languages. Denoted L_REG.

The relationship between these types is shown in the diagram. It is clear that L_REG ⊆ L_CF ⊆ L_CS ⊆ L_RE.


Homework

Q1: Find all strings in L((a + b)*b(a + ab)*) of length less than four.
Q2: If r = ((0 + 1)(0 + 1)*)*00(0 + 1)*, give the language L(r).
Q3: Give regular expressions for the following languages on {a, b, c}.
a) All strings containing exactly one a.
b) All strings containing no more than three a's.
c) All strings that contain at least one occurrence of each symbol in a given set.
Q4: Find regular grammars that generate the languages L(aa*(ab + a)*) and L((aab)*ab).
Q5: What are the strings generated by the regular expressions 10*, (10)*, (0 + 01), 0(0 + 1)*, and (0*1)*?
Q6: Solve questions 3, 4, 5, and 6 on page DMA-826.