Chapter 12: Regular Expressions and Finite-State Automata March 10, 2008

Outline

1 12.1 Formal Languages and Regular Expressions

2 12.2 Finite-State Automata

3 7.3 Pigeonhole Principle

4 5.4 Russell’s Paradox and the Halting Problem

5 7.5 Cardinality of Sets

Formal Languages

Definition. An alphabet Σ is a finite set of characters (symbols).

Examples

1 ΣE = {a, b, c, ..., X, Y, Z} is the usual alphabet for the English language.

2 Computer languages use a slightly richer alphabet, which is called ASCII,

ΣASCII = ΣE ∪ {!, @, $, ..., ?}

3 The real language of computers is based on the binary alphabet

Σ0 = {0, 1}

Definition. A string over an alphabet Σ is either
(a) the empty string ε; or
(b) any ordered n-tuple of elements from Σ, written without commas or parentheses.

Examples

1 If Σ = Σ0, then ε, 0, 1, 00, 01, 10, 11, 011000101, ... are all strings over Σ.

2 If Σ = ΣE, then “in”, “fvedwyf”, “string” are all strings over Σ.

Definition. The length of a string is the number of characters that appear in the string. By convention, the length of the empty string ε is zero.

Examples

1 Over the alphabet Σ0 = {0, 1}, the string

011000101

has length 9.

2 Over ΣE, the usual English alphabet, the string “string” has length 6.

Definition. Suppose Σ is an alphabet and n is a non-negative integer. Then
• Σ^n = the set of all strings over Σ that have length n
• Σ^+ = the set of all strings over Σ whose length is at least 1
• Σ∗ = the set of all strings over Σ

It is not difficult to see that

Σ∗ = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ ... ∪ Σ^n ∪ ...

Example. Suppose the alphabet is Σ = Σ0 = {0, 1}. Then

1 Σ^0 = {ε},
2 Σ^1 = {0, 1},
3 Σ^2 = {00, 01, 10, 11},
4 Σ^3 = {000, 001, 010, 100, 011, 101, 110, 111},

etc.
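
The sets of strings of a fixed length can also be listed mechanically. The following is a minimal Python sketch, not part of the original slides; the function name strings_of_length is made up for illustration, and the empty string ε is represented by "".

```python
from itertools import product

def strings_of_length(alphabet, n):
    """Return all strings of length n over the alphabet, in lexicographic order."""
    return ["".join(t) for t in product(sorted(alphabet), repeat=n)]

sigma = {"0", "1"}  # the binary alphabet
for n in range(4):
    # For n = 0 the only string is the empty string "".
    print(n, strings_of_length(sigma, n))
```

An alphabet with m symbols has m^n strings of length n, which is why the lists double in size at each step here.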

Definition. Given an alphabet Σ, a formal language over Σ is any fixed subset L of Σ∗. The members of L are called the words of the language.

Examples

(a) Given the usual English alphabet ΣE ,

L = {English words}

(b) Given Σ0 = {0, 1},

L = {x ∈ Σ∗ | x ends in 11} = {11, 011, 111, 0011, 1011, . . .}

(c) Given Σ = {a, b}, a palindrome is a word that is equal to its reverse:

L = {x ∈ Σ∗ | x is a palindrome}

= {ε, a, b, aa, bb, aaa, aba, bab, bbb, abba, ...}

Operations on Formal Languages

Definition. If Σ is an alphabet, and x and y are two strings over Σ, then the concatenation of x and y is the string obtained by writing the characters of x followed by the characters of y.

Example. For Σ = {0, 1}, consider

x = 010, y = 11011

Their concatenation is

xy = 01011011

Notice that, in general, the concatenation of strings is not commutative; i.e.

xy ≠ yx

(here, yx = 11011010).

Definition. For any languages L and L′ over some alphabet Σ, we can define new languages as follows:

• Concatenation of L and L′:

LL′ = {xy | x ∈ L and y ∈ L′}

• Union of L and L′:

L ∪ L′ = {x | x ∈ L or x ∈ L′}

• Kleene closure of L:

L∗ = {x | x is a concatenation of a finite number of words from L}

[ε is in L∗ since it is a concatenation of zero strings from L.]

Example. Let

L1 = {0, 01}, L2 = {1}

Then
• L1L2 = {01, 011}
• L2L1 = {10, 101}
• L1 ∪ L2 = {0, 01, 1}
• L1∗ = {ε, 0, 01, 00, 001, 010, 0000, 0100, 01001, ...}
• L2∗ = {ε, 1, 11, 111, 1111, ...}
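
The three operations can be tried out on finite languages stored as Python sets. This is only an illustrative sketch: the names concat and kleene_upto are assumptions, and because L∗ is usually infinite, the closure is cut off at a chosen maximum word length.

```python
def concat(L, M):
    """Concatenation LM = {xy | x in L and y in M}."""
    return {x + y for x in L for y in M}

def kleene_upto(L, max_len):
    """Words of L* whose length is at most max_len."""
    result = {""}      # the empty string: a concatenation of zero words
    frontier = {""}
    while frontier:
        # Append one more word of L to every word found in the last round.
        frontier = {w for w in concat(frontier, L) if len(w) <= max_len} - result
        result |= frontier
    return result

L1, L2 = {"0", "01"}, {"1"}
print(sorted(concat(L1, L2)))        # L1L2
print(sorted(concat(L2, L1)))        # L2L1
print(sorted(L1 | L2))               # union
print(sorted(kleene_upto(L2, 4)))    # words of L2* up to length 4
```

Union needs no helper, since it is the built-in set union.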

Regular Expressions

Definition (Regular Expression). Suppose Σ is a finite alphabet. Then the following are regular expressions over Σ:

I BASE: ∅, ε, and each individual symbol from Σ.

II RECURSION: If r and s are regular expressions over Σ, then so are:
(a) (rs)
(b) (r ∨ s) (also written as (r + s))
(c) (r∗)

III RESTRICTION: Nothing else is a regular expression over Σ, except for the objects from I and II.

Examples

(a) If Σ = {a, b},

((a ∨ b)∗(a ∨ ε))

is a regular expression over Σ.

(b) If Σ = {0, 1}, one example of a regular expression over Σ is

((0∗)(1∗))

• If the context is clear, we can omit unnecessary brackets; e.g. the two examples above can be written as:

(a ∨ b)∗(a ∨ ε), 0∗1∗

• The operation of highest priority is the Kleene closure ∗, followed by concatenation, while ∨ has the lowest priority.

• Regular expressions define certain languages, which are called regular languages.

• Given a regular expression r over some fixed finite alphabet Σ, what is the language L(r) of words from Σ∗ defined by r?

Definition. For any finite alphabet Σ, we can associate a language L(r) with each regular expression r over Σ. L(r) is called the language defined by r.

I BASE: L(∅) = ∅, L(ε) = {ε}, and L(a) = {a}, for every a ∈ Σ.

II RECURSION: If L(r), L(r1) and L(r2) are the languages defined by regular expressions r, r1 and r2 over Σ, then
(a) L(r1r2) = L(r1)L(r2)
(b) L(r1 ∨ r2) = L(r1) ∪ L(r2)
(c) L(r∗) = (L(r))∗
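
The BASE and RECURSION clauses translate almost line by line into a recursive function. The sketch below is an illustration under stated assumptions: regular expressions are encoded as nested tuples (an encoding chosen here, not taken from the slides), and only the words of L(r) up to a given length are computed, since L(r) may be infinite.

```python
def lang(r, max_len):
    """Words of L(r) of length at most max_len, for r encoded as a nested tuple."""
    kind = r[0]
    if kind == "empty":                       # L(∅) = ∅
        return set()
    if kind == "eps":                         # L(ε) = {ε}
        return {""}
    if kind == "sym":                         # L(a) = {a}
        return {r[1]} if max_len >= 1 else set()
    if kind == "or":                          # L(r1 ∨ r2) = L(r1) ∪ L(r2)
        return lang(r[1], max_len) | lang(r[2], max_len)
    if kind == "cat":                         # L(r1 r2) = L(r1) L(r2)
        left, right = lang(r[1], max_len), lang(r[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if kind == "star":                        # L(r*) = (L(r))*
        base = lang(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node: {kind}")

# 0*1* written in the tuple encoding
zero_star_one_star = ("cat", ("star", ("sym", "0")), ("star", ("sym", "1")))
print(sorted(lang(zero_star_one_star, 2)))
```

Truncating the sub-languages at max_len loses nothing, because every part of a word of length at most max_len is itself no longer than max_len.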

Examples. Find regular expressions for the following languages:

(a) L = {x ∈ {0, 1}∗ | x begins with a 1}

1(0 ∨ 1)∗

(b) All strings with at least one 1

(0 ∨ 1)∗1(0 ∨ 1)∗

(c) All strings of length two or three over the alphabet Σ = {x, y, z}

(x ∨ y ∨ z)(x ∨ y ∨ z)(ε ∨ x ∨ y ∨ z)

(d) All strings over {0, 1} which have no repeated 1’s.

(10 ∨ 0)∗(ε ∨ 1)

(e) All strings in which the number of 1’s is even.

(0 ∨ 10∗1)∗
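
Answers like these can be sanity-checked by brute force. The sketch below uses Python's re module, where ∨ is written | and (ε ∨ 1) is written 1?; the helper names all_strings and agrees are made up for illustration. It compares each expression with a direct description of the language on all binary strings up to length 8, which is evidence rather than a proof.

```python
import re
from itertools import product

def all_strings(alphabet, max_len):
    """Yield every string over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            yield "".join(t)

def agrees(pattern, predicate, alphabet="01", max_len=8):
    """True if the pattern matches exactly the strings satisfying the predicate."""
    rx = re.compile(pattern)
    return all((rx.fullmatch(w) is not None) == predicate(w)
               for w in all_strings(alphabet, max_len))

print(agrees(r"1(0|1)*", lambda w: w.startswith("1")))         # (a)
print(agrees(r"(0|1)*1(0|1)*", lambda w: "1" in w))            # (b)
print(agrees(r"(10|0)*(1?)", lambda w: "11" not in w))         # (d)
print(agrees(r"(0|10*1)*", lambda w: w.count("1") % 2 == 0))   # (e)
```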

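Examples. Describe the languages that correspond to the following regular expressions:

(a) (0 ∨ 1)∗1

L = {x ∈ {0, 1}∗ | x ends in a 1}

(b) (a ∨ b)∗c(a ∨ b)∗c(a ∨ b)∗, where Σ = {a, b, c}

All strings with exactly two c’s.

(c) ((a ∨ b)∗c(a ∨ b)∗c(a ∨ b)∗)∗, where Σ = {a, b, c}

The empty string, together with all strings that contain a positive even number of c’s. (A nonempty string with no c’s, such as ab, has an even number of c’s but is not matched, because every nonempty match is a concatenation of blocks that each contain exactly two c’s.)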

Example. Determine, in each case, whether the two regular expressions define the same language.

(a) (a ∨ ε)∗ and a∗.

Yes. Since aε = a = εa, the language defined by the first expression is the set of all strings that result from concatenating a with itself a finite number of times, which is the same language as the one defined by the second regular expression.

(b) 0∗ ∨ 1∗ and (01)∗.

No. For example, the string 00 is in the language defined by the first expression, but not in the second language.
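
A quick way to look for a separating string is to test both expressions on all short strings. This is a hedged sketch using Python's re module (a? stands for (a ∨ ε), and | for ∨); the function name first_difference is an assumption. Finding no difference up to a given length suggests, but does not prove, that the languages are equal.

```python
import re
from itertools import product

def first_difference(p1, p2, alphabet, max_len=6):
    """Return a shortest string matched by exactly one pattern, or None."""
    r1, r2 = re.compile(p1), re.compile(p2)
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            w = "".join(t)
            if (r1.fullmatch(w) is None) != (r2.fullmatch(w) is None):
                return w
    return None

print(first_difference(r"(a?)*", r"a*", "ab"))      # part (a): no difference found
print(first_difference(r"0*|1*", r"(01)*", "01"))   # part (b): a separating string
```

For part (b) the search already separates the two languages at length one, which is an even shorter witness than the string 00 used above.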

Finite-State Automata

• A finite-state automaton is an idealized (theoretical) version of a sequential computational circuit.

• Roughly speaking, a finite-state automaton is a machine whose memory can store a finite amount of information regarding its prior input, and based on that information and the current input symbol, one can predict its output.

Definition. A finite-state automaton A consists of five objects:

1 A set I, called the input alphabet (or, input symbols)
2 A set S of states that the automaton can be in
3 A designated state s0 called the initial state
4 A designated set of states called the accepting (or, terminal) states
5 A next-state (or, transition) function

N : S × I → S

which, based on the current state s that the automaton is in and the current input symbol m, computes the next state s′ the automaton will be in:

N(s, m) = s′.

• The easiest way to visualize a finite-state automaton is to use the so-called transition diagram.

• In such a diagram, states are represented by circles and accepting states by double circles. There is an arrow that points to the initial state, and other arrows are labelled with input symbols and point from each state to other states in the following way:

if N(s, m) = s′, then there is an arrow labelled by m pointing from s to s′.

Example. Consider the finite-state automaton A defined by the following transition diagram:

[Transition diagram: states s0, s1, s2, with arrows labelled 0 and 1.]

(a) The states are S = {s0, s1, s2}.
(b) The input symbols are I = {0, 1}.
(c) The initial state is s0.
(d) The only accepting state is s2.
(e) The next-state function is given by the following table:

N    0    1
s0   s1   s0
s1   s1   s2
s2   s1   s0

Example. Consider the finite-state automaton A defined by the following transition table:

N    a    b    c
U    Z    Y    Y
V    V    V    V
Y    Z    V    Y
Z    Z    Z    Z

and suppose the initial state is U, while the accepting states are V and Z.

Draw the transition diagram for this automaton.

[Transition diagram: states U, V, Y, Z, with one arrow labelled a, b or c for each entry of the transition table.]

• We can simplify this diagram by condensing all arrows pointing from one state to some other fixed state into a single arrow with several labels:

[Simplified transition diagram: U --a--> Z, U --b,c--> Y, Y --a--> Z, Y --b--> V, Y --c--> Y, V --a,b,c--> V, Z --a,b,c--> Z.]

The Language Accepted by an Automaton

• Suppose a string of input symbols

s1s2s3 ... sn ...

is being fed into a finite-state automaton.

• After reading each input symbol si, the automaton moves to its next state, and it ends up in either an accepting or a non-accepting state.

• In this way, the automaton separates the set of all input strings into two subsets: those that force the automaton into an accepting state and those that do not.

• Those strings that send the automaton into an accepting state are said to be accepted by the automaton.

Definition. Let A be a finite-state automaton with the set of input symbols I. Let I∗ be the set of all strings over the alphabet I. Suppose w is a string in I∗. Then w is accepted by A if, and only if, A goes into an accepting state when the symbols of w are input into A in sequence from left to right, with A starting from its initial state.

The language accepted by A, L(A), is the set of all strings accepted by A.

Example. Consider the finite-state automaton from our original example:

[Transition diagram: states s0, s1, s2, with arrows labelled 0 and 1, as before.]

(a) To what states does A go if the symbols of the following strings are input to A in sequence, starting from s0?

(i) 01, (ii) 0011, (iii) 0101100, (iv) 10101

(i) s2, (ii) s0, (iii) s1, (iv) s2

(b) Which of the strings in part (a) send A to an accepting state?

01, 10101

(c) What is the language accepted by A?

L(A) = All strings ending in 01.

(d) Is there a regular expression that defines the same language?

(0 ∨ 1)∗01
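
The answers above can be reproduced by simulating the automaton. In this sketch the next-state function is stored as a Python dictionary copied from the transition table given earlier; the names N, ACCEPTING and final_state are illustrative choices. The last check compares acceptance with "ends in 01" on all binary strings up to length 10, which supports the answer to (c) without proving it.

```python
from itertools import product

# Next-state function N(state, symbol), copied from the transition table.
N = {("s0", "0"): "s1", ("s0", "1"): "s0",
     ("s1", "0"): "s1", ("s1", "1"): "s2",
     ("s2", "0"): "s1", ("s2", "1"): "s0"}
ACCEPTING = {"s2"}

def final_state(w, start="s0"):
    """State reached after reading w symbol by symbol, starting from start."""
    state = start
    for symbol in w:
        state = N[(state, symbol)]
    return state

for w in ["01", "0011", "0101100", "10101"]:
    print(w, final_state(w), final_state(w) in ACCEPTING)

# Brute-force check of (c) on all strings of length 0..10.
consistent = all((final_state(w) in ACCEPTING) == w.endswith("01")
                 for n in range(11)
                 for w in map("".join, product("01", repeat=n)))
print(consistent)
```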

Eventual-State Function

• Suppose we input some string

w = s1s2 ... sn

into a finite-state automaton, and not just a single symbol.

• What will be the state that the automaton will enter eventually?

• To answer that question, we need to introduce the eventual-state function.

Definition. Let A be a finite-state automaton with the set of states S, input symbols I, and next-state function

N : S × I → S

Let I∗ be the set of all strings over I, and define the eventual-state function

N∗ : S × I∗ → S

so that, if w is a string from I∗,

N∗(s, w) = the state to which A goes if the symbols from w are input into A in sequence, starting with A being in the state s.

Example. Consider, again, the automaton

[Transition diagram: states s0, s1, s2, with arrows labelled 0 and 1, as before.]

What is N∗(s1, 01100)?

Solution:

s1 --0--> s1 --1--> s2 --1--> s0 --0--> s1 --0--> s1

so:

N∗(s1, 01100) = s1
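
The eventual-state function is just the next-state function applied repeatedly. A minimal sketch, assuming the transition table of this automaton is stored as a nested dictionary (the name eventual_state and the returned trace are illustrative additions):

```python
# N[state][symbol] is the next state, copied from the transition table.
N = {"s0": {"0": "s1", "1": "s0"},
     "s1": {"0": "s1", "1": "s2"},
     "s2": {"0": "s1", "1": "s0"}}

def eventual_state(N, s, w):
    """Return N*(s, w) together with the list of states visited on the way."""
    trace = [s]
    for symbol in w:
        trace.append(N[trace[-1]][symbol])
    return trace[-1], trace

state, trace = eventual_state(N, "s1", "01100")
print(state)
print(" -> ".join(trace))
```

For the empty string the loop does nothing, so N∗(s, ε) = s.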

• Suppose A is a finite-state automaton with input symbols I and next-state function N, let I∗ be the set of all strings over I, and let w be a string in I∗. Then

w is accepted by A ⇔ N∗(s0, w) is an accepting state of A

• The language accepted by A is

L(A) = {w ∈ I∗ | N∗(s0, w) is an accepting state of A}

Constructing a Finite-State Automaton

Example

(a) Construct a finite-state automaton A which accepts the set of all strings over {0, 1} such that the number of 1’s in the string is divisible by 3.

(b) Find a regular expression that defines this set of strings.

Solution: Suppose the initial state is s0. We want A to keep track of how many 1’s have been input up to that point, counted modulo 3, so we need at least two more states: s1, s2. A will be in s1 if the number of 1’s scanned so far leaves remainder 1 on division by 3, and in s2 if it leaves remainder 2. Since the empty string contains a number of 1’s (zero) that is divisible by 3, we want to make s0 an accepting state.

So, we want A to behave as follows:

s0 --1--> s1 --1--> s2 --1--> s0

If, at some point, A encounters a 0 in the string, this has no effect on the number of 1’s, so A doesn’t need to change its current state; i.e.

s0 --0--> s0, s1 --0--> s1, s2 --0--> s2

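(a) [Transition diagram: s0 --1--> s1 --1--> s2 --1--> s0, with a loop labelled 0 at each of s0, s1 and s2. The initial state s0 is the only accepting state.]

(b) The regular expression for this language is:

0∗ ∨ (0∗10∗10∗10∗)∗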
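
Both parts of the solution can be compared by brute force. In the sketch below the state of the automaton is stored as the integer 0, 1 or 2 (standing for s0, s1, s2), and the regular expression is written in Python's re syntax; the names accepts and mismatches are illustrative. The comparison covers all binary strings up to length 9 only, so it is a check and not a proof.

```python
import re
from itertools import product

def accepts(w):
    """Simulate the automaton: the state is the number of 1's read so far, mod 3."""
    state = 0                      # s0 is the initial state
    for symbol in w:
        if symbol == "1":
            state = (state + 1) % 3
        # a 0 leaves the state unchanged
    return state == 0              # s0 is the only accepting state

rx = re.compile(r"0*|(0*10*10*10*)*")

mismatches = [w for n in range(10)
              for w in map("".join, product("01", repeat=n))
              if accepts(w) != (rx.fullmatch(w) is not None)]
print(mismatches)
```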

Example

(a) Construct a finite-state automaton A to accept the set of all strings over {0, 1} which contain exactly one symbol 1.

(b) Find a regular expression for this language.

Solution: We start with two states:
s0: the initial state of A;
s1: the state which A enters when the input string contains exactly one 1 (so, s1 is an accepting state).

We can start in this way:

s0 --0--> s0, s0 --1--> s1

If A is in state s1 and reads another 0, it can remain in that state, since the number of 0’s is irrelevant.

If A is in state s1 and reads another 1, we should not put

s1 --1--> s0

because a string with two 1’s must never be accepted later, yet from s0 it is possible to get back to the accepting state s1.

Instead, once we have at least two 1’s, we want A to enter a “dead-end” state s2 from which it cannot get to either s0 or s1.

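(a) [Transition diagram: s0 --1--> s1 --1--> s2, with a loop labelled 0 at s0, a loop labelled 0 at s1, and a loop labelled 0,1 at s2. The initial state is s0 and the only accepting state is s1.]

(b) The corresponding regular expression is:

0∗10∗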
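
The role of the dead-end state is easy to see in a simulation. This sketch encodes the three-state automaton described above as a nested dictionary (the names N, accepts and agree are illustrative) and compares it with the expression 0∗10∗ on all binary strings up to length 9.

```python
import re
from itertools import product

# s2 is the dead-end state: once entered, it is never left.
N = {"s0": {"0": "s0", "1": "s1"},
     "s1": {"0": "s1", "1": "s2"},
     "s2": {"0": "s2", "1": "s2"}}

def accepts(w):
    """True if the automaton ends in the accepting state s1 after reading w."""
    state = "s0"
    for symbol in w:
        state = N[state][symbol]
    return state == "s1"

rx = re.compile(r"0*10*")
agree = all(accepts(w) == (rx.fullmatch(w) is not None) == (w.count("1") == 1)
            for n in range(10)
            for w in map("".join, product("01", repeat=n)))
print(agree)
```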

Finite-State Automata and Regular Languages

• We have seen, in the previous two examples, that, given a finite-state automaton, it may be possible to find a regular expression that defines the language accepted by the automaton.

• This is true for any finite-state automaton.

Theorem (Kleene’s Theorem - Part 1). Given any language that is accepted by a finite-state automaton, there is a regular expression that defines the same language.

Theorem (Kleene’s Theorem - Part 2). Given any language defined by a regular expression, there is a finite-state automaton that accepts the same language.

• Therefore, the class of all languages defined by regular expressions is identical with the class of all languages accepted by finite-state automata.

• Such languages are called regular languages.

• Question: are there languages over finite alphabets that are not regular?

• Answer: Yes. In fact, regular languages are rather special, and most of the languages we work with require more complicated recognizers (machines), e.g. pushdown automata, Turing machines, random access machines, etc.

• Next, we will give an example of a relatively simple language over {a, b} which is not regular. To do that, we will need a theorem from 7.3.

Theorem (Pigeonhole Principle). If we want to put n pigeons in m holes, and n > m, then there must be at least one hole with more than 1 pigeon in it.

• This is an informal version of a combinatorial principle which asserts that, if we want to distribute n objects in m categories, with n > m, then at least one category will contain more than one object.

• We will discuss this principle (theorem) in more detail in 7.3.

Example. Let L be the language consisting of all strings

a^k b^k, k ≥ 1

over Σ = {a, b}; i.e.

L = {s ∈ Σ∗ | s = a^k b^k, k ≥ 1}

Show that L is not a regular language.

Solution: We will prove this by contradiction. Suppose that there exists a finite-state automaton A that accepts the language L.

A has a finite number of states, say

s1, s2, ..., sn

Look at the infinite sequence of strings

a, a^2, a^3, ..., a^k, ...

By the Pigeonhole Principle, there must be a state, say sm, and two different input strings a^p, a^q (with p < q) that cause A to end up in that same state sm, starting from the initial state s1.

We assumed that A accepts L, so, in particular, the string

a^p b^p

is accepted by A.

So, the string b^p will cause A to go from state sm to some accepting state, say sa.

But, in that case, the string

a^q b^p

will also be accepted by A. [a^q first leads A from the initial state s1 to sm and, then, b^p causes the transition to sa.]

However,

a^q b^p

should not be accepted by A, since the number of a’s and the number of b’s are not the same. Contradiction.

Therefore, L is not a regular language; i.e. there is no finite-state automaton that accepts it.
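
The pigeonhole step of this proof can be watched on any concrete automaton. The three-state machine below is a made-up example over {a, b}, not taken from the slides, and the names run and colliding_powers are illustrative. The code finds p < q such that a^p and a^q lead to the same state, and then shows that the machine cannot tell a^p b^p from a^q b^p.

```python
def run(N, state, w):
    """Return the state reached from the given state after reading w."""
    for symbol in w:
        state = N[state][symbol]
    return state

def colliding_powers(N, start):
    """Find p < q such that a^p and a^q lead from start to the same state.

    Among a, a^2, ..., a^(n+1) there are n + 1 strings but only n states,
    so by the Pigeonhole Principle a repeat must occur.
    """
    seen = {}
    state = start
    for k in range(1, len(N) + 2):
        state = N[state]["a"]
        if state in seen:
            return seen[state], k
        seen[state] = k
    raise AssertionError("unreachable when N is defined on every state")

# Hypothetical automaton with states s1, s2, s3 (s1 is the initial state).
N = {"s1": {"a": "s2", "b": "s1"},
     "s2": {"a": "s3", "b": "s1"},
     "s3": {"a": "s2", "b": "s3"}}

p, q = colliding_powers(N, "s1")
print(p, q)
print(run(N, "s1", "a" * p + "b" * p), run(N, "s1", "a" * q + "b" * p))
```

Whatever the accepting states are, the two strings are either both accepted or both rejected, which is exactly the contradiction used in the proof.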

[Figure: p a’s are input, taking A to the state sm; q − p additional a’s are input, returning A to sm; then p b’s are input.]

Pigeonhole Principle

• The Pigeonhole Principle states that if n pigeons fly into m pigeonholes and n > m, then at least one hole must contain two or more pigeons.

• This combinatorial principle is sometimes also called the Dirichlet Principle, since it was first formulated by J.P.G.L. Dirichlet (1805-1859).

Pigeonhole Principle: A function from one finite set to a smaller finite set cannot be one-to-one: there must be at least two elements in the domain that have the same image in the codomain.

• Suppose |X| denotes the number of elements of a set X.

• The Pigeonhole Principle can then be written as: given a function

f : X → Y, |X| > |Y|

we have

∃x1, x2 (x1 ≠ x2 ∧ f(x1) = f(x2)).

Example. In a group of 13 people, must there be two people born in the same month? What about a group of 10 people?

Solution: Let

X = {People in the group}, Y = {Months of the year}

and define the function

f : X → Y, f(x) = the month of x’s birth

Since

|X| = 13 > 12 = |Y|

by the Pigeonhole Principle, there are distinct x1, x2 such that

f(x1) = f(x2)

i.e. x1 and x2 have the same month of birth.

In a group of 10 people, |X| = 10, |Y| = 12, so

|X| ≯ |Y|

and the Pigeonhole Principle does not apply.

Therefore, there is no guarantee that two people would be born in the same month of the year.

Example. Are there at least two people in the Metro Toronto area with the same number of hairs on their head?

Solution: Let T be the set of all people in the Metro Toronto area. Then,

|T| ≈ 3 × 10^6.

Let

Hx = set of hairs on the head of x ∈ T

On the other hand, the maximum number of hairs on a human head is less than 300,000, so, for every x ∈ T,

|Hx| < 300,000

Then,

max_{x∈T} |Hx| < 300,000.

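Let

f : T → {0, 1, ..., 300000}

be defined as

f(x) = |Hx| = the number of hairs on x’s head

Since

|T| > max_{x∈T} |Hx|,

and, in fact, |T| is larger than 300,001, the number of elements of the codomain, the Pigeonhole Principle applies: there must be two members of T who have the same number of hairs.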

Example. A drawer contains 10 red and 10 blue socks. If we pull socks at random, how many must we pick in order to guarantee 2 of the same color?

Solution: Let

X = {Socks pulled}, Y = {Red, Blue}.

Let

f : X → Y, f(x) = color of x

We want the size of X to be such that it guarantees (by the Pigeonhole Principle) that there will be at least two distinct elements of X for which the image is the same.

To accomplish that, we need

|X| > |Y| = 2

so we need to pick at least 3 socks.

Generalized Pigeonhole Principle

Generalized Pigeonhole Principle: For any function f from a finite set X to a finite set Y and any positive integer k, if

|X| > k · |Y|,

there is some y ∈ Y that is the image of at least k + 1 distinct elements of X.

Examples

(a) In a group of 85 people, what is the minimum number who must have the same first initial?

Number of initials = 26, and 85 = 26 · 3 + 7 > 3 · 26, so at least 4.

(b) What is the minimum number of people with the same number of hairs on their head in the Metro Toronto area?

Since

|T| ≈ 3 × 10^6

and

∀x ∈ T, |Hx| < 300,000

we have

|T| > 10 · max_{x∈T} |Hx|,

so, provided |T| exceeds 10 · 300,000 (ten times the number of possible hair counts 0, 1, ..., 299,999), there are at least 11 people with the same number of hairs on their head.
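
The Generalized Pigeonhole Principle amounts to a ceiling division: distributing n objects into m categories forces some category to contain at least ⌈n/m⌉ objects. A small sketch, with the function name guaranteed_shared chosen only for illustration:

```python
def guaranteed_shared(n_items, n_boxes):
    """Largest count that some box is forced to reach: ceil(n_items / n_boxes)."""
    return -(-n_items // n_boxes)   # integer ceiling division

print(guaranteed_shared(85, 26))             # first initials
print(guaranteed_shared(13, 12))             # birth months, 13 people
print(guaranteed_shared(10, 12))             # birth months, 10 people
print(guaranteed_shared(3, 2))               # socks
print(guaranteed_shared(3_000_001, 300_000)) # hair counts, just over 3 million people
```

A result of 1 means that no coincidence is forced, as in the group of 10 people. The last line also shows why the hair example needs |T| to be strictly larger than 10 · 300,000: with exactly 3,000,000 people the guarantee would only be 10.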

Russell’s Paradox

• At the beginning of the twentieth century, Alfred North Whitehead (1861-1947) and Bertrand Russell (1872-1970) attempted to write a book, Principia Mathematica, which would develop all of mathematics, starting from some basic principles of set theory. The scope of this work was incredible: the proof that 1 + 1 = 2 does not appear until page 362.

• The work itself was flawed; there were theorems that they were not able to address using only the basic principles they had assumed.

• The basic inconsistency is what is now known as Russell’s Paradox.

• We have seen before that a set can be an element of another set; e.g.

{{a, b}, c}

• Most often, a set is not an element of itself. However, there is no principle of set theory which would prohibit that.

• Consider the universal set U and suppose that every object is an element of U.

• Then, U must be an element of itself:

U ∈ U

• Let S be the set of all sets that are not elements of themselves:

S = {A | A ⊆ U and A ∉ A}

• Now, S is either an element of itself or not:

S ∈ S or S ∉ S

• If S ∈ S, then, by definition of S,

S ∉ S

which is a contradiction.

• On the other hand, if S ∉ S, then

S ∈ S

(by definition of S). Contradiction.

• We have obtained a paradox.

• The only way out is to conclude that S ⊈ U; i.e. S is not a set in the universe.

The Barber Puzzle

In a certain town there is a male barber who shaves all thosemen, and only those men, who do not shave themselves.Question: Does the barber shave himself?

Answer: Neither yes nor no.If the barber shaves himself, then, by assumption, he does notshave himself, which is a contradiction.If he doesn’t shave himself, he is shaved by the barber(himself). Again, we reached a contradiction.

Conclusion: Such a situation cannot exist in the real world.
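The conclusion can be checked by brute force for a small town. The sketch below is illustrative only: the function name `consistent_towns`, the encoding of "who shaves whom" as a dictionary of booleans, and the choice of man 0 as the barber are all assumptions made for this example.

```python
from itertools import product

def consistent_towns(n_men: int = 2, barber: int = 0) -> int:
    """Count the shaving relations on n_men men that satisfy the barber rule."""
    men = range(n_men)
    pairs = [(a, b) for a in men for b in men]
    count = 0
    # Try every possible assignment of "a shaves b" to True or False.
    for bits in product([False, True], repeat=len(pairs)):
        shaves = dict(zip(pairs, bits))
        # Rule: the barber shaves m exactly when m does not shave himself.
        if all(shaves[(barber, m)] == (not shaves[(m, m)]) for m in men):
            count += 1
    return count

print(consistent_towns(2))
```

The rule already fails for m equal to the barber, so no assignment survives, whatever the other men do.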


Figure: R. Magritte - Ceci n’est pas une pipe


The Halting Problem

• We wish to design an algorithm H which will take as input an algorithm A and a potential input w to that algorithm, and decide whether A eventually halts on the input w.

• The question of whether such an algorithm H exists or not is known as the Halting Problem.

• The existence of such an algorithm would be very useful for checking whether our programs ever enter an infinite loop.


• An algorithm is, ultimately, a sequence of characters, presumably in some appropriate programming language.

• This sequence can be used as an input to another algorithm.

• Indeed, by asking the halting problem we implicitly assumed that the algorithm A could be an input for H.

• This is similar to the situation where a set can be an element of another set.

• In particular, if A expects an algorithm for input, w could be the encoding of another algorithm.


Theorem
There is no algorithm which will take as input an algorithm A and a potential input w to that algorithm, and decide whether A eventually halts on the input w.

Proof. We prove this by contradiction. Suppose that such an algorithm H(A, w) exists. Then H(A, w) will output:
• Halts, if A halts on input w
• Loops, if A does not halt on input w


Now we design a new algorithm D, which uses H as a ‘subroutine’. The input to D is the encoding of an algorithm A. On input A, D runs H on ⟨A, A⟩; i.e. H determines the outcome of running the algorithm A on an encoding of itself. D then reverses the output from H and acts as follows:
• D(A) loops forever, if A halts on input A.
• D(A) halts, if A loops forever on input A.


Now, run D on itself to derive a contradiction:

• D(D) loops forever, if D halts on input D. Contradiction.
• D(D) halts, if D loops forever on input D. Contradiction, again.

We see that our original assumption, that H exists, is wrong. ∎
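The construction of D can be sketched in Python. This is only an illustration under stated assumptions: a candidate decider is modelled as a deterministic Python function `h(a, w)` returning True for "halts", and the names `make_d` and `refute` are invented for this example. No real H exists, so the sketch shows how any proposed h is defeated on the input ⟨D, D⟩.

```python
from typing import Callable

# A candidate decider: h(a, w) returns True to claim "a halts on input w".
Decider = Callable[[Callable, object], bool]

def make_d(h: Decider) -> Callable:
    """Build the algorithm D from the proof, using h as a subroutine."""
    def d(a):
        if h(a, a):
            while True:  # h claims "a halts on a", so D loops forever
                pass
        return "halted"  # h claims "a loops on a", so D halts
    return d

def refute(h: Decider) -> str:
    """Report how h misjudges D on input D, without running an infinite loop."""
    d = make_d(h)
    if h(d, d):
        # h claims D(D) halts; by construction D(D) would enter the loop.
        return "h says D(D) halts, but D(D) loops forever"
    d(d)  # safe to run: a deterministic h answers False again, so d returns
    return "h says D(D) loops, but D(D) halts"

print(refute(lambda a, w: True))
print(refute(lambda a, w: False))
```

Whichever answer h gives on ⟨D, D⟩, the behaviour of D(D) is the opposite, which is exactly the contradiction in the proof.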


Cardinal vs Ordinal Numbers

• Cardinal number: describes the size of the set.

• Ordinal number: refers to the order of an element in a sequence (e.g. the sixth element in the enumeration of some set or sequence).

• In order to define cardinal numbers, we need to review some things about functions.


Definition
A one-to-one correspondence, or a bijection, between two sets A and B is a function

f : A → B

with the following two properties:

1 f is one-to-one; i.e. two distinct elements of A cannot have the same image in B:

∀x, y ∈ A (x ≠ y → f(x) ≠ f(y))

2 f is onto; i.e. every element of B is an image of some element of A:

∀y ∈ B ∃x ∈ A f(x) = y
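For finite sets, the two properties can be tested directly. The following is a minimal sketch; the helper name `is_bijection` and the sample functions are assumptions chosen for illustration, not part of the course material.

```python
def is_bijection(f, domain: set, codomain: set) -> bool:
    """Check the two defining properties of a bijection on finite sets."""
    images = [f(x) for x in domain]
    # One-to-one: distinct elements of the domain have distinct images.
    one_to_one = len(set(images)) == len(images)
    # Onto: the images are exactly the elements of the codomain.
    onto = set(images) == set(codomain)
    return one_to_one and onto

print(is_bijection(lambda x: x + 1, {1, 2, 3}, {2, 3, 4}))
print(is_bijection(lambda x: x * x, {-1, 0, 1}, {0, 1}))
```

The first call succeeds; the second fails because −1 and 1 share the image 1, so the function is not one-to-one.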


Definition

1 We say that two sets A and B have the same cardinality if there exists a bijection f : A → B.

2 If B = {1, 2, . . . , n}, for some fixed n ∈ Z+, and there is a bijection

f : A → B

we say that A is of size n (or of cardinality n) and write this as |A| = n.

3 If |A| = n, for some n ∈ Z+, we say that A is a finite set.


• Clearly, if f : A → B is a bijection, then there is an inverse function (which is also a bijection)

f⁻¹ : B → A

• Also, the composition of two bijections is a bijection; namely, if

f : A → B, g : B → C

are both bijections, so is

g ◦ f : A → C, where (g ◦ f)(x) = g(f(x))

• As a consequence, if A has the same cardinality as B and B has the same cardinality as C, then A and C have the same cardinality.
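For finite sets, the inverse and the composition are easy to build explicitly. In this sketch a bijection is represented as a Python dictionary; that representation and the helper names `inverse` and `compose` are assumptions made for illustration.

```python
def inverse(f: dict) -> dict:
    """Swap keys and values; this is a function only when f is one-to-one."""
    return {image: x for x, image in f.items()}

def compose(g: dict, f: dict) -> dict:
    """Return g ◦ f, i.e. the map x -> g(f(x))."""
    return {x: g[f[x]] for x in f}

f = {1: "a", 2: "b", 3: "c"}        # bijection A -> B
g = {"a": 10, "b": 20, "c": 30}     # bijection B -> C

print(inverse(f))
print(compose(g, f))
```

Composing `f` with `inverse(f)` gives back the identity on A, which is the finite version of the statement that f⁻¹ undoes f.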


Countable Sets

• So far, our main distinction between different sets was by separating them into finite and infinite sets.

• Georg Cantor introduced the notion of cardinal numbers as a means of defining infinite sets.

• Based on this idea, it is possible to measure to what extent a set may be “infinite”; i.e. is there any difference in the way the set of all integers appears to be infinite, as opposed to, say, the set of all rational (or even real) numbers?

• For example, the set Z is discrete, whereas the real line (the usual way we think about R) has no “gaps”.


Definition
We say that a set A is countably infinite if, and only if, it has the same cardinality as the set of all positive integers Z+.

A set is called countable if it is either finite or countably infinite.


Example
(Even Numbers) Consider the set of positive even numbers

E = {2, 4, 6, . . . , 2n, . . .}

Consider the function

f : Z+ → E , f (n) = 2n

It is easy to show that this function is a bijection (exercise). Therefore, the positive even numbers are an example of a countably infinite set.

• This example also shows that a countably infinite set can have a proper subset of the same cardinality, which is not the case for finite sets.
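One possible write-up of the exercise, using only the definition f(n) = 2n and the two properties of a bijection, is sketched below; the letters m, k and e are names introduced here for the argument.

```latex
\begin{align*}
\text{One-to-one: } & f(m) = f(n) \;\Rightarrow\; 2m = 2n \;\Rightarrow\; m = n. \\
\text{Onto: } & \text{every } e \in E \text{ has the form } e = 2k \text{ with } k \in \mathbb{Z}^{+},
  \text{ so } e = f(k).
\end{align*}
```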


Example
(Z - Integers) We will show that the set of all integers is countable. Consider the function f : Z+ → Z defined by

\[
f(n) = \begin{cases} \dfrac{n}{2}, & n \text{ is even} \\[4pt] -\dfrac{n-1}{2}, & n \text{ is odd} \end{cases}
\]

The output of the function looks like this:

n     1  2   3  4   5  . . .
f(n)  0  1  −1  2  −2  . . .

Again, it is relatively easy to show that this function is a bijection, which shows that Z is countable.
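The function and a candidate inverse can be checked numerically on a finite range. This is a sketch, not a proof: the names `to_integer` and `to_positive` are invented for this example, and the inverse formula (2k for k > 0, 1 − 2k otherwise) is derived here rather than taken from the slides.

```python
def to_integer(n: int) -> int:
    """f : Z+ -> Z, sending even n to n/2 and odd n to -(n-1)/2."""
    return n // 2 if n % 2 == 0 else -((n - 1) // 2)

def to_positive(k: int) -> int:
    """Candidate inverse: positive k comes from 2k, the rest from 1 - 2k."""
    return 2 * k if k > 0 else 1 - 2 * k

print([to_integer(n) for n in range(1, 6)])  # [0, 1, -1, 2, -2]
```

Checking that the two functions undo each other on a range of values is consistent with f being a bijection, although only the algebraic argument establishes it for all n.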


Example
Show that the set of all positive rational numbers Q+ is countable.

In order to count the positive rational numbers, we create an infinite table on which to count: the denominator increases as we move to the right, and the numerator increases as we move down (see the figure). This enumerates all possible values of numerator and denominator.


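The output of the function looks like

n     1    2    3    4    5    6    . . .
f(n)  1/1  1/2  2/1  3/1  2/2  1/3  . . .

The table lists the entries in the order in which they are visited. To obtain a bijection, an entry that repeats an earlier value (such as 2/2 = 1/1) must be skipped.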

• This method of showing that Q+ is countable is called Cantor’s diagonalization process.

• Using the same approach we used to show that Z is countable, we can show that the set of all rational numbers is countable; we need to construct a bijection between Q and Q+ (exercise).
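The zigzag walk through the table can be sketched as a Python generator. Two assumptions are made here: the traversal order is inferred from the first six table entries (1/1, 1/2, 2/1, 3/1, 2/2, 1/3), and fractions not in lowest terms are skipped so that each positive rational appears exactly once. The name `positive_rationals` is invented for this example.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def positive_rationals():
    """Yield each positive rational once, walking the table diagonal by diagonal."""
    s = 2  # numerator + denominator is constant along each diagonal
    while True:
        # Alternate direction on successive diagonals to get the zigzag.
        numerators = range(1, s) if s % 2 == 1 else range(s - 1, 0, -1)
        for num in numerators:
            den = s - num
            if gcd(num, den) == 1:  # skip repeats such as 2/2
                yield Fraction(num, den)
        s += 1

print([str(q) for q in islice(positive_rationals(), 5)])
```

Because every positive rational has exactly one representation in lowest terms, and that representation sits on some diagonal, the generator reaches each element of Q+ after finitely many steps.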


Theorem
(Cantor) The set of all real numbers between 0 and 1 is uncountable; i.e. there is no bijection

f : Z+ → (0, 1)

Proof. We will prove this by contradiction, by assuming that such a bijection exists

f : Z+ → (0, 1)

This means that we can index the numbers between 0 and 1 into a sequence on a large sheet of paper. An example of what such a list may look like is on the next slide.


n    f(n)
1    0.1986759143598725309861532. . .
2    0.6569872345023458796234509. . .
3    0.2938745723450972345234534. . .
4    0.9854918273450912346598764. . .
5    0.1987523444098234734598723. . .
6    0.2341897123487912349876123. . .
⋮    ⋮


Construct a new decimal number

d = 0.d1d2d3 . . . dn . . .

between 0 and 1 as follows:

\[
d_n = \begin{cases} 1, & \text{if the } n\text{-th digit of } f(n) \text{ is not } 1 \\ 0, & \text{if the } n\text{-th digit of } f(n) \text{ is } 1 \end{cases}
\]

In this way, we get a number which is between 0 and 1, but it cannot appear in the list above since, for every n ≥ 1, it differs in the n-th decimal place from the n-th number in the list. Contradiction. Therefore, such a bijection (sequence) cannot exist, and (0, 1) is not countable.
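The digit rule can be tried on the first six rows of the sample list. The sketch below works on a finite prefix only, so it illustrates the rule rather than the full proof; the function name `diagonal_number` and the truncation of each row to seven digits are assumptions made for this example.

```python
def diagonal_number(rows: list[str]) -> str:
    """Build d so that its n-th digit differs from the n-th digit of row n."""
    digits = []
    for n, row in enumerate(rows):
        # Rule from the proof: use 0 where the diagonal digit is 1, otherwise 1.
        digits.append("0" if row[n] == "1" else "1")
    return "0." + "".join(digits)

# Digits after the decimal point of f(1), ..., f(6), truncated.
sample = ["1986759", "6569872", "2938745", "9854918", "1987523", "2341897"]
print(diagonal_number(sample))  # 0.011111
```

The diagonal digits of the sample are 1, 5, 3, 4, 5, 9, so d begins 0.011111 and already disagrees with each of the six listed numbers in at least one decimal place.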


Example
Show that the interval (0, 1) and the set of all real numbers R have the same cardinality.

Solution: We will construct a bijection

f : (0, 1) → R

in the following way:

f(x) = tan((x − 1/2)π)

Notice that the function x → (x − 1/2)π transforms the interval (0, 1) into another interval

(−π/2, π/2)

and the tangent function is a bijection between this interval and R.
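The bijection and its inverse can be checked numerically with the standard `math` module. This is a floating-point sketch, so the round trip holds only up to rounding error; the names `to_real` and `to_unit` are invented here, and the inverse formula arctan(y)/π + 1/2 is derived by solving y = tan((x − 1/2)π) for x.

```python
import math

def to_real(x: float) -> float:
    """f : (0, 1) -> R, f(x) = tan((x - 1/2) * pi)."""
    return math.tan((x - 0.5) * math.pi)

def to_unit(y: float) -> float:
    """Inverse map R -> (0, 1), obtained by solving y = f(x) for x."""
    return math.atan(y) / math.pi + 0.5

for x in (0.1, 0.25, 0.5, 0.9):
    print(x, to_real(x), to_unit(to_real(x)))
```

Values of x close to 0 map to large negative numbers and values close to 1 map to large positive numbers, which matches the tangent function sweeping all of R on (−π/2, π/2).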


Theorem
Any subset of a countable set is countable.

The proof of this theorem is relatively easy; see p.451.

• One consequence of this theorem is the following:

Any set with an uncountable subset must be uncountable.


Example
Let T be the set of all functions from the positive integers to the set

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

Show that T is uncountable.

Solution: We are going to construct a one-to-one function

F : (0, 1) → T

Write each number in (0, 1) in its decimal expansion

0.a1a2a3 . . . an . . . ∈ (0, 1)

choosing, where two expansions exist, the one that does not end in an infinite string of 9s, so that the expansion is unique.

Define

F(0.a1a2a3 . . . an . . .) = function sending each n ≥ 1 into the n-th digit an

Distinct numbers have distinct expansions, so F is one-to-one. (F is not onto: for example, the constant function 0 is not in its image. This does not affect the argument.) Hence F is a bijection between (0, 1) and a subset of T, the set of all functions from the positive integers into the set of decimal digits.

Therefore, T has an uncountable subset, since (0, 1) is uncountable, and so T is uncountable.

• The sets which have the same cardinality as (0, 1) (or R) are said to be of cardinality continuum.

• The question as to whether there are infinite cardinalities lying between countable sets and sets of cardinality continuum is very difficult. It is known as the continuum hypothesis, and it is known to be independent of the basic principles of set theory.