Finite Automata and Regular Expressions Haniel hbarbosa/teaching/cs4330/...آ  Regular Expressions...

download Finite Automata and Regular Expressions Haniel hbarbosa/teaching/cs4330/...آ  Regular Expressions A

of 25

  • date post

    23-Aug-2020
  • Category

    Documents

  • view

    0
  • download

    0

Embed Size (px)

Transcript of Finite Automata and Regular Expressions Haniel hbarbosa/teaching/cs4330/...آ  Regular Expressions...

  • CS:4330 Theory of Computation Spring 2018

    Regular Languages Finite Automata and Regular Expressions

    Haniel Barbosa

  • Readings for this lecture

    Chapter 1 of [Sipser 1996], 3rd edition. Sections 1.1 and 1.3.

  • Abstraction of Problems

    B Data: abstracted as a word in a given alphabet

    I Σ: alphabet, a finite, non-empty set of symbols I Σ∗: all the words of finite length built up using Σ

    B Conditions: abstracted as a set of words, called language

    I Any subset L⊆ Σ∗

    B Unknown: Implicitly a Boolean variable: true if a word is in the language, false otherwise

    I Given w ∈ Σ∗ and L⊆ Σ∗, does w ∈ L?

    1 / 14

  • Finite Automata

    B The simplest computational model is called a finite state machine or a finite automaton

    B Representations:

    I Graphical I Tabular I Mathematical

    2 / 14

  • Computation of a Finite Automaton

    B The automaton receives the input symbols one by one from left to right, changing the “active” state

    B After reading each symbol, the “active state” moves from one state to another along the transition that has that symbol as its label

    B When the last symbol of the input is read the automaton produces the output: accept if the “active state” is in an accept state, or reject otherwise

    3 / 14

  • Applications

    B Finite automata are popular in parser construction of compilers

    B Finite automata and their probabilistic counterparts, Markov chains, are useful tools for pattern recognition Example: speech processing and optical character recognition

    B Markov chains have been used to model and predict price changes in financial applications

    4 / 14

  • Formal Definition of Finite Automata

    A finite automaton is a 5-tuple (Q, Σ, δ, q0, F) in which: B Q is a finite set called the states

    B Σ is a finite set called the alphabet B δ : Q×Σ→ Q is the transition function B q0 ∈ Q is the start state, also called initial stat B F ⊆ Q is the set of accepted states, also called the final states

    5 / 14

  • Language Recognized by an Automaton

    B If L is the set of all strings that a finite automaton M accepts, we say that L is the language of the machine M and write L (M) = L

    B An automaton may accept several strings, but it always recognizes only one language

    B If a machine accepts no strings, it still recognizes one language, namely the empty language ∅

    6 / 14

  • Formal Definition of Acceptance

    Let M = (Q, Σ, δ, q0, F) be a finite automaton and w = a0 . . .an be a string over Σ

    Then M accepts w if a sequence of states r0, . . . , rn exists in Q such that:

    1. r0 = q0 2. δ(ri, ai+1) = ri+1 for i = 0,1, . . . ,n−1 3. rn ∈ F

    Condition (1) says where the machine starts

    Condition (2) says that the machine goes from state to state according to its transition function δ

    Condition (3) says when the machine accepts its input: if it ends up in an accept state

    7 / 14

  • Regular Languages

    We say that a finite automaton M recognizes the language L if L = {w |M accepts w}

    Definition: A language is called regular language if there exists a finite automaton that recognizes it.

    8 / 14

  • Designing Finite Automata

    B Whether it be of automaton or artwork, design is a creative process. Consequently it cannot be reduced to a simple recipe or formula.

    B The approach:

    I Identify the finite pieces of information you need to solve the problem. These are the states

    I Identify the condition (alphabet) to change from one state to another I Identify the start and final states I Add missing transitions

    9 / 14

  • Operations on Regular Languages

    Let A and B be languages. We define regular operations union, concatenation, and start as follows

    B Union: A∪B = {x | x ∈ A∨ x ∈ B} B Concatenation: A◦B = {xy | x ∈ A∧ y ∈ B} B Star: A∗ = {x1 . . .xk | k ≥ 0∧ xi ∈ A, 0≤ i≤ k}

    Note:

    1. � ∈ A∗, no matter what A is 2. A+ denotes A◦A∗

    10 / 14

  • Regular Expressions

    A regular expression (RE in short) is a string of symbols that describes a regular language.

    B Three base cases:

    I For any a ∈ Σ, a is a regular expression denoting the language {a} I � is a regular expression denoting the language {�} I ∅ is a regular expression denoting the language ∅;

    B Three recursive cases: If r1 and r2 are regular expressions denoting languages L1 and L2, respectively, then I Union: r1∪ r2 denotes L1∪L2 I Concatenation: r1r2 denotes L1 ◦L2 I Star: r∗1 denotes L

    ∗ 1

    11 / 14

  • Some useful notation

    Let r be a regular expression:

    B The string r+ represents rr∗, and it also holds that r+∪{�}= r∗

    B The string rk represents rr . . .r︸ ︷︷ ︸ k times

    B Recall that the symbol Σ represents the alphabet {a1, . . . , ak}

    B As with automata, the language represented by r is denoted L (r)

    12 / 14

  • Precedence Rules

    B The star (∗) operation has the highest precedence

    B The concatenation (◦) operation is second on precedence order

    B The union (∪) or (+) operation is the least preferred

    B Parenthesis can be omitted using these rules

    13 / 14

  • Examples

    0∗10∗ =

    {w | w contains a single 1} Σ∗1Σ∗ = {w | w has at least a single 1} Σ∗(101)Σ∗ = {w | w contains 101 as a substring} 1∗(01+)∗ = {w | every 0 in w is followed by at least a single 1} (ΣΣ)∗ = {w | w is of even length}

    0Σ∗0∪1Σ∗1∪0∪1 − all words starting and ending with the same letter (0∪ �)1∗ = 01∗∪1∗ − all strings of forms 1,11,1 . . .1 and 0,1,11,1 . . .1 R∅ − ∅ ∅∗ − {�}

    14 / 14

  • Examples

    0∗10∗ = {w | w contains a single 1} Σ∗1Σ∗ =

    {w | w has at least a single 1} Σ∗(101)Σ∗ = {w | w contains 101 as a substring} 1∗(01+)∗ = {w | every 0 in w is followed by at least a single 1} (ΣΣ)∗ = {w | w is of even length}

    0Σ∗0∪1Σ∗1∪0∪1 − all words starting and ending with the same letter (0∪ �)1∗ = 01∗∪1∗ − all strings of forms 1,11,1 . . .1 and 0,1,11,1 . . .1 R∅ − ∅ ∅∗ − {�}

    14 / 14

  • Examples

    0∗10∗ = {w | w contains a single 1} Σ∗1Σ∗ = {w | w has at least a single 1} Σ∗(101)Σ∗ =

    {w | w contains 101 as a substring} 1∗(01+)∗ = {w | every 0 in w is followed by at least a single 1} (ΣΣ)∗ = {w | w is of even length}

    0Σ∗0∪1Σ∗1∪0∪1 − all words starting and ending with the same letter (0∪ �)1∗ = 01∗∪1∗ − all strings of forms 1,11,1 . . .1 and 0,1,11,1 . . .1 R∅ − ∅ ∅∗ − {�}

    14 / 14

  • Examples

    0∗10∗ = {w | w contains a single 1} Σ∗1Σ∗ = {w | w has at least a single 1} Σ∗(101)Σ∗ = {w | w contains 101 as a substring} 1∗(01+)∗ =

    {w | every 0 in w is followed by at least a single 1} (ΣΣ)∗ = {w | w is of even length}

    0Σ∗0∪1Σ∗1∪0∪1 − all words starting and ending with the same letter (0∪ �)1∗ = 01∗∪1∗ − all strings of forms 1,11,1 . . .1 and 0,1,11,1 . . .1 R∅ − ∅ ∅∗ − {�}

    14 / 14

  • Examples

    0∗10∗ = {w | w contains a single 1} Σ∗1Σ∗ = {w | w has at least a single 1} Σ∗(101)Σ∗ = {w | w contains 101 as a substring} 1∗(01+)∗ = {w | every 0 in w is followed by at least a single 1} (ΣΣ)∗ =

    {w | w is of even length}

    0Σ∗0∪1Σ∗1∪0∪1 − all words starting and ending with the same letter (0∪ �)1∗ = 01∗∪1∗ − all strings of forms 1,11,1 . . .1 and 0,1,11,1 . . .1 R∅ − ∅ ∅∗ − {�}

    14 / 14

  • Examples

    0∗10∗ = {w | w contains a single 1} Σ∗1Σ∗ = {w | w has at least a single 1} Σ∗(101)Σ∗ = {w | w contains 101 as a substring} 1∗(01+)∗ = {w | every 0 in w is followed by at least a single 1} (ΣΣ)∗ = {w | w is of even length}

    0Σ∗0∪1Σ∗1∪0∪1 −

    all words starting and ending with the same letter (0∪ �)1∗ = 01∗∪1∗ − all strings of forms 1,11,1 . . .1 and 0,1,11,1 . . .1 R∅ − ∅ ∅∗ − {�}

    14 / 14

  • Examples

    0∗10∗ = {w | w contains a single 1} Σ∗1Σ∗ = {w | w has at least a single 1} Σ∗(101)Σ∗ = {w | w contains 101 as a substring} 1∗(01+)∗ = {w | every 0 in w is followed by at least a single 1} (ΣΣ)∗ = {w | w is of even length}

    0Σ∗0∪1Σ∗1∪0∪1 − all words starting and ending with the same letter (0∪ �)1∗ = 01∗∪1∗ −

    all strings of forms 1,11,1 . . .1 and 0,1,11,1 . . .1 R∅ − ∅ ∅∗ − {�}

    14 / 14

  • Examples

    0∗10∗ = {w | w contains a single 1} Σ∗1Σ∗ = {w | w has at least a single 1} Σ∗(101)Σ∗ = {w | w contains 101 as a substring} 1∗(01+)∗ = {w | every 0 in w is followed by at least a single 1} (ΣΣ)∗ = {w | w is of even length}

    0Σ∗0∪1Σ∗1∪0∪1 − all words starting and ending with the same letter (0∪ �)1∗ = 01∗∪1∗ − all strings of forms 1,11,1 . . .1 and 0,1,11,1 . . .1 R∅ −

    ∅ ∅∗ − {�}

    14 / 14