CS 154, Lecture 3: DFA NFA, Regular Expressions ... Reverse Theorem for Regular Languages Theorem:...

download CS 154, Lecture 3: DFA NFA, Regular Expressions ... Reverse Theorem for Regular Languages Theorem: The

If you can't read please download the document

  • date post

    25-Jan-2021
  • Category

    Documents

  • view

    0
  • download

    0

Embed Size (px)

Transcript of CS 154, Lecture 3: DFA NFA, Regular Expressions ... Reverse Theorem for Regular Languages Theorem:...

  • CS 154, Lecture 3: DFANFA, Regular Expressions

  • Homework 1 is coming out …

  • Deterministic Finite Automata

    Computation with finite memory

  • Non-Deterministic Finite Automata

    Computation with finite memory

    and “verified guessing”

  • From NFAs to DFAs

    Input: NFA N = (Q, Σ, , Q0, F)

    Output: DFA M = (Q, Σ, , q0, F)

    To learn if an NFA accepts, we could do the computation

    in parallel, maintaining the set of all possible states that

    can be reached

    Set Q = 2Q Idea:

  • Q = 2Q

     : Q  Σ → Q

    (R,) =  ε( (r,) ) rR

    q0 = ε(Q0)

    F = { R Q | f  R for some f  F }

    *

    For S Q, the ε-closure of S is

    ε(S) = {q | q reachable from some s  S by taking 0 or more ε transitions}

    *

    From NFAs to DFAs: Subset Construction

    Input: NFA N = (Q, Σ, , Q0, F)

    Output: DFA M = (Q, Σ, , q0, F)

  • Example of the ε-closure

    ε({q0}) = {q0 , q1, q2}

    ε({q1}) = {q1, q2}

    ε({q2}) = {q2}

  • a

    a , b

    a

    2 3

    1

    b

    ε

    Given: NFA N = ( {1,2,3}, {a,b},  , {1}, {1} )

    Construct: Equivalent DFA M = (2{1,2,3}, {a,b}, , {1,3}, …)

    ε({1}) = {1,3}

    N M

    {1,3}

    a

    b

    {2} a {2,3}

    b

    {3}

    a

    {1,2,3}

    ab

    b

    a 

    a,b

    {1}, {1,2} ?

    b

  • Reverse Theorem for Regular Languages

    Theorem: The reverse of a regular language is also a regular language

    If a language can be recognized by a DFA that reads strings from right to left, then there is an “normal” DFA that accepts the same language

    Proof?

    Given a DFA for a language L, “reverse” its arrows and flip its start and accept states, getting an NFA. Convert that NFA back to a DFA!

  • Using NFAs in place of DFAs can make proofs about regular

    languages much easier!

    Remember this on homework/exams!

  • Union Theorem using NFAs?

  • Regular Languages are closed under concatenation

    Given DFAs M1 and M2, connect the accept states of M1 to the

    start states of M2

    Concatenation: A  B = { vw | v  A and w  B }

    0 0,1

    00

    1

    1

    10 0,1

    00

    1

    1

    1

    ε

    ε

    L(N) = L(M1)  L(M2)

    0 0,1

    00

    1

    1

    1

  • Regular Languages are closed under star

    Let M be a DFA, and let L = L(M)

    We can construct an NFA N that recognizes L*

    A* = { s1 … sk | k ≥ 0 and each si  A }

    0 0,1

    00

    1

    1

    1

    ε

    ε

    ε

  • Formally, the construction is:

    Input: DFA M = (Q, Σ, , q1, F)

    Output: NFA N = (Q, Σ, , q0, F)

    Q = Q  {q0}

    F = F  {q0}

    (q,a) =

    (q,a)

    {q1}

    {q1}

    if q Q and a ≠ ε

    if q  F and a = ε

    if q = q0 and a = ε

    if q = q0 and a ≠ ε

     else

  • How would we prove that this NFA construction works?

    Want to show: L(N) = L*

    1. L(N)  L*

    2. L(N)  L*

    Regular Languages are Closed Under Star

  • 1. L(N)  L*

    Assume w = w1…wk is in L* where w1,…,wk L

    We show N accepts w by induction on k

    Base Cases:

    k = 0 k = 1

    Inductive Step:

    Assume N accepts all strings v = v1…vk L*, vi  L Let u = u1…ukuk+1  L* , uj L

    Since N accepts u1…uk (by induction) and M accepts uk+1, N also accepts u (by construction)

    (w = ε) (w L)

  • Assume w is accepted by N; we want to show w  L* If w = ε, then w  L*

    I.H. N accepts u and takes at most k ε-transitions  u  L* uL*

    vL

    2. L(N)  L*

    By I.H.

    w = uv L*

    uL(N), so

    Let w be accepted by N with k+1 ε-transitions.

    Write w as w=uv, where v is the substring read after the last ε- transition

    accept

    ε

    ε

    u

    v

  • Closure Properties for Regular Languages Union: A  B = { w | w  A or w  B }

    Intersection: A  B = { w | w  A and w  B }

    Complement:A = { w  Σ* | w  A }

    Reverse: AR = { w1 …wk | wk …w1  A, wi  Σ}

    Concatenation: A  B = { vw | v  A and w  B }

    Star: A* = { s1 … sk | k ≥ 0 and each si  A }

    Theorem: if A and B are regular then so are: A  B, A  B, A, AR , A  B, and A*

  • Regular Expressions Computation as simple, logical description

    A totally different way of thinking about computation: What is the complexity of describing the strings in the language?

  • Inductive Definition of Regexp

    For all  ∊ Σ,  is a regexp

    ε is a regexp

     is a regexp

    If R1 and R2 are both regexps, then

    (R1R2), (R1 + R2), and (R1)* are regexps

    Let Σ be an alphabet. We define the regular expressions over Σ inductively:

  • Precedence Order:

    * then  then +

    · R2R1*(Example: R1*R2 + R3 = ( ) ) + R3

  • Definition: Regexps Represent Languages

    The regexp  ∊ Σ represents the language {}

    The regexp ε represents {ε}

    The regexp  represents 

    If R1 and R2 are regular expressions representing L1 and L2 then:

    (R1R2) represents L1  L2

    (R1 + R2) represents L1  L2

    (R1)* represents L1*

  • Regexps Represent Languages

    For every regexp R, define L(R) to be the language that R represents

    A string w ∊ Σ* is accepted by R (or, w matches R) if w ∊ L(R)

    Examples: 0, 010, and 01010 match (01)*0

    110101110100100 matches (0+1)*0

  • { w | w has exactly a single 1 }

    0*10*

    Assume Σ = {0,1}

    { w | w contains 001 }

    (0+1)*001(0+1)*

  • What language does the regexp * represent?

    {ε}

    Assume Σ = {0,1}

  • { w | w has length ≥ 3 and its 3rd symbol is 0 }

    (0+1)(0+1)0(0+1)*

    Assume Σ = {0,1}

  • { w | every odd position in w is a 1 }

    (1(0 + 1))*(1 + ε)

    Assume Σ = {0,1}

  • 1 + 0 + ε + 0(0+1)*0 + 1(0+1)*1

    { w | w has equal number of occurrences of 01 and 10}

    = { w | w = 1, w = 0, or w = ε, or w starts with a 0 and ends with a 0, or w starts with a 1 and ends with a 1 }

    Claim: A string w has equal occurrences of 01 and 10 w starts and ends with the same bit.

    Assume Σ = {0,1}

  • L can be represented by some regexp  L is regular

    DFAs ≡NFAs ≡ Regular Expressions!

  • L can be represented by some regexp  L is regular

  • Given any regexp R, we will construct an NFA N s.t. N accepts exactly the strings accepted by R

    Proof by induction on the length of the regexp R

    L can be represented by some regexp  L is regular

    Base Cases (R has length 1):

    R = 

    R = ε

    R = 

  • Induction Step: Suppose every regexp of length < k represents some regular language.

    Three possibilities for R:

    R = R1 + R2

    R = R1 R2

    R = (R1)*

    Consider a regexp R of length k > 1

  • Induction Step: Suppose every regexp of length < k represents some regular language.

    Three possibilities for R:

    R = R1 + R2

    R = R1 R2

    R = (R1)*

    Consider a regexp R of length k > 1

    By induction, R1 and R2 represent some regular languages, L1 and L2

    But L(R) = L(R1 + R2) = L1  L2 so L(R) is regular, by the union theorem!

  • Induction Step: Suppose every regexp of length < k represents some regular language.

    Three possibilities for R:

    R = R1 + R2

    R = R1 R2

    R = (R1)*

    Consider a regexp R of length k > 1

    By induction, R1 and R2 represent

    some regular languages, L1 and L2

    But L(R) = L(R1·R2) = L1· L2 so L(R) is regular by the concatenation theorem

  • Induction Step: Suppose every regexp of length < k represents some regular language.

    Three possibilities for R:

    R = R1 + R2

    R = R1 R2

    R = (R1)*

    Consider a regexp R of length k > 1

    By induction, R1 and R2 represent

    some regular languages, L1 and L2

    But L(R) = L(R1*) = L1* so L(R) is regular, by the star theorem

  • Induction Step: Suppose every regexp of length < k represents some regular language.

    Three possibilities for R:

    R = R1 + R2

    R = R1 R2

    R = (R1)*

    Consider a regexp R of length k > 1

    By induction, R1 and R2 represent

    some regular languages, L1 and L2

    But L(R) = L(R1*) = L1* so L(R) i