60-354, Theory of Computation Fall 2013asishm.myweb.cs.uwindsor.ca/cs354/F13/ch4.pdf · 60-354,...

60-354, Theory of Computation Fall 2013

Asish Mukhopadhyay

School of Computer Science

University of Windsor

Pushdown Automata (PDA)

• PDA = ε-NFA + stack

• Acceptance

– the input is consumed and the ε-NFA is in a final state, or

– the input is consumed and the stack is empty

• Nondeterministic in nature

• More powerful than its deterministic counterpart

An example

• Design a PDA to accept the language

– L = {ww^R | w in {0,1}*} by empty stack (w^R denotes the reverse of w)

• Main idea underlying the construction

– Keep stacking symbols from the input string

– Guess that the middle has been reached

– Now start emptying the stack, as long as the stack top and the input symbol match

Formal Description of a PDA

• A PDA is a 6-tuple: (Q, ∑, Λ, δ, q0, Z0)

– Q is the set of states of the ε-NFA

– ∑ is the alphabet of the input string

– Λ is the stack alphabet

– δ is the transition function: δ : Q × (∑ ∪ {ε}) × Λ → 2^(Q × Λ*)

– q0 is the start state of the ε-NFA

– Z0 is the bottom of stack marker

PDA for the example language

• Q = {q0, q1}

• ∑ = {0, 1}

• Λ = {0, 1 ,Z0}

PDA for the example language

• δ():

– δ(q0, a, b) = {(q0, ab), (q1, ab)}, a ∈ {0,1}, b ∈ {0,1,Z0}

– δ(q0, ε, Z0) = {(q1, Z0)} // accept empty string

– δ(q1, a, b) = {}, a, b ∈ {0,1}, a ≠ b

– δ(q1, a, a) = {(q1, ε)}, a ∈ {0,1}

– δ(q1, ε, Z0) = {(q1, ε)} // empty stack

Instantaneous Description (ID)

• An ID is a 3-tuple (q, α, β) describing the state of a PDA

– q is the current state of the NFA

– α is the remaining input string

– β is the contents of the stack, with the top of the stack at the left

Transition

• (q, α, β) |- (q’, α’, β’)

– Useful for simulating moves of a PDA on an input string

• Example:

(q0, 1001, Z0) |- (q0, 001, 1Z0) |- (q1, 01, 01Z0) |- (q1, 1, 1Z0) |- (q1, ε, Z0) |- (q1, ε, ε)

Language accepted by a PDA

• L = {w | (q0, w, Z0) |-* (q, ε, ε) , where q is some state in Q}
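The moves of the example PDA can be traced mechanically. The sketch below is one possible simulation, not part of the course material: it encodes δ from the earlier slide, writes Z0 as the single character "Z", keeps the stack as a string with its top at the left, and searches breadth-first over IDs. The function names are illustrative assumptions.

```python
from collections import deque

Z0 = "Z"  # bottom-of-stack marker (Z0 in the slides), one character here

def delta(state, symbol, top):
    """Transition function of the example PDA; symbol "" stands for ε."""
    moves = set()
    if state == "q0" and symbol in ("0", "1"):
        # Push the input symbol, then keep stacking or guess the middle.
        moves |= {("q0", symbol + top), ("q1", symbol + top)}
    if state == "q0" and symbol == "" and top == Z0:
        moves.add(("q1", Z0))      # lets the empty string be accepted
    if state == "q1" and symbol != "" and symbol == top:
        moves.add(("q1", ""))      # input matches stack top: pop
    if state == "q1" and symbol == "" and top == Z0:
        moves.add(("q1", ""))      # empty the stack
    return moves

def accepts_by_empty_stack(w):
    """Breadth-first search over IDs (state, remaining input, stack)."""
    start = ("q0", w, Z0)
    seen, queue = {start}, deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if rest == "" and stack == "":
            return True
        if stack == "":
            continue               # no move is possible on an empty stack
        top, below = stack[0], stack[1:]
        successors = [(q, rest, push + below)
                      for q, push in delta(state, "", top)]
        if rest:
            successors += [(q, rest[1:], push + below)
                           for q, push in delta(state, rest[0], top)]
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

For instance, `accepts_by_empty_stack("1001")` follows the sequence of IDs shown in the Transition slide, while an odd-length input such as "101" has no accepting sequence.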

Context-free grammars

• An example:

S -> 0S0

S -> 1S1

S -> ε

Productions generate the language L = {ww^R | w in {0,1}*}

Context-free grammars

• S is a variable, 0 and 1 are terminal symbols

• A string in L is derived by starting with the symbol S and making a sequence of substitutions, each replacing a variable by the right-hand side of one of its productions

Formal definition

• A context-free grammar is described by a 4-tuple: (V, T, P, S)

– V is the set of variables of the grammar

– T is the set of terminals

– P is the set of productions

– S is a special symbol in V, called the start symbol

– Productions are of the form A -> α, where α is a string in (V ∪ T)* and A is a variable in V

Derivations

• Define a relation ⇒ on (V ∪ T)* thus

– Let α and β be arbitrary strings in (V ∪ T)* and A -> γ be a production in P

– Then α A β ⇒ α γ β

– Closure of ⇒ is denoted by ⇒*

Example of a derivation

• S ⇒ 1S1 ⇒ 10S01 ⇒ 1001
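The one-step derivation relation is easy to check by machine. The following is a minimal sketch, assuming the grammar S -> 0S0 | 1S1 | ε from the earlier example, sentential forms stored as Python strings, and "" standing for ε. The names `one_step` and `is_derivation` are illustrative.

```python
PRODUCTIONS = {"S": ["0S0", "1S1", ""]}  # "" stands for ε

def one_step(form):
    """All sentential forms reachable from `form` by one production."""
    results = set()
    for i, symbol in enumerate(form):
        # Only variables have entries in PRODUCTIONS.
        for body in PRODUCTIONS.get(symbol, []):
            results.add(form[:i] + body + form[i + 1:])
    return results

def is_derivation(forms):
    """Check that each form follows from the previous one in one step."""
    return all(b in one_step(a) for a, b in zip(forms, forms[1:]))
```

With this, `is_derivation(["S", "1S1", "10S01", "1001"])` holds, whereas jumping directly from "S" to "1001" is not a single step.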

Language generated

• Let G be a context-free grammar

• L(G) = {w ∈ T* | S ⇒* w}

Design a context-free grammar

• L = {w | w ∈ {a,b}* and w is not of the form zz}

• Claim: the following grammar generates L

S -> AB | BA | C

A -> aAb | bAa | aAa | bAb | a

B -> aBb | bBa | aBa | bBb | b

C -> aCb | bCa | aCa | bCb | a | b

A derivation in G

• S ⇒ AB ⇒ aAbB ⇒ aaAabB ⇒ aaAabbBb ⇒ aaaabbBb ⇒ aaaabbbb
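The claim can be sanity-checked by brute force on short strings. The sketch below is an illustration rather than a proof: it enumerates every terminal string of bounded length derivable from S using leftmost rewriting, and compares the result with the set of strings over {a,b} that are not of the form zz. The empty string counts as zz with z = ε. All helper names are assumptions.

```python
from itertools import product

PRODUCTIONS = {
    "S": ["AB", "BA", "C"],
    "A": ["aAb", "bAa", "aAa", "bAb", "a"],
    "B": ["aBb", "bBa", "aBa", "bBb", "b"],
    "C": ["aCb", "bCa", "aCa", "bCb", "a", "b"],
}

def generated(max_len):
    """Terminal strings of length <= max_len derivable from S."""
    found, seen, stack = set(), {"S"}, ["S"]
    while stack:
        form = stack.pop()
        pos = next((i for i, x in enumerate(form) if x in PRODUCTIONS), None)
        if pos is None:
            found.add(form)        # no variable left: a terminal string
            continue
        for body in PRODUCTIONS[form[pos]]:
            new = form[:pos] + body + form[pos + 1:]
            # No production here shrinks a form, so long forms are dead ends.
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                stack.append(new)
    return found

def not_of_form_zz(max_len):
    """All strings over {a,b} of length <= max_len that are not zz."""
    words = ("".join(t) for n in range(max_len + 1)
             for t in product("ab", repeat=n))
    return {w for w in words
            if len(w) % 2 == 1 or w[:len(w) // 2] != w[len(w) // 2:]}
```

Comparing `generated(6)` with `not_of_form_zz(6)` checks the claim for all strings of length at most 6.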

Another problem

• Design a context-free grammar for the language over {0,1} that consists of the set S of all strings with twice as many 0’s as 1’s

Claim

• The following grammar generates exactly the strings with this property

010|100|001|100|001|010|| SSSSSSSSS

Claim (2)

• L(G) is contained in S

– In any derivation every application of a production other than S -> ε introduces two 0’s and a single 1.

– Since the property is vacuously true for an empty string, the derived string retains this property whenever the production S -> ε is used

– A formal inductive argument on the number of steps in a derivation can be easily given

Claim (3)

• S is contained in L(G)

– Any of the sequences 001, 010, 100 can be treated as a balanced pair of parentheses

– A string with the above property contains two adjacent 0’s (the substring 00) when its length is more than 3

Claim (3)

• Completing an inductive argument

– Assume inductively that every sequence of length 3n (n >1) corresponds to a balanced sequence of parentheses

– Consider a sequence of length 3n + 3

– We can remove a sequence 001 or 100 from this sequence

– The residual sequence corresponds to a balanced sequence of parentheses and into this we can reinsert 001 or 100, each of which corresponds to a balanced parentheses pair

Example 20

• Design a cfg that generates the language L = {0^i 1^j | 2i = 3j + 1, j = 1, 3, 5, …}

• Set j = 2k + 1, k = 0, 1, …

• Then 2i = 6k + 4, so i = 3k + 2

• The strings are of the form 0^(3k+2) 1^(2k+1), which can be written as 0^2 (0^3)^k (1^2)^k 1

Example 20

• Grammar:

– S -> 00B1 ; B -> 000B11|ε
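A quick check that the grammar matches the arithmetic: the sketch below applies B -> 000B11 exactly k times and then B -> ε, and tests the resulting string against the defining condition 2i = 3j + 1 with j odd. The helper names `derive` and `in_language` are illustrative assumptions.

```python
def derive(k):
    """String obtained from S using B -> 000B11 k times, then B -> ε."""
    form = "00B1"                          # S -> 00B1
    for _ in range(k):
        form = form.replace("B", "000B11")
    return form.replace("B", "")

def in_language(w):
    """Membership test for {0^i 1^j | 2i = 3j + 1, j odd}."""
    i = len(w) - len(w.lstrip("0"))        # length of the leading block of 0's
    j = len(w) - i
    return w == "0" * i + "1" * j and j % 2 == 1 and 2 * i == 3 * j + 1
```

For k = 0 this yields 001 (i = 2, j = 1) and for k = 1 it yields 00000111 (i = 5, j = 3), both of which satisfy 2i = 3j + 1.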

Canonical derivations

• In a derivation, productions can be applied in an arbitrary order

• In a leftmost (rightmost) derivation, we always replace the leftmost (rightmost) variable by its body in a production

Parse Tree

• Any derivation can be represented by a parse tree

Ambiguous grammar

• A cfg G is ambiguous

– if there exists more than one parse tree for a string in L(G)

• In terms of canonical derivations:

– some string in L(G) has more than one leftmost derivation (equivalently, more than one rightmost derivation)

Decision algorithm

• Is it decidable if a cfg G is ambiguous ?

• We need considerable infrastructure to answer this question

An ambiguous grammar G

• The productions in G are:

– E -> E + E | E * E | (E) | I

– I -> Ia | Ib | I0 | I1 | a | b | 0 | 1

• In G:

– a + b * a has two leftmost derivations (see courseware)

Disambiguating the grammar G

• The precedence of the operators * and + needs to be defined

• The new productions that take care of this are:

– E -> E + T | T

– T -> T * F | F

– F -> (E) | I

– I -> I0 | I1 | Ia | Ib| a | b | 0 | 1
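The effect of the rewrite can be observed by counting parse trees. The following is a sketch under stated assumptions: grammars are written with space-separated symbols, every character of the input is one token, and the counting routine relies on the grammar having no ε-productions and no cycles of unit productions, which holds for both grammars here. The function names are illustrative.

```python
from functools import lru_cache

def parse_rules(rules):
    """Turn {"E": "E + E | I"} into {"E": [("E", "+", "E"), ("I",)]}."""
    return {head: [tuple(alt.split()) for alt in bodies.split("|")]
            for head, bodies in rules.items()}

def count_parse_trees(grammar, start, tokens):
    """Number of parse trees with root `start` and yield `tokens`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        return sum(ways(body, i, j) for body in grammar[symbol])

    @lru_cache(maxsize=None)
    def ways(body, i, j):
        # Number of ways the symbol sequence `body` derives tokens[i:j].
        if not body:
            return 1 if i == j else 0
        head, rest = body[0], body[1:]
        if head not in grammar:            # terminal symbol
            if i < j and tokens[i] == head:
                return ways(rest, i + 1, j)
            return 0
        # Every remaining symbol needs at least one token.
        return sum(trees(head, i, k) * ways(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    return trees(start, 0, len(tokens))

IDENT = "I a | I b | I 0 | I 1 | a | b | 0 | 1"
AMBIGUOUS = parse_rules({"E": "E + E | E * E | ( E ) | I", "I": IDENT})
UNAMBIGUOUS = parse_rules({"E": "E + T | T", "T": "T * F | F",
                           "F": "( E ) | I", "I": IDENT})
```

On the input a+b*a the first grammar admits two parse trees, one grouping as a + (b * a) and one as (a + b) * a, while the second grammar admits only the tree that gives * the higher precedence.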

Normal forms

• Chomsky Normal Form (CNF, for short)

– All productions are of the form:

• A -> BC or A -> a

– Neither B nor C can be the start symbol, S

CFG to PDA

• Let G = (V, T, P, S) be a cfg

• P = ({q}, T, V ∪ T, δ, q, S) is a PDA that accepts L(G) by empty stack, with δ() defined thus:

– For each variable A in V, δ(q, ε, A) = {(q, β) | A -> β is a production in G}

– For each terminal symbol a in T, δ(q, a, a) = {(q, ε)}
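The construction is short enough to write out. The sketch below builds δ for the one-state PDA and runs it on the grammar S -> 0S0 | 1S1 | ε. It assumes single-character symbols, with "" standing for ε. The search prunes any ID whose stack holds more terminals than there is input left, which suffices for this grammar but is not a general termination guarantee. The function names are illustrative.

```python
from collections import deque

def cfg_to_pda(productions, terminals):
    """δ of the one-state PDA, keyed by (input symbol or "", stack top)."""
    delta = {}
    for head, bodies in productions.items():
        delta[("", head)] = set(bodies)    # δ(q, ε, A) = {(q, β) | A -> β}
    for a in terminals:
        delta[(a, a)] = {""}               # δ(q, a, a) = {(q, ε)}
    return delta

def accepts(delta, start, w):
    """Search IDs (remaining input, stack); the single state is implicit."""
    terminals = {sym for (sym, _top) in delta if sym != ""}
    first = (w, start)
    seen, queue = {first}, deque([first])
    while queue:
        rest, stack = queue.popleft()
        if rest == "" and stack == "":
            return True                    # accepted by empty stack
        if stack == "":
            continue
        top, below = stack[0], stack[1:]
        successors = [(rest, body + below)
                      for body in delta.get(("", top), ())]
        if rest and "" in delta.get((rest[0], top), ()):
            successors.append((rest[1:], below))
        for nxt in successors:
            pending = sum(1 for x in nxt[1] if x in terminals)
            if pending <= len(nxt[0]) and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

PDA = cfg_to_pda({"S": ["0S0", "1S1", ""]}, "01")
```

The stack of this PDA always holds the unread tail of a leftmost sentential form, so `accepts(PDA, "S", "0110")` succeeds exactly because S ⇒* 0110.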

PDA to CFG (1)

• Given PDA, A = (Q, ∑, Λ, δ, q0 , Z0)

• A CFG, G = (V, T, P, S) is constructed thus:

– V = {S} ∪ {[pXq] | p, q ∈ Q, X ∈ Λ}

– The set of productions P includes

• S -> [q0 Z0 p] for every p ∈ Q

– Further, if (r, Y1Y2…Yk) ∈ δ(q, a, X), where a ∈ ∑ or a = ε and k ≥ 0, then P includes, for every choice of states r1, …, rk in Q, the production

• [q X rk] -> a [r Y1 r1] [r1 Y2 r2] … [rk-1 Yk rk]

PDA to CFG (2)

• When k = 0, the production is [qXr] -> a

• See Example 22 in courseware
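The triple construction can be generated mechanically. The sketch below is illustrative only: variables [pXq] are represented as Python tuples (p, X, q), "" stands for ε, and the small one-state PDA used as a demonstration (it accepts 0^n 1^n by empty stack) is an assumption, not Example 22 from the courseware.

```python
from itertools import product

def pda_to_cfg(states, delta, q0, Z0):
    """Return the productions as a set of (head, body) pairs.

    `delta` maps (q, a, X) to a set of (r, pushed) pairs, where a == ""
    means ε and `pushed` is the tuple (Y1, ..., Yk), top of stack first.
    """
    productions = {("S", ((q0, Z0, p),)) for p in states}
    for (q, a, X), moves in delta.items():
        for r, pushed in moves:
            k = len(pushed)
            for rs in product(states, repeat=k):   # all choices of r1..rk
                chain = (r,) + rs
                body = tuple((chain[i], pushed[i], chain[i + 1])
                             for i in range(k))
                head = (q, X, rs[-1] if k else r)  # k = 0 gives [qXr] -> a
                productions.add((head, ((a,) if a else ()) + body))
    return productions

# Hypothetical one-state PDA for {0^n 1^n | n >= 0}, accepting by empty stack.
DEMO_DELTA = {
    ("q", "0", "Z"): {("q", ("X", "Z"))},
    ("q", "1", "X"): {("q", ())},
    ("q", "", "Z"): {("q", ())},
}
DEMO_CFG = pda_to_cfg(["q"], DEMO_DELTA, "q", "Z")
```

With a single state there is only one way to choose r1, …, rk, so the demonstration yields four productions. With |Q| states, each transition that pushes k symbols contributes |Q|^k productions.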

Deterministic PDA (DPDA) (1)

• For each q, a, X,

– δ(q, a, X) is of size at most 1

• When δ(q, a, X) is not empty for some a in ∑,

– δ(q, ε, X) is empty
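Both conditions can be tested directly on a transition table. In the sketch below, δ is a dictionary from (q, a, X) to a set of moves, with "" standing for ε. EXAMPLE encodes the PDA for ww^R given earlier. WCWR is a hypothetical PDA for wcw^R, written here for illustration and not taken from the slides.

```python
def is_deterministic(delta):
    """Check the two DPDA conditions on a transition table."""
    for (q, a, X), moves in delta.items():
        if len(moves) > 1:
            return False       # more than one choice for (q, a, X)
        if a != "" and moves and delta.get((q, "", X)):
            return False       # an input move competes with an ε-move
    return True

STACK = ("0", "1", "Z0")

# The nondeterministic PDA for ww^R from the earlier slide.
EXAMPLE = {("q0", "", "Z0"): {("q1", ("Z0",))},
           ("q1", "", "Z0"): {("q1", ())}}
# Assumed PDA for wcw^R: the marker c replaces the guess of the middle.
WCWR = {("q1", "", "Z0"): {("q1", ())}}
for a in "01":
    for b in STACK:
        EXAMPLE[("q0", a, b)] = {("q0", (a, b)), ("q1", (a, b))}
        WCWR[("q0", a, b)] = {("q0", (a, b))}
    EXAMPLE[("q1", a, a)] = {("q1", ())}
    WCWR[("q1", a, a)] = {("q1", ())}
for b in STACK:
    WCWR[("q0", "c", b)] = {("q1", (b,))}
```

EXAMPLE fails the first condition because δ(q0, a, b) offers two moves, while WCWR passes both conditions.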

Deterministic PDA (DPDA) (2)

• Acceptance by empty stack and final state are not equivalent

• They become equivalent when an additional condition is satisfied

Prefix language and DPDA

• L is a prefix language

– if for every pair of distinct strings x and y in L, neither x nor y is a prefix of the other

• Theorem

– Some DPDA accepts L by empty stack iff L is a prefix language that is accepted by some DPDA by final state
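For a finite sample of strings the prefix condition is simple to test. The sketch below is an illustration, with the function name an assumption. It sorts the sample and compares neighbours only, which is enough because in sorted order every string lying between x and an extension of x also starts with x.

```python
def is_prefix_language(strings):
    """True if no string in the finite sample is a prefix of another."""
    words = sorted(set(strings))
    # A word that is a prefix of some later word is also a prefix of its
    # immediate successor, so checking adjacent pairs suffices.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))
```

A sample from {wcw^R}, such as c, 0c0 and 1c1, passes the test. A sample from {ww^R} containing both 00 and 0000 does not. A finite sample can only refute the prefix property, never establish it for an infinite language.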

DPDA language

• Languages accepted by DPDAs by final state are called DPDA languages

• Example of a DPDA language:

– Lwcwr = {wcw^R | w in {0,1}*}

DPDA and CFLs (1)

• DPDA languages lie strictly between regular languages and context-free languages

• Given a regular language L, we can construct a DPDA that simulates the action of a DFA that accepts L simply by ignoring the stack

• Since the language Lwcwr is not regular, the inclusion is strict (c is some fixed symbol in ∑)

DPDA and CFLs (2)

• DPDA languages are strictly included in the class of context free languages

– Example: Lwwr (note that this is not a prefix language)

DPDA languages and ambiguity

• A language accepted by a DPDA (by final state or empty stack) has an unambiguous grammar

• However, not every language that has an unambiguous grammar is accepted by a DPDA

– Example: Lwwr
