Δ - Clearing Restarting Automata and CFL

57
Δ-Clearing Restarting Automata and CFL Peter Černo and František Mráz

description

Δ - Clearing Restarting Automata and CFL. Peter Černo and František Mráz. Introduction. Δ -Clearing Restarting Automata : Restricted model of Restarting Automata . In one step (based on a limited context ): Delete a substring , R eplace a substring by Δ . The main result : - PowerPoint PPT Presentation

Transcript of Δ - Clearing Restarting Automata and CFL

- Clearing Restarting Automata and CFL

-Clearing Restarting Automata and CFLPeter erno and Frantiek MrzIntroduction-Clearing Restarting Automata:Restricted model of Restarting Automata.In one step (based on a limited context):Delete a substring,Replace a substring by .The main result: -clearing restarting automata recognize all context-free languages.

Example

Restarting AutomataRestarting Automata:Tool for modeling some techniques for natural language processing.Analysis by Reduction: Method for checking [non-]correctness of a sentence.Iterative application of simplifications.Until the input cannot be simplified anymore.Organization-clearing restarting automata.*-clearing restarting automata.*-clearing restarting automata recognize CFL.Special coding.Reduction: *- to -clearing restarting automata.

-Clearing Restarting AutomataLet k be a positive integer.k--clearing restarting automaton (k-cl-RA ) Is a couple M = (, I) : input alphabet, , $, , working alphabet, = {}I finite set of instructions (x, z t, y) :x {, }.*, |x|k(left context)y *.{, $}, |y|k(right context)z +, t {, }. and $ sentinels.

Rewritinguzv M utv iff = (x, z t, y) I :x is a suffix of .u and y is a prefix of v.$ .

L(M) = {w * | w *M }.LC (M) = {w * | w *M }.

Empty WordNote: For every cl-RA M: *M hence L(M).Whenever we say that cl-RA M recognizes a language L, we always mean that L(M) = L {}.

Example 1L1 = {anbn | n > 0} {} :1-cl-RA M = ({a, b}, I) ,Instructions I are:R1 = (a, ab , b) ,R2 = (, ab , $) .

Note:We did not use .

Example 2L2 = {ancbn | n > 0} {} : 1-cl-RA M = ({a, b, c}, I) ,Instructions I are:R1 = (a, c , b) ,R2 = (a, ab , b) , R3 = (, ab , $) .

Note: We must use .

Organization-clearing restarting automata.*-clearing restarting automata.*-clearing restarting automata recognize CFL.Special coding.Reduction: *- to -clearing restarting automata.

*-Clearing Restarting Automata*-clearing restarting automata Similar to -clearing restarting automata.We allow instructions (x, z k, y), where k |z|.

Organization-clearing restarting automata.*-clearing restarting automata.*-clearing restarting automata recognize CFL.Special coding.Reduction: *- to -clearing restarting automata.

*cl-RA and CFLTheorem: For each context-free language L there exists a 1-*cl-RA M recognizing L.Idea.M works in a bottom-up manner.If the input is small, M may clear the whole input.If the input is long, M may replace some subword by the code of nonterminal.*cl-RA and CFLIf the input is small:

*cl-RA and CFLIf the input is long:

*cl-RA and CFLProof. L context-free language over .G = (VN , VT , S, P) context-free grammar : G is in: Chomsky normal form,G generates: L(G) = L {} ,Nonterminals: VN = {N1 , N2 , , Nm } ,Terminals: VT = , = {} ,Start: S = N1 ,

, $, VN VT .*cl-RA and CFLProof (Continued).Auxiliary G = (VN , VT , S, P) obtained from G :By adding symbol to VT , VT = VT {} = ,By adding productions Ni aib to P ,P = P { Ni aib | i = 1, , m; a, b VT } .

Our goal: 1-*cl-RA M : LC (M) = L(G) {} .Then: L(M) = LC (M) * = L(G) {}.*cl-RA and CFLProof (Continued).aib code for Ni ( a, b VT ).a, b VT separators (between codes).

*cl-RA and CFLProof (Continued).Idea:If z can be derived from Ni (in G ), Then M can replace z by a code for Ni .M replaces only the inner part of z by i .M leaves first and last letter of z as separator.*cl-RA and CFLIdea:

*cl-RA and CFLProof (Continued).Two problems:|Inner part| |i |.Finite many instructions.Proposition: For any w L(G) :If |w| > c = |VN | + 2 , then w = x z y :c < |z| 2c , S * x Ni y * x z y for some Ni .*cl-RA and CFLProof (Continued).Construction: I1 set of all instructions:

(, w , $)

Where w L(G) and |w| c .This resolves the small inputs.*cl-RA and CFLProof (Continued).Construction: For every Ni * z1 zs , where c < s 2c :

(z1 , z2 zs-1 i , zs )

I2 set of all such instructions.I1 , I2 finite sets of instructions. M = (, I1 I2 ) required automaton. Q.E.D. GeneralizationWe can choose:

GeneralizationObservation:For every t 1 m1 , k : z contains v t empty space.

Trivial ReductionWhy empty space?Trivial simulation:

Trivial ReductionWhy empty space?Partial -instructions:1 = (x, z1 , z2 z3 zs y) ,2 = (x , z2 , z3 zs y) ,r = (x r-1, zr zs , y) .

Problem:The equivalence is not guaranteed.Avoiding ConflictsHow to avoid conflicts? We can encode some extra information into z.

Organization-clearing restarting automata.*-clearing restarting automata.*-clearing restarting automata recognize CFL.Special coding.Reduction: *- to -clearing restarting automata.

CodingWe require:Coding by means of -clearing restarting automata.Ability to recover the original word at any time.Alice and BobConsider the following game:

Alice and Bob

Alice and Bob

Alice and Bob

ProtocolWe assume: fixed alphabet ,fixed length of all initial messages given to Alice.

Is there any such protocol?Yes. Basic intuition: Alice adds information by choosing a position of .Alice loses information by deleting one letter.CodingIdea: Length of | messages | = || .For = {a, b, c} :

Example 3-Letter AlphabetPerfect matching for = {a, b, c} :

aaa aabaa bacaa caaab abbab bbcab caaac aabac bacac acaba babba bbcba cbabb abbbb bbcbb cbabc abbbc bccbc ccaca aabca cacca ccacb acbcb bcccb cbacc acbcc bcccc cc38Coding Encoding ExampleConsider word w over = {a, b, c} :w = accbabccacaabbcabcbcacaa .Let us factorize w into groups of || = 3 letters:w = acc | bab | cca | caa | bbc | abc | bca | caa .We want to encode information i into w :i = 11001000 .39Coding Encoding Examplei = 1 1 0 0 1 0 0 0 :w = acc | bab | cca | caa | bbc | abc | bca | caa ,w' = ac | bb | cca | caa | bc | abc | bca | caa .aaa aabaa bacaa caaab abbab bbcab caaac aabac bacac acaba babba bbcba cbabb abbbb bbcbb cbabc abbbc bccbc ccaca aabca cacca ccacb acbcb bcccb cbacc acbcc bcccc cc40Coding Major DrawbackThe major drawback:If w does not start with left sentinel then we cannot factorize w into groups of || letters.

Word w can be factorized as:acc | bab | cca | caa | bbc | abc | bca | caa ,ac | cba | bcc | aca | abb | cab | cbc | aca | a ,a | ccb | abc | cac | aab | bca | bcb | cac | aa .Coding Fixed PointsSimple trick:To factorize w we need some fixed point.The left sentinel is one example.In the first phase we distribute fixed points throughout the whole input tape.Coding Fixed PointsSuppose that we have:w = abaccaccbabccacaabbcabcbcacaa .The symbol in w is our fixed point:w = abacc | acc | bab | cca | caa | bbc | abc | bca | caa .Now we can place the next fixed point:w = abacc | acc | bab | cca | caa | bbc | abc | bca | ca .Organization-clearing restarting automata.*-clearing restarting automata.*-clearing restarting automata recognize CFL.Special coding.Reduction: *- to -clearing restarting automata.

Algorithmic ViewpointImagine a -clearing restarting automaton as a nondeterministic machine N, which repeatedly executes the following two steps:

Choosing Step: N chooses a subword w of the input u$ , |w| K . (K is a fixed constant)Solving Step: N runs a computation on w, which either rejects, or replaces a subword of w by or .

N accepts u iff it can clear the whole word u.

Algorithmic ViewpointIllustration:

Algorithmic ViewpointTo define any -clearing restarting automaton: Define the solving algorithm S, called the solver. Show the existence of a suitable limit K.We put no resource limits on the solving algorithm.Idea of the AlgorithmConsider *cl-RA M whose [generalized] construction was based on a context-free grammar G in ChNF.We want the solving algorithm S imitating M.We do not preserve the original representation of M.Idea of the AlgorithmFirst, we distribute fixed points throughout the whole input tape in approximately equal distances:

Idea of the AlgorithmSuppose that w already contains fixed points.We [internally] translate symbols occurring in w. We find an instruction = (x, z r, y) of the original *cl-RA M applicable inside w.If there is no such instruction, reject.Idea of the AlgorithmSuppose = (x, z r, y) is applicable inside w.

Idea of the Algorithm = (x, z r, y) is applicable inside wOur goal: replace z by r.To avoid conflicts we encode information into z.z contains a long enough empty space v * .v may be interrupted by fixed points.Space between fixed points is long enough.We choose one such space working space.We reserve this space reference point .Idea of the AlgorithmIllustration: = (x, z r, y)

Idea of the AlgorithmWorking space: = (x, z r, y)

Idea of the AlgorithmCleaning: = (x, z r, y)

Conclusion

Referenceserno, P., Mrz, F.: Clearing restarting automata. Fundamenta Informaticae 104(1), 17 54 (2010)erno, P., Mrz, F.: Delta-clearing restarting automata and CFL. Tech. rep., Charles University, Faculty of Mathematics and Physics, Prague (2011)http://popelka.ms.mff.cuni.cz/cerno/structure_and_recognition/