Automata and Rational Numbersshallit/Talks/allou2.pdf · 2013. 5. 3. · Automatic sets over N A...
Transcript of Automata and Rational Numbersshallit/Talks/allou2.pdf · 2013. 5. 3. · Automatic sets over N A...
Automata and Rational Numbers
Jeffrey ShallitSchool of Computer Science
University of WaterlooWaterloo, Ontario N2L 3G1
http://www.cs.uwaterloo.ca/~shallit
1 / 40
Representations of integers
◮ Σk = {0, 1, . . . , k − 1}◮ Numbers are represented in base k using digits in Σk
◮ So numbers are represented by words in Σ∗k
◮ Canonical representation of n denoted (n)k , without leadingzeroes
◮ Set of all canonical representations of integers is
Ck = Σ∗k \ 0Σ∗
k
◮ If w ∈ Σ∗k then [w ]k is the integer represented by w
◮ Sometimes useful to use least-significant-digit-firstrepresentation; sometimes most-significant-digit-first.
2 / 40
Automatic sets over N
◮ A k-automatic set of (non-negative) integers A corresponds toa regular (rational) subset of Σ∗
k \ 0Σ∗k
◮ Example:t = 0110100110010110 · · ·
◮ Let T be the positions of the 1’s in t:
T = {1, 2, 4, 7, 8, 11, 13, 14, . . .},
i.e., those integers with an odd number of 1’s in their base-2representation
◮ T is accepted by a DFA reading numbers expressed in base 2
3 / 40
The Thue-Morse automaton
0 01
1
0 1
4 / 40
Closure properties
The class of k-automatic sets is
◮ Closed under complement
◮ interchange “finality” of states
◮ Closed under union, intersection:
◮ use “direct product” construction
◮ Closed under the sum operationA+ B = {a+ b : a ∈ A, b ∈ B}.
◮ On input n, “guess” a and b, add and check if equal to n
◮ Closed under multiplication by constants
5 / 40
Decidability
Given a k-automatic set A ⊆ N, can we decide properties of A?
When we say “given a k-automatic set A”, we really mean given aDFA M accepting A.
◮ Given A and n, we can decide if n ∈ A
◮ We can easily decide if A = ∅ (use DFS)
◮ We can easily decide if |A| < ∞ (look for useful cycles)
6 / 40
Representing rational numbers
◮ Represent rational number α = p/q by pair of integers (p, q),represented in base k ; pad shorter with leading zeroes
◮ So representations of rationals are over the alphabet Σk × Σk
◮ For example, if w = [3, 0][5, 0][2, 4][6, 1] then[w ]10 = (3526, 41).
◮ Define quok(x) = [π1(x)]k/[π2(x)]k , where πi is theprojection onto the i ’th coordinate
◮ So quok(w) = 3526/41 = 86.
◮ Canonical representations lack leading [0, 0]’s
◮ Every rational has infinitely many canonical representations,e.g., as (1, 2), (2, 4), (3, 6), . . ., etc.
7 / 40
Automatic sets over Q≥0
◮ quok(L) =⋃
x∈L{quok(x)}◮ A ⊆ Q≥0 is a k-automatic set of rationals if A = quok(L) for
some regular language L ⊆ (Σk × Σk)∗.
8 / 40
Examples
Example 1. Let k = 2, B = {[0, 0], [0, 1], [1, 0], [1, 1]}, andconsider
L1 := B∗{[0, 1], [1, 1]}B∗.
Then L1 consists of all pairs of integers where the secondcomponent has at least one nonzero digit — the point being toavoid division by 0. Then quok(L) = Q≥0, the set of allnon-negative rational numbers.
Example 2. Consider
L2 = {w ∈ (Σ2k)
∗ : π1(w) ∈ 0∗Ck and π2(w) ∈ 0∗1}.
Then quok(L2) = N.
9 / 40
Examples
Example 3. Let k = 3, and consider the language
L3 := [0, 1]{[0, 0], [2, 0]}∗.
Then quok(L3) is the 3-adic Cantor set, the set of all rationalnumbers in the “middle-thirds” Cantor set with denominators apower of 3.
Example 4. Let k = 2, and consider
L4 := [0, 1]{[0, 0], [0, 1]}∗{[1, 0], [1, 1]}.
Then the numerator encodes the integer 1, while the denominatorencodes all positive integers that start with 1. Hence
quok(L4) = {1n
: n ≥ 1}.
10 / 40
Examples
Example 5. Let k = 4, and consider
S := {0, 1, 3, 4, 5, 11, 12, 13, . . .}
of all non-negative integers that can be represented using only thedigits 0, 1,−1 in base 4. Consider the language
L5 = {(p, q)4 : p, q ∈ S}.
It is not hard to see that L5 is (Q, 4)-automatic.The main result in Loxton & van der Poorten [1987] can berephrased as follows: quo4(L5) contains every odd integer.In fact, an integer t is in quo4(L5) if and only if the exponent ofthe largest power of 2 dividing t is even.
11 / 40
Examples
Example 6. Consider
L6 = {w ∈ (Σ2k)
∗ : π2(w) ∈ 0∗1+0∗}.
An easy exercise using the Fermat-Euler theorem shows that thatquok(L6) = Q≥0.
12 / 40
Examples
Example 7. For a word x and letter a let |x |a denote the numberof occurrences of a in x . Consider the regular language
L7 = {w ∈ (Σ22) : |π1(w)|1 is even and |π2(w)|1 is odd}.
Then it follows from a result of Schmid [1984] that
quo2(L7) = Q≥0 − {2n : n ∈ Z}.
13 / 40
Intersection
Note that quok(L1 ∪ L2) = quok(L1) ∪ quok(L2) but theanalogous identity involving intersection need not hold.
Example 8. Consider L1 = {[2, 1]} and L2 = {[4, 2]}. Thenquo10(L1 ∩ L2) = ∅ 6= {2} = quo10(L1) ∩ quo10(L2).
14 / 40
Not the same as previous models of automatic rationals
– e.g., Boigelot-Brusten-Bruyere or Adamczewski-Bell
Example 9. Define
S = {(km − 1)/(kn − 1) : 1 ≤ m < n};
this is easily seen to be a k-automatic set of rationals.
However, the set of its base-k expansions is of the form
⋃
0<m<n<∞
0.(0n−m (k − 1)m)ω,
where by xω we mean the infinite word xxx · · · .
A simple argument using the pumping lemma shows that no Buchiautomaton can accept this language.
15 / 40
Don’t demand lowest terms
Why not consider only representations p/q with gcd(p, q) = 1?
Two problems:
- the language of all such representations
{w ∈ (Σ2k)
∗ : gcd(π1(w), π2(w)) = 1}
is not even context-free
- given a DFA accepting (S)k , where S ⊆ N× N, can we decide ifgcd(p, q) = 1 for all (p, q) ∈ S? Not known to be decidable!
16 / 40
Don’t demand unique or finite number of representations
No known way to represent Q≥0 as a regular language with everyrational represented only a finite number of times.
17 / 40
Closure properties
Automatic sets of rationals are closed under
◮ union;
◮ S → S + α for α ∈ Q≥0
◮ S → S .− α for α ∈ Q≥0;
◮ S → α .− S for α ∈ Q≥0;
◮ S → αS for α ∈ Q≥0;
◮ S → { 1x: x ∈ S \ {0}}.
18 / 40
Basic decidability properties
Given a DFA M accepting a language L representing a set ofrationals S , can decide
◮ if S = ∅◮ given α ∈ Q≥0, whether there exists x ∈ S with x = α (resp.,
x < α, x ≤ α, x > α, x ≥ α, x 6= α, etc.)
◮ if |S | = ∞◮ given a finite set F ⊆ Q≥0, if F ⊆ S or if S ⊆ F
◮ given α ∈ Q≥0, if α is an accumulation point of S
19 / 40
Deciding if S ⊆ N
Important concept: let S ⊆ N \ {0}. We say S is k-finite if thereexist
◮ an integer n ≥ 0,
◮ n positive integers g1, g2, . . . , gn and
◮ n ultimately periodic sets W1,W2, . . . ,Wn ⊆ N such that
S =⋃
1≤i≤n
{gik j : j ∈ Wi}.
Theorem. Suppose there is a finite set of prime numbers D suchthat each element of S ⊆ N is factorable into a product of powersof elements of D.
Then S is k-automatic if and only if S is k-finite.
20 / 40
Deciding if S ⊆ N
Furthermore, there is an algorithm that, given the DFA M
accepting (S)k , will determine if S is k-finite and if so, willproduce the decomposition
S =⋃
1≤i≤n
{gik j : j ∈ Wi}.
(Use reversed representations; follow path from start state on 0’s;from each such state there can only be finitely many paths to anaccepting state.)
21 / 40
Deciding if S ⊆ N
Suppose M is a DFA with n states accepting L representing a setof rationals S .
Case 1: If α ∈ S and α is “small”, say α ≤ kn, then we canintersect S with [0, kn], remove all representations of integers0, 1, . . . , kn, and see if any word is left. If so, then S is not asubset of N.
22 / 40
Deciding if S ⊆ N
Case 2: If α ∈ S and α is “big”, say α > kn, then the numeratorcan be pumped, but the denominator stays the same. So thedenominator must divide [uvw ]k − [uw ]k . But this isk |w |([uv ]k − [u]k), and |uv | ≤ n, so each denominator must dividea bounded number, times a power of k . So the set of all primefactors of all denominators is finite.
So the projection onto the set of denominators is k-finite.
We can easily remove powers of k . For the remaining finite set ofdenominators we can check if each denominator divides thenumerator.
23 / 40
supA is rational or infinite
Given a DFA M accepting L ⊆ (Σk × Σk)∗ representing a set of
rationals A ⊆ Q≥0, what can we say about supA?
Theorem. supA is rational or infinite.
Proof ideas: quok(uviw) forms a monotonic sequence. Defining
γ(u, v) :=[π1(uv)]k − [π1(u)]k[π2(uv)]k − [π2(u)]k
one of the following three cases must hold:
(i) quok(uw) < quok(uvw) < quok(uv2w) < · · · < U ;
(ii) quok(uw) = quok(uvw) = quok(uv2w) = · · · = U ;
(iii) quok(uw) > quok(uvw) > quok(uv2w) > · · · > U .
Furthermore, limi→∞ quok(uviw) = U.
24 / 40
supA is rational or infinite
It follows that if supA is finite, and the DFA M has n states, thensupA = maxT , where
T = T1 ∪ T2
and
T1 = {quok(x) : |x | < n and x ∈ L};T2 = {γ(u, v) : |uv | ≤ n, |v | ≥ 1, δ(q0, u) = δ(q0, uv),
and there exists w such that uvw ∈ L}.
25 / 40
supA is computable
We know that supA lies in the finite computable set T .
For each of t ∈ T , we can check to see if t ≥ supA by checking ifA ∩ (t,∞) is empty.
Then supA is the least such t.
26 / 40
Applications 1: Critical exponent
We say a word w is a p/q power if we can write
w =
n︷ ︸︸ ︷xx · · · x x ′
where
◮ n = ⌊p/q⌋;◮ x ′ is prefix of x ; and
◮ p/q = n + |x ′|/|x |.
For example, the French word entente is a 73 -power.
The exponent of a finite word w is defined to be the largestrational number α such that w is an α power.
Given an infinite word w, its critical exponent is defined to be thesupremum, over all finite factors x of w, of exp(x).
27 / 40
Critical exponents
Examples of critical exponent:
- the Thue-Morse word
t = 0110100110010110 · · ·
has critical exponent 2 (Thue)
- the Fibonacci word
f = 01001010 · · ·
has critical exponent (5 +√5)/2 (Mignosi & Pirillo)
- Previously known to be computable for fixed points of uniformmorphisms (Krieger)
28 / 40
Applications 1: Computing critical exponent
Theorem. If w is a k-automatic sequence, then its criticalexponent is rational or infinite. Furthermore, it is computable fromthe DFAO M generating w .
Proof sketch. Given M, we can transform it into anotherautomaton M ′ accepting
{(m, n) : there exists i ≥ 0 such that w[i ..i+m−1] has period n}.
We then apply our algorithm for computing sup(quok(L)) toL(M ′).
29 / 40
The Leech word
Leech [1957] showed that the fixed point l of the morphism
0 → 0121021201210
1 → 1202102012021
2 → 2010210120102
is squarefree.
We used our method to compute the critical exponent of thisword. It is 15/8.
Furthermore, if x is a 15/8-power occurring in l, then |x | = 15 · 13ifor some i ≥ 0.
30 / 40
Applications 2: Diophantine exponent
The Diophantine exponent of an infinite word w is defined to bethe supremum of the real numbers β for which there existarbitrarily long prefixes of w that can be expressed as uv e for finitewords u, v and rationals e such that |uv e |/|uv | ≥ β.
(concept due to Adamczewski, Bugeaud)
Theorem. The Diophantine exponent of a k-automatic sequenceis either rational or infinite. Furthermore, it is computable.
31 / 40
Applications 3: Linear recurrence
A sequence a is said to be recurrent if every factor that occurs in a
occurs infinitely often.
It is linearly recurrent if consecutive occurrences of factors oflength ℓ appear at distance bounded by Cℓ, for some constant Cindependent of ℓ.
For example, the Thue-Morse word
t = 0110100110010110 · · ·is linearly recurrent, but the Barbier infinite word
b = 110111001011101111000 · · ·is recurrent but not linearly recurrent.
Theorem. It is decidable if a given k-automatic sequence islinearly recurrent. If so, the optimal constant of linear recurrence iscomputable.
32 / 40
Recurrence constant for the Rudin-Shapiro sequence
The Rudin-Shapiro sequence
0001001000011101 · · ·
counts the parity of the number of 11’s occurring in the binaryrepresentation of n.
We used our method to compute the recurrence constant for thissequence; it is 41.
33 / 40
How about lim sup?
I don’t know how to compute lim sup(quok(L)) in general, whereL ⊆ (Σk × Σk)
∗ is a regular language.
However, the largest special point can be computed.
A real number β is a special point if there exists an infinitesequence (xj)j≥1 of distinct words of L such thatlimj→∞ quok(xj) = β.
Thus, a special point is either an accumulation point of quok(L),or a rational number with infinitely many distinct representations.
Luckily, special points suffice for most applications involvinglim sup.
34 / 40
Application 4: Goldstein quotients
Let x be an infinite word and ρx(n) its subword complexity, thenumber of distinct factors of length n.
I. Goldstein [2011] showed that in some cases the quantities
lim supn≥1
ρx(n)
nand lim inf
n≥1
ρx(n)
n
are computable.
We can show that these are computable when x is a k-automaticsequence.
We can construct a DFA accepting
L := {(ρx(n), n)k : n ≥ 1}
and then find the largest (resp., smallest) special point. Thiscorresponds to the lim sup (resp., lim inf).
35 / 40
Open Problems 1
Are the following problems decidable?
Given a DFA M accepting a language L representing a set of pairsof integers S ⊆ N× N:
◮ does there exist a pair (p, q) ∈ S with p | q?◮ does there exist infinitely many pairs (p, q) ∈ S with p | q?
36 / 40
Open Problems 2
Is there a regular language L ⊆ (Σk × Σk)∗ representing Q≥0 such
that each rational has only finitely many representations?
37 / 40
Open Problem 3: Cobham’s theorem
Prove or disprove: let S ⊆ Q≥0 be a set of non-negative rationalnumbers. Then S is simultaneously k-automatic and ℓ-automatic,for multiplicatively independent integers k , ℓ ≥ 2, if and only iffthere exists a semilinear set A ⊆ N2 such thatS = {p/q : [p, q] ∈ A}.
38 / 40
For Further Reading
1. Luke Schaeffer and Jeffrey Shallit, The critical exponent iscomputable for automatic sequences, Int. J. Found. Comput.
Sci. 23 (2012), 1611–1626.
2. Eric Rowland and Jeffrey Shallit, k-automatic sets of rationalnumbers, Proc. LATA 2012, Lect. Notes in ComputerScience, Vol. 7183, pp. 490–501.
39 / 40
Thanks, JPA, for 28 years of collaboration!
40 / 40