Automatic Subsets of Rational Numbersshallit/Talks/nijm.pdf · 2015. 11. 4. · Automatic sets over...


Automatic Subsets of Rational Numbers

Jeffrey Shallit

School of Computer Science

University of Waterloo

Waterloo, Ontario N2L 3G1

Canada

shallit@cs.uwaterloo.ca

https://www.cs.uwaterloo.ca/~shallit

1 / 40

Representations of integers

◮ $\Sigma_k = \{0, 1, \ldots, k-1\}$

◮ Numbers are represented in base $k$ using digits in $\Sigma_k$

◮ So numbers are represented by words in $\Sigma_k^*$

◮ Canonical representation of $n$ denoted $(n)_k$, without leading zeroes

◮ Set of all canonical representations of integers is

$$C_k = \Sigma_k^* \setminus 0\Sigma_k^*$$

◮ If $w \in \Sigma_k^*$ then $[w]_k$ is the integer represented by $w$

◮ Sometimes useful to use least-significant-digit-first representation; sometimes most-significant-digit-first.
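The two maps $n \mapsto (n)_k$ and $w \mapsto [w]_k$ can be sketched in a few lines of Python. This is only an illustration: the function names `canonical` and `value` are made up here, digits are written as single characters (so it assumes $2 \le k \le 10$), and the most-significant-digit-first convention is used.

```python
def canonical(n: int, k: int) -> str:
    """Return (n)_k, most significant digit first, without leading zeroes."""
    if n < 0 or not 2 <= k <= 10:
        raise ValueError("expects n >= 0 and 2 <= k <= 10")
    digits = []
    while n > 0:
        n, d = divmod(n, k)
        digits.append(str(d))
    # 0 is represented by the empty word, which lies in C_k.
    return "".join(reversed(digits))


def value(w: str, k: int) -> int:
    """Return [w]_k for a word w over Sigma_k; leading zeroes are allowed."""
    total = 0
    for ch in w:
        d = int(ch)
        if d >= k:
            raise ValueError(f"digit {d} is not in Sigma_{k}")
        total = total * k + d
    return total
```

For instance, `canonical(11, 2)` gives `"1011"`, while `value("0001011", 2)` gives 11 again: many words represent the same integer, but only one of them is canonical.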

2 / 40

Automatic sets over N

◮ A k-automatic set of (non-negative) integers A corresponds to a regular (rational) subset of $\Sigma_k^* \setminus 0\Sigma_k^*$

◮ Example: t = 0110100110010110 · · ·

◮ Let T be the positions of the 1’s in t:

T = {1, 2, 4, 7, 8, 11, 13, 14, . . .},

i.e., those integers with an odd number of 1’s in their base-2 representation

◮ T is accepted by a DFA reading numbers expressed in base 2
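A minimal sketch of such a DFA in Python, assuming the usual two-state parity automaton (state = parity of the number of 1's read so far). The names `DELTA`, `START`, `FINAL` and `in_T` are illustrative only.

```python
# Transition table: the state is the parity of the number of 1's read so far.
DELTA = {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0}
START, FINAL = 0, {1}


def in_T(n: int) -> bool:
    """Run the DFA on the base-2 representation of n (MSD first)."""
    state = START
    for bit in format(n, "b"):
        state = DELTA[(state, bit)]
    return state in FINAL


# Members of T below 16, and the first 16 symbols of t.
T_prefix = [n for n in range(16) if in_T(n)]
t_prefix = "".join("1" if in_T(n) else "0" for n in range(16))
```

Here `T_prefix` reproduces the list 1, 2, 4, 7, 8, 11, 13, 14 and `t_prefix` reproduces the displayed prefix of t.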

3 / 40

The Thue-Morse automaton

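[Figure: the two-state Thue-Morse automaton, with states labelled 0 and 1; each state has a self-loop on input 0, and input 1 moves between the two states.]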

4 / 40

Closure properties

The class of k-automatic sets is

◮ Closed under complement

◮ interchange “finality” of states

◮ Closed under union, intersection:

◮ use “direct product” construction

◮ Closed under the sum operation A + B = {a + b : a ∈ A, b ∈ B}.

◮ On input n, “guess” a and b, add and check if equal to n

◮ Closed under multiplication by constants
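The first two closure properties can be sketched concretely. The code below is an illustration under assumptions, not the talk's implementation: DFAs are assumed complete (every state has a transition on every symbol), and the `DFA` class, `complement`, `product` and the two sample automata are hypothetical names chosen here.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    states: set
    start: object
    finals: set
    delta: dict  # (state, symbol) -> state; assumed total

    def accepts(self, word):
        q = self.start
        for a in word:
            q = self.delta[(q, a)]
        return q in self.finals


def complement(m):
    """Interchange the 'finality' of states."""
    return DFA(m.states, m.start, m.states - m.finals, m.delta)


def product(a, b, alphabet, combine):
    """Direct product; `combine` decides which state pairs are final."""
    states = {(p, q) for p in a.states for q in b.states}
    delta = {((p, q), s): (a.delta[(p, s)], b.delta[(q, s)])
             for (p, q) in states for s in alphabet}
    finals = {(p, q) for (p, q) in states
              if combine(p in a.finals, q in b.finals)}
    return DFA(states, (a.start, b.start), finals, delta)


# Sample automata over {0, 1}, reading most significant digit first.
ODD_ONES = DFA({0, 1}, 0, {1},
               {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0})
EVEN = DFA({"e", "o"}, "e", {"e"},
           {("e", "0"): "e", ("e", "1"): "o",
            ("o", "0"): "e", ("o", "1"): "o"})

both = product(ODD_ONES, EVEN, "01", lambda x, y: x and y)
either = product(ODD_ONES, EVEN, "01", lambda x, y: x or y)


def members(m, bound):
    """Integers below `bound` whose base-2 representation is accepted."""
    return [n for n in range(bound) if m.accepts(format(n, "b"))]
```

With these definitions `members(both, 16)` lists the even elements of T below 16. Note that `complement` works on all words; to stay within canonical representations one would additionally intersect with an automaton for $C_k$.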

5 / 40

Decidability

Given a k-automatic set A ⊆ N, can we decide properties of A?

When we say “given a k-automatic set A”, we really mean given a DFA M accepting A.

◮ Given A and n, we can decide if n ∈ A

◮ We can easily decide if A = ∅ (use DFS)

◮ We can easily decide if |A| < ∞ (look for useful cycles)
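The emptiness and finiteness tests can be sketched as graph searches. This is a hedged illustration: the DFA is given as a hypothetical triple `(start, finals, delta)`, and the answers are about the accepted language, so they carry over to the set A only if the DFA accepts canonical representations alone (otherwise leading zeroes could produce infinitely many words for finitely many integers).

```python
def reachable(sources, edges):
    """Depth-first search: all states reachable from `sources`."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        q = stack.pop()
        for r in edges.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def emptiness_and_finiteness(start, finals, delta):
    """Return (is_empty, is_finite) for the language of the DFA."""
    fwd, rev = {}, {}
    for (q, _symbol), r in delta.items():
        fwd.setdefault(q, set()).add(r)
        rev.setdefault(r, set()).add(q)
    # Useful states: reachable from the start AND able to reach a final state.
    useful = reachable([start], fwd) & reachable(finals, rev)
    inner = {q: fwd.get(q, set()) & useful for q in useful}
    # The language is infinite iff some useful state lies on a cycle.
    on_cycle = any(q in reachable(inner[q], inner) for q in useful)
    return (not useful), (not on_cycle)
```

A "useful cycle" is exactly a cycle through useful states: it can be entered from the start state, repeated any number of times, and then left towards acceptance.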

6 / 40

Representing rational numbers

◮ Represent rational number α = p/q by pair of integers (p, q), represented in base k; pad shorter with leading zeroes

◮ So representations of rationals are over the alphabet $\Sigma_k \times \Sigma_k$

◮ For example, if w = [3, 0][5, 0][2, 4][6, 1] then $[w]_{10} = (3526, 41)$.

◮ Define $\mathrm{quo}_k(x) = [\pi_1(x)]_k / [\pi_2(x)]_k$, where $\pi_i$ is the projection onto the i’th coordinate

◮ So $\mathrm{quo}_{10}(w) = 3526/41 = 86$.

◮ Canonical representations lack leading [0, 0]’s

◮ Every rational has infinitely many canonical representations, e.g., as (1, 2), (2, 4), (3, 6), . . .
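As a small sketch, $\mathrm{quo}_k$ can be computed directly from a word over $\Sigma_k \times \Sigma_k$. Here a word is modelled as a Python list of digit pairs; `pair_value` and `quo` are names invented for this illustration.

```python
from fractions import Fraction


def pair_value(word, k):
    """Return ([pi_1(word)]_k, [pi_2(word)]_k) for a list of digit pairs."""
    p = q = 0
    for a, b in word:
        p = p * k + a
        q = q * k + b
    return p, q


def quo(word, k):
    """Return quo_k(word); raises ZeroDivisionError if the denominator is 0."""
    p, q = pair_value(word, k)
    return Fraction(p, q)


w = [(3, 0), (5, 0), (2, 4), (6, 1)]
```

`pair_value(w, 10)` returns `(3526, 41)` and `quo(w, 10)` returns 86. The words `[(1, 2)]` and `[(2, 4)]` are both canonical and have the same quotient 1/2.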

7 / 40

Automatic sets over Q≥0

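◮ $\mathrm{quo}_k(L) = \bigcup_{x \in L} \{\mathrm{quo}_k(x)\}$

◮ $A \subseteq \mathbb{Q}^{\geq 0}$ is a k-automatic set of rationals if $A = \mathrm{quo}_k(L)$ for some regular language $L \subseteq (\Sigma_k \times \Sigma_k)^*$.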

8 / 40

Examples

Example 1. Let k = 2, B = {[0, 0], [0, 1], [1, 0], [1, 1]}, and consider

$$L_1 := B^* \{[0, 1], [1, 1]\} B^*.$$

Then $L_1$ consists of all pairs of integers where the second component has at least one nonzero digit; the point is to avoid division by 0. Then $\mathrm{quo}_k(L_1) = \mathbb{Q}^{\geq 0}$, the set of all non-negative rational numbers.

Example 2. Consider

$$L_2 = \{w \in (\Sigma_k^2)^* : \pi_1(w) \in 0^* C_k \text{ and } \pi_2(w) \in 0^* 1\}.$$

Then $\mathrm{quo}_k(L_2) = \mathbb{N}$.

9 / 40

Examples

Example 3. Let k = 3, and consider the language

$$L_3 := [0, 1]\{[0, 0], [2, 0]\}^*.$$

Then $\mathrm{quo}_k(L_3)$ is the 3-adic Cantor set, the set of all rational numbers in the “middle-thirds” Cantor set with denominators a power of 3.
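To see what the words of $L_3$ evaluate to, one can enumerate the short ones. In the sketch below (the function name `cantor_points` is hypothetical), a word of length n + 1 has denominator $3^n$ and a numerator whose last n base-3 digits are each 0 or 2.

```python
from fractions import Fraction
from itertools import product


def cantor_points(max_tail):
    """quo_3 of all words [0,1] x with x in {[0,0],[2,0]}^*, |x| <= max_tail."""
    points = set()
    for n in range(max_tail + 1):
        for tail in product((0, 2), repeat=n):
            numerator = 0
            for d in tail:
                numerator = 3 * numerator + d
            # The denominator word is 1 followed by n zeroes, i.e. 3**n.
            points.add(Fraction(numerator, 3 ** n))
    return points
```

With tails of length at most 2 this gives 0, 2/9, 2/3 and 8/9, and none of the enumerated values falls strictly inside the removed middle third (1/3, 2/3).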

Example 4. Let k = 2, and consider

$$L_4 := [0, 1]\{[0, 0], [0, 1]\}^*\{[1, 0], [1, 1]\}.$$

Then the numerator encodes the integer 1, while the denominator encodes all positive integers whose representation starts with 1 and has length at least 2, that is, all integers n ≥ 2. Hence

$$\mathrm{quo}_k(L_4) = \{1/n : n \geq 2\}.$$

10 / 40

Examples

Example 5. Let k = 4, and consider the set

S := {0, 1, 3, 4, 5, 11, 12, 13, . . .}

of all non-negative integers that can be represented using only the digits 0, 1, −1 in base 4. Consider the language

$$L_5 = \{(p, q)_4 : p, q \in S\}.$$

It is not hard to see that $L_5$ is (Q, 4)-automatic.

The main result in Loxton & van der Poorten [1987] can be rephrased as follows: $\mathrm{quo}_4(L_5)$ contains every odd integer.

In fact, an integer t is in $\mathrm{quo}_4(L_5)$ if and only if the exponent of the largest power of 2 dividing t is even.

11 / 40

Examples

Example 6. Consider

$$L_6 = \{w \in (\Sigma_k^2)^* : \pi_2(w) \in 0^* 1^+ 0^*\}.$$

An easy exercise using the Fermat-Euler theorem shows that $\mathrm{quo}_k(L_6) = \mathbb{Q}^{\geq 0}$.

12 / 40

Examples

Example 7. For a word x and letter a let $|x|_a$ denote the number of occurrences of a in x. Consider the regular language

$$L_7 = \{w \in (\Sigma_2^2)^* : |\pi_1(w)|_1 \text{ is even and } |\pi_2(w)|_1 \text{ is odd}\}.$$

Then it follows from a result of Schmid [1984] that

$$\mathrm{quo}_2(L_7) = \mathbb{Q}^{\geq 0} - \{2^n : n \in \mathbb{Z}\}.$$

13 / 40

Intersection

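Note that $\mathrm{quo}_k(L_1 \cup L_2) = \mathrm{quo}_k(L_1) \cup \mathrm{quo}_k(L_2)$ but the analogous identity involving intersection need not hold.

Example 8. Consider $L_1 = \{[2, 1]\}$ and $L_2 = \{[4, 2]\}$. Then $\mathrm{quo}_{10}(L_1 \cap L_2) = \emptyset \neq \{2\} = \mathrm{quo}_{10}(L_1) \cap \mathrm{quo}_{10}(L_2)$.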
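Example 8 can be replayed on finite languages. In this sketch a language is a Python set of words, each word a tuple of digit pairs; `quo_set` is a name chosen for the illustration.

```python
from fractions import Fraction


def quo_set(language, k=10):
    """Return quo_k(L) for a finite language L of words over digit pairs."""
    values = set()
    for word in language:
        p = q = 0
        for a, b in word:
            p = p * k + a
            q = q * k + b
        values.add(Fraction(p, q))
    return values


L1 = {((2, 1),)}   # the single word [2, 1], i.e. 2/1
L2 = {((4, 2),)}   # the single word [4, 2], i.e. 4/2

union_ok = quo_set(L1 | L2) == quo_set(L1) | quo_set(L2)
lhs = quo_set(L1 & L2)             # quotients of the common words
rhs = quo_set(L1) & quo_set(L2)    # common quotients
```

Here `union_ok` is true, `lhs` is empty and `rhs` is `{2}`: the two languages share a value but no word.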

14 / 40

Not the same as previous models of automatic rationals

– e.g., Boigelot-Brusten-Bruyere or Adamczewski-Bell

Example 9. Define

$$S = \{(k^m - 1)/(k^n - 1) : 1 \leq m < n\};$$

this is easily seen to be a k-automatic set of rationals.

However, the set of its base-k expansions is of the form

$$\bigcup_{0 < m < n < \infty} 0.(0^{n-m} (k-1)^m)^\omega,$$

where by $x^\omega$ we mean the infinite word xxx · · · .

A simple argument using the pumping lemma shows that no Büchi automaton can accept this language.

15 / 40

Don’t demand lowest terms

Why not consider only representations p/q with gcd(p, q) = 1?

Two problems:

- the language of all such representations

$$\{w \in (\Sigma_k^2)^* : \gcd([\pi_1(w)]_k, [\pi_2(w)]_k) = 1\}$$

is not even context-free

- given a DFA accepting $(S)_k$, where S ⊆ N × N, can we decide if gcd(p, q) = 1 for all (p, q) ∈ S? Not known to be decidable!

16 / 40

Don’t demand unique or finite number of representations

No known way to represent $\mathbb{Q}^{\geq 0}$ as a regular language with every rational represented only a finite number of times.

17 / 40

Closure properties

Automatic sets of rationals are closed under

◮ union;

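◮ $S \to S + \alpha$ for $\alpha \in \mathbb{Q}^{\geq 0}$;

◮ $S \to S \mathbin{\dot{-}} \alpha$ for $\alpha \in \mathbb{Q}^{\geq 0}$;

◮ $S \to \alpha \mathbin{\dot{-}} S$ for $\alpha \in \mathbb{Q}^{\geq 0}$;

◮ $S \to \alpha S$ for $\alpha \in \mathbb{Q}^{\geq 0}$;

◮ $S \to \{1/x : x \in S \setminus \{0\}\}$.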

18 / 40

Basic decidability properties

Given a DFA M accepting a language L representing a set of rationals S, can decide

◮ if S = ∅

◮ given $\alpha \in \mathbb{Q}^{\geq 0}$, whether there exists x ∈ S with x = α (resp., x < α, x ≤ α, x > α, x ≥ α, x ≠ α, etc.)

◮ if |S| = ∞

◮ given a finite set $F \subseteq \mathbb{Q}^{\geq 0}$, if F ⊆ S or if S ⊆ F

◮ given $\alpha \in \mathbb{Q}^{\geq 0}$, if α is an accumulation point of S

19 / 40

Deciding if S ⊆ N

Important concept: let $S \subseteq \mathbb{N} \setminus \{0\}$. We say S is k-finite if there exist

◮ an integer n ≥ 0,

◮ n positive integers $g_1, g_2, \ldots, g_n$ and

◮ n ultimately periodic sets $W_1, W_2, \ldots, W_n \subseteq \mathbb{N}$ such that

$$S = \bigcup_{1 \leq i \leq n} \{g_i k^j : j \in W_i\}.$$

Theorem. Suppose there is a finite set of prime numbers D such that each element of S ⊆ N is factorable into a product of powers of elements of D.

Then S is k-automatic if and only if S is k-finite.

20 / 40

Deciding if S ⊆ N

Furthermore, there is an algorithm that, given the DFA M accepting $(S)_k$, will determine if S is k-finite and if so, will produce the decomposition

$$S = \bigcup_{1 \leq i \leq n} \{g_i k^j : j \in W_i\}.$$

(Use reversed representations; follow path from start state on 0’s; from each such state there can only be finitely many paths to an accepting state.)

21 / 40

Deciding if S ⊆ N

Suppose M is a DFA with n states accepting L representing a set of rationals S.

Case 1: If α ∈ S and α is “small”, say $\alpha \leq k^n$, then we can intersect S with $[0, k^n]$, remove all representations of integers $0, 1, \ldots, k^n$, and see if any word is left. If so, then S is not a subset of N.

22 / 40

Deciding if S ⊆ N

Case 2: If α ∈ S and α is “big”, say $\alpha > k^n$, then the numerator can be pumped, but the denominator stays the same. So the denominator must divide $[uvw]_k - [uw]_k$. But this is $k^{|w|}([uv]_k - [u]_k)$, and $|uv| \leq n$, so each denominator must divide a bounded number, times a power of k. So the set of all prime factors of all denominators is finite.

So the projection onto the set of denominators is k-finite.

We can easily remove powers of k. For the remaining finite set of denominators we can check if each denominator divides the numerator.
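The identity behind Case 2, $[uvw]_k - [uw]_k = k^{|w|}([uv]_k - [u]_k)$, follows from $[xw]_k = [x]_k k^{|w|} + [w]_k$ applied to x = uv and x = u. A quick numerical check, with words modelled as lists of digits and all helper names hypothetical:

```python
import random


def value(word, k):
    """[word]_k for a list of base-k digits, most significant first."""
    total = 0
    for d in word:
        total = total * k + d
    return total


def pumping_gap(u, v, w, k):
    """[uvw]_k - [uw]_k, computed directly."""
    return value(u + v + w, k) - value(u + w, k)


def predicted_gap(u, v, w, k):
    """k^{|w|} * ([uv]_k - [u]_k), the closed form used in Case 2."""
    return k ** len(w) * (value(u + v, k) - value(u, k))


rng = random.Random(0)  # fixed seed so the check is reproducible


def random_word(k):
    return [rng.randrange(k) for _ in range(rng.randint(0, 6))]


identity_holds = all(
    pumping_gap(u, v, w, k) == predicted_gap(u, v, w, k)
    for k in (2, 3, 10)
    for u, v, w in [(random_word(k), random_word(k), random_word(k))
                    for _ in range(200)]
)
```

For example, with k = 10, u = 1, v = 23, w = 4 both sides equal 1234 − 14 = 1220 = 10 · (123 − 1).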

23 / 40

sup A is rational or infinite

Given a DFA M accepting $L \subseteq (\Sigma_k \times \Sigma_k)^*$ representing a set of rationals $A \subseteq \mathbb{Q}^{\geq 0}$, what can we say about sup A?

Theorem. sup A is rational or infinite.

Proof ideas: $\mathrm{quo}_k(uv^iw)$ forms a monotonic sequence. Defining

$$\gamma(u, v) := \frac{[\pi_1(uv)]_k - [\pi_1(u)]_k}{[\pi_2(uv)]_k - [\pi_2(u)]_k}$$

and writing $U = \gamma(u, v)$, one of the following three cases must hold:

(i) $\mathrm{quo}_k(uw) < \mathrm{quo}_k(uvw) < \mathrm{quo}_k(uv^2w) < \cdots < U$;

(ii) $\mathrm{quo}_k(uw) = \mathrm{quo}_k(uvw) = \mathrm{quo}_k(uv^2w) = \cdots = U$;

(iii) $\mathrm{quo}_k(uw) > \mathrm{quo}_k(uvw) > \mathrm{quo}_k(uv^2w) > \cdots > U$.

Furthermore, $\lim_{i \to \infty} \mathrm{quo}_k(uv^iw) = U$.
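The monotone behaviour can be observed on a concrete triple. The sketch below takes the limit to be $\gamma(u, v)$, which is what a direct computation of $\lim_i \mathrm{quo}_k(uv^iw)$ gives when the denominator of $\gamma$ is nonzero; the sample words and helper names are chosen here for illustration only.

```python
from fractions import Fraction


def pair_value(word, k):
    """Numerator and denominator encoded by a list of digit pairs."""
    p = q = 0
    for a, b in word:
        p = p * k + a
        q = q * k + b
    return p, q


def quo(word, k):
    p, q = pair_value(word, k)
    return Fraction(p, q)


def gamma(u, v, k):
    """([pi_1(uv)] - [pi_1(u)]) / ([pi_2(uv)] - [pi_2(u)])."""
    p_uv, q_uv = pair_value(u + v, k)
    p_u, q_u = pair_value(u, k)
    return Fraction(p_uv - p_u, q_uv - q_u)


k = 10
u, v, w = [(1, 2)], [(3, 1)], [(0, 5)]

# quo_k(u v^i w) for i = 0, 1, ..., 5
seq = [quo(u + v * i + w, k) for i in range(6)]
limit = gamma(u, v, k)
```

For this triple the sequence starts 10/25, 130/215, 1330/2115, ... and increases towards `limit`, which is 12/19; this is case (i).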

24 / 40

sup A is rational or infinite

It follows that if sup A is finite, and the DFA M has n states, then sup A = max T, where

$$T = T_1 \cup T_2$$

and

$$T_1 = \{\mathrm{quo}_k(x) : |x| < n \text{ and } x \in L\};$$

$$T_2 = \{\gamma(u, v) : |uv| \leq n,\ |v| \geq 1,\ \delta(q_0, u) = \delta(q_0, uv), \text{ and there exists } w \text{ such that } uvw \in L\}.$$

25 / 40

sup A is computable

We know that sup A lies in the finite computable set T.

For each t ∈ T, we can check to see if t ≥ sup A by checking if A ∩ (t, ∞) is empty.

Then sup A is the least such t.

26 / 40

Applications 1: Critical exponent

We say a word w is a p/q power if we can write

$$w = \overbrace{xx \cdots x}^{n}\, x'$$

where

◮ $n = \lfloor p/q \rfloor$;

◮ $x'$ is a prefix of x; and

◮ $p/q = n + |x'|/|x|$.

For example, the Dutch word koekoek is a 7/3-power.

The exponent of a finite word w, written exp(w), is defined to be the largest rational number α such that w is an α power.

Given an infinite word w, its critical exponent is defined to be the supremum, over all finite factors x of w, of exp(x).
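For a finite nonempty word, the exponent equals the length divided by the smallest period, which makes it easy to compute by brute force. The sketch below uses hypothetical helper names and can only inspect a finite prefix of an infinite word, so it yields a lower bound on the critical exponent rather than the critical exponent itself.

```python
from fractions import Fraction


def smallest_period(w: str) -> int:
    """Smallest p >= 1 with w[i] == w[i + p] for all valid i (w nonempty)."""
    for p in range(1, len(w) + 1):
        if all(w[i] == w[i + p] for i in range(len(w) - p)):
            return p
    raise ValueError("expects a nonempty word")


def exponent(w: str) -> Fraction:
    """exp(w) = |w| / (smallest period of w)."""
    return Fraction(len(w), smallest_period(w))


def max_factor_exponent(x: str) -> Fraction:
    """Largest exponent among all nonempty factors of the finite word x."""
    return max(exponent(x[i:j])
               for i in range(len(x))
               for j in range(i + 1, len(x) + 1))


thue_morse_32 = "".join(str(bin(n).count("1") % 2) for n in range(32))
```

`exponent("koekoek")` is 7/3 (period 3, length 7), and `max_factor_exponent(thue_morse_32)` is 2, consistent with the Thue-Morse word containing squares but no higher powers.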

27 / 40

Critical exponents

Examples of critical exponent:

- the Thue-Morse word

t = 0110100110010110 · · ·

has critical exponent 2 (Thue)

- the Fibonacci word

f = 01001010 · · ·

has critical exponent (5 +√5)/2 (Mignosi & Pirillo)

- Previously known to be computable for fixed points of uniform morphisms (Krieger)

28 / 40

Applications 1: Computing critical exponent

Theorem. If w is a k-automatic sequence, then its critical exponent is rational or infinite. Furthermore, it is computable from the DFAO M generating w.

Proof sketch. Given M, we can transform it into another automaton M′ accepting

$$\{(m, n) : \text{there exists } i \geq 0 \text{ such that } \mathbf{w}[i..i+m-1] \text{ has period } n\}.$$

We then apply our algorithm for computing $\sup(\mathrm{quo}_k(L))$ to L(M′).

29 / 40

The Leech word

Leech [1957] showed that the fixed point l of the morphism

0 → 0121021201210

1 → 1202102012021

2 → 2010210120102

is squarefree.
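A brute-force sanity check on a finite prefix, as a sketch only: it iterates the morphism twice from the letter 0 (giving a prefix of length 169 of the fixed point) and searches that prefix for squares. It illustrates the statement on a prefix and proves nothing about the infinite word; the names `LEECH`, `iterate` and `has_square` are made up here.

```python
LEECH = {
    "0": "0121021201210",
    "1": "1202102012021",
    "2": "2010210120102",
}


def iterate(word: str, times: int) -> str:
    """Apply the morphism `times` times to `word`."""
    for _ in range(times):
        word = "".join(LEECH[c] for c in word)
    return word


def has_square(s: str) -> bool:
    """True if s contains a nonempty factor of the form xx."""
    n = len(s)
    for i in range(n):
        for half in range(1, (n - i) // 2 + 1):
            if s[i:i + half] == s[i + half:i + 2 * half]:
                return True
    return False


# The image of 0 starts with 0, so iterates of 0 are prefixes of the fixed point.
prefix = iterate("0", 2)
```

`has_square(prefix)` returns `False`, whereas for instance `has_square("0101")` returns `True`.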

We used our method to compute the critical exponent of this word. It is 15/8.

Furthermore, if x is a 15/8-power occurring in l, then $|x| = 15 \cdot 13^i$ for some i ≥ 0.

30 / 40

Applications 2: Diophantine exponent

The Diophantine exponent of an infinite word w is defined to be the supremum of the real numbers β for which there exist arbitrarily long prefixes of w that can be expressed as $uv^e$ for finite words u, v and rationals e such that $|uv^e|/|uv| \geq \beta$.

(concept due to Adamczewski, Bugeaud)

Theorem. The Diophantine exponent of a k-automatic sequenceis either rational or infinite. Furthermore, it is computable.

31 / 40

Applications 3: Linear recurrence

A sequence a is said to be recurrent if every factor that occurs in a occurs infinitely often.

It is linearly recurrent if consecutive occurrences of factors of length ℓ appear at distance bounded by Cℓ, for some constant C independent of ℓ.

For example, the Thue-Morse word

t = 0110100110010110 · · ·

is linearly recurrent, but the Barbier infinite word

b = 110111001011101111000 · · ·

is recurrent but not linearly recurrent.

Theorem. It is decidable if a given k-automatic sequence is linearly recurrent. If so, the optimal constant of linear recurrence is computable.

32 / 40

Recurrence constant for the Rudin-Shapiro sequence

The Rudin-Shapiro sequence

0001001000011101 · · ·

counts the parity of the number of 11’s occurring in the binary representation of n.
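The definition translates directly into code. In this sketch (the function name `rudin_shapiro` is chosen here), occurrences of 11 are allowed to overlap, so 111 contains two of them.

```python
def rudin_shapiro(n: int) -> int:
    """Parity of the number of (possibly overlapping) 11's in binary n."""
    bits = format(n, "b")
    count = sum(1 for i in range(len(bits) - 1) if bits[i:i + 2] == "11")
    return count % 2


prefix = "".join(str(rudin_shapiro(n)) for n in range(16))
```

`prefix` reproduces the first 16 terms 0001001000011101 displayed above.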

We used our method to compute the recurrence constant for this sequence; it is 41.

33 / 40

How about lim sup?

I don’t know how to compute $\limsup(\mathrm{quo}_k(L))$ in general, where $L \subseteq (\Sigma_k \times \Sigma_k)^*$ is a regular language.

However, the largest special point can be computed.

A real number β is a special point if there exists an infinite sequence $(x_j)_{j \geq 1}$ of distinct words of L such that $\lim_{j \to \infty} \mathrm{quo}_k(x_j) = \beta$.

Thus, a special point is either an accumulation point of $\mathrm{quo}_k(L)$, or a rational number with infinitely many distinct representations.

Luckily, special points suffice for most applications involving lim sup.

34 / 40

Application 4: Goldstein quotients

Let x be an infinite word and $\rho_x(n)$ its subword complexity, the number of distinct factors of length n.
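On a finite prefix one can only count the factors that have already appeared, which gives a lower bound for $\rho_x(n)$. A small sketch for the Thue-Morse word, with helper names invented for the illustration:

```python
def thue_morse_prefix(length: int) -> str:
    """First `length` symbols of the Thue-Morse word."""
    return "".join(str(bin(n).count("1") % 2) for n in range(length))


def factor_count(x: str, n: int) -> int:
    """Number of distinct factors of length n occurring in the finite word x."""
    return len({x[i:i + n] for i in range(len(x) - n + 1)})


counts = [factor_count(thue_morse_prefix(256), n) for n in range(1, 5)]
```

For n = 1, 2, 3, 4 the counts are 2, 4, 6, 10. For example, 000 and 111 never occur, which leaves 6 factors of length 3.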

I. Goldstein [2011] showed that in some cases the quantities

$$\limsup_{n \geq 1} \frac{\rho_x(n)}{n} \quad \text{and} \quad \liminf_{n \geq 1} \frac{\rho_x(n)}{n}$$

are computable.

We can show that these are computable when x is a k-automatic sequence.

We can construct a DFA accepting

$$L := \{(\rho_x(n), n)_k : n \geq 1\}$$

and then find the largest (resp., smallest) special point. This corresponds to the lim sup (resp., lim inf).

35 / 40

An unsolvability result

Very recently Jörg Endrullis and JOS proved that the following problem is recursively unsolvable: given a k-automatic set of rational numbers S, is there some power of k in S?

36 / 40

Open Problems 1

Are the following problems decidable?

Given a DFA M accepting a language L representing a set of pairs of integers S ⊆ N × N:

◮ does there exist a pair (p, q) ∈ S with p | q?

◮ do there exist infinitely many pairs (p, q) ∈ S with p | q?

37 / 40

Open Problems 2

Is there a regular language $L \subseteq (\Sigma_k \times \Sigma_k)^*$ representing $\mathbb{Q}^{\geq 0}$ such that each rational has only finitely many representations?

38 / 40

Open Problem 3: Cobham’s theorem

Prove or disprove: let $S \subseteq \mathbb{Q}^{\geq 0}$ be a set of non-negative rational numbers. Then S is simultaneously k-automatic and ℓ-automatic, for multiplicatively independent integers k, ℓ ≥ 2, if and only if there exists a semilinear set $A \subseteq \mathbb{N}^2$ such that $S = \{p/q : [p, q] \in A\}$.

39 / 40

For Further Reading

1. Luke Schaeffer and Jeffrey Shallit, The critical exponent is computable for automatic sequences, Int. J. Found. Comput. Sci. 23 (2012), 1611–1626.

2. Eric Rowland and Jeffrey Shallit, k-automatic sets of rational numbers, Int. J. Found. Comput. Sci. 26 (2015), 343–365.

40 / 40