The validity of weighted automata - perso.telecom-paristech.fr ·...

The validity of weighted automata

Jacques Sakarovitch

CNRS / Universite Denis-Diderot and Telecom ParisTech

Joint work with

Sylvain Lombardy

LaBRI, CNRS / Universite de Bordeaux / Institut Polytechnique de Bordeaux

CIAA, 31 July 2018, Charlottetown, PEI

Dedicated to the memory of Zoltan Esik

First version presented at CIAA 2012 under the title:

The removal of weighted ε-transitions,

in: Proc. CIAA 2012, Lect. Notes in Comput. Sci. n◦ 7381.

Published in

International Journal of Algebra and Computation 23 (2013)

DOI: 10.1142/S0218196713400146

Supported by ANR Project 10-INTB-0203 VAUCANSON 2.

The automaton model

p qb

a

b

a

b

B1

The automaton model

p qb

a

b

a

b

B1 L(B1) = A∗bA∗

−→ pb−−→p

a−−→ pb−−→ q −→

The weighted automaton model

p q

12 b

12 a

12 b

a

b

C1


p q

12 b

12 a

12 b

a

b

C1

1−−→ p12b−−−→p

12a−−−→ p

12b−−−→ q

1−−→1−−→ p

12b−−−→q

a−−→ qb−−→ q

1−−→


p q

12 b

12 a

12 b

a

b

C1

1−−→ p12b−−−→p

12a−−−→ p

12b−−−→ q

1−−→1−−→ p

12b−−−→q

a−−→ qb−−→ q

1−−→

� Weight of a path c : product of the weights of transitions in c

� Weight of a word w : sum of the weights of paths with label w

bab �−→ 1

2+

1

8=

5

8


p q

12 b

12 a

12 b

a

b

C1

1−−→ p12b−−−→p

12a−−−→ p

12b−−−→ q

1−−→1−−→ p

12b−−−→q

a−−→ qb−−→ q

1−−→



bab �−→ 1

2+

1

8=

5

8= <0.101>2


p q

12 b

12 a

12 b

a

b

C1 C1 ∈ Q〈〈A∗〉〉

1−−→ p12b−−−→p

12a−−−→ p

12b−−−→ q

1−−→1−−→ p

12b−−−→q

a−−→ qb−−→ q

1−−→



bab �−→ 1

2+

1

8=

5

8C1 : A∗ −→ Q


p q

12 b

12 a

12 b

a

b

C1 C1 ∈ Q〈〈A∗〉〉

1−−→ p12b−−−→p

12a−−−→ p

12b−−−→ q

1−−→1−−→ p

12b−−−→q

a−−→ qb−−→ q

1−−→



C1 =1

2b+

1

4ab+

1

2b a+

3

4b b+

1

8aab+

1

4ab a+

3

8ab b+

1

2b aa+ . . .


p q

12 b

12 a

12 b

a

b

C1

C1 =⟨I1,E1,T1

⟩=

⟨(1 0

),

(12 a +

12 b

12 b

0 a+ b

),

(01

)⟩


A = 〈 I ,E ,T 〉 E = adjacency matrix

E p,q =∑

{wl(e) | e transition from p to q}= linear combination of letters in A

E np,q =

∑{wl(c) | c computation from p to q of length n}

E ∗ =∑n∈N

E n

Since E is proper, E ∗ is well-defined

E ∗p,q =

∑{wl(c) | c computation from p to q}


p q

12 b

12 a

12 b

a

b

C1

C1 =⟨I1,E1,T1

⟩=

⟨(1 0

),

(12 a +

12 b

12 b

0 a+ b

),

(01

)⟩

C1 = I1 · E1∗ · T1


p q

12 b

12 a

12 b

a

b

C1

C1 =⟨I1,E1,T1

⟩=

⟨(1 0

),

(12 a +

12 b

12 b

0 a+ b

),

(01

)⟩

C1 = I1 · E1∗ · T1

Every K-automaton defines a series in K〈〈A∗〉〉whose coefficients are effectively computable


p q

12 b

12 a

12 b

a

b

C1

C1 = I1 · E1∗ · T1


Where is the problem ?


p q

12 b

12 a

12 b

a

b

C1

C1 = I1 · E1∗ · T1


Where is the problem ?

We want to be able to deal with weighted automata

where transitions might be labelled by the empty word

The need for a richer model: eg, the concatenation product

p qa

a

r s

b

b


p qa

a

r s

b

b

pa

a

s

b

b


p qa

a

r s

b

b

pa

a

s

b

b

p qa

a

r s

b

b

ε


p qa

a

r s

b

b

pa

a

s

b

b

p qa

a

r s

b

b

ε

p qa

a

r s

b

b

b

A basic result in (classical) automata theory

Theorem (Folk–Lore)

Every ε-NFA is equivalent to an NFA




Usefulness of ε-transitions:

Preliminary step for many constructions on NFA’s:

� Product and star of position (Glushkov, standard) automata

� Thompson construction

� Construction of the universal automaton

� Computation of the image of a transducer

� ...

May correspond to the structure of the computations




Usefulness of ε-transitions:

Preliminary step for many constructions on NFA’s:

� Product and star of position (Glushkov, standard) automata

� Thompson construction

� Construction of the universal automaton

� Computation of the image of a transducer

� ...

May correspond to the structure of the computations

Removal of ε-transitions is implemented in all automata software


p rqε a


p rqε a

p rq

a

a


p rqε a

p rq

a

a

p rq

a

a


p rqε a

p rq

a

a

p rq

a

a

p r

a


p rqε a

p rqa

ε

ε

p rq

a

a

p rq

a

a

p r

a


p rqε a

p rqa

ε

ε

p rq

a

ap rq

aε

ε

p rq

a

a

p r

a


p rqε a

p rqa

ε

ε

p rq

a

ap rq

aε

ε

p rq

a

ap r

aε

p r

a


p rqε a

p rqa

ε

ε

p rq

a

ap rq

aε

ε

p rq

a

ap r

aε

p r

a

p ra




A proof

A = 〈 I ,E ,T 〉 E transition matrix of AEntries of E = subsets of A ∪ {ε}

L(A) = I · E ∗ · TE = E 0 + Ep

L(A) = I · (E 0 + E p)∗ · T = I · (E ∗

0 · Ep)∗ · E ∗

0 · TA = 〈 I ,E ,T 〉 equivalent to B =

⟨I ,E ∗

0 · Ep,E∗0 · T

⟩




A proof

A = 〈 I ,E ,T 〉 E transition matrix of AEntries of E = subsets of A ∪ {ε}

L(A) = I · E ∗ · TE = E 0 + Ep

L(A) = I · (E 0 + E p)∗ · T = I · (E ∗

0 · Ep)∗ · E ∗

0 · TA = 〈 I ,E ,T 〉 equivalent to B =

⟨I ,E ∗

0 · Ep,E∗0 · T

⟩

One proof = several algorithms for computing E ∗0 or E ∗

0 · Ep

Automata and expressions

E2 = (a∗ + b∗)∗


E2 = (a∗ + b∗)∗

a

b

The Thompson automaton of E2


E2 = (a∗ + b∗)∗

a

b

The Thompson automaton of E2

Theorem (Folk–Lore ?)

The closure of the Thompson automaton of Eyields the position automaton of E

A basic question in weighted automata theory

QuestionIs every ε-WFA is equivalent to a WFA?



p rq2ε 3a



p r

2

q

6a

3a



p r

2

6a



p r

2

6a

p ra

q

2ε

ε



p r

2

6a

p ra

q

2ε

2ε



p r

2

6a

p ra

2ε

2



p r

2

6a

p ra

2ε

2

1−−→ pa−−→ r

1−−→ ,



p r

2

6a

p ra

2ε

2

1−−→ pa−−→ r

1−−→ ,1−−→ p

2 ε−−−→ pa−−→ r

1−−→ ,



p r

2

6a

p ra

2ε

2

1−−→ pa−−→ r

1−−→ ,1−−→ p

2 ε−−−→ pa−−→ r

1−−→ ,

1−−→ p2 ε−−−→ p

2 ε−−−→ pa−−→ r

1−−→ , . . .



p r

2

6a

p ra

2ε

2

1−−→ pa−−→ r

1−−→ ,1−−→ p

2 ε−−−→ pa−−→ r

1−−→ ,

1−−→ p2 ε−−−→ p

2 ε−−−→ pa−−→ r

1−−→ , . . .

a �−→ 1 + 2 + 4 + · · ·



p r

2

6a

p ra

2ε

2

1−−→ pa−−→ r

1−−→ ,1−−→ p

2 ε−−−→ pa−−→ r

1−−→ ,

1−−→ p2 ε−−−→ p

2 ε−−−→ pa−−→ r

1−−→ , . . .

a �−→ 1 + 2 + 4 + · · · undefined



p r

2

6a

p ra

ε

1



p r

2

6a

p ra

ε

1

1−−→ pa−−→ r

1−−→ ,1−−→ p

ε−−→ pa−−→ r

1−−→ ,

1−−→ pε−−→ p

ε−−→ pa−−→ r

1−−→ , . . .

a �−→ 1 + 1 + 1 + · · · undefined



p r

2

6a

p ra

12 ε

12



p r

2

6a

p ra

12 ε

12

1−−→ pa−−→ r

1−−→ ,1−−→ p

12ε−−−→ p

a−−→ r1−−→ ,

1−−→ p12ε−−−→ p

12ε−−−→ p

a−−→ r1−−→ , . . .



p r

2

6a

p ra

12 ε

12

1−−→ pa−−→ r

1−−→ ,1−−→ p

12ε−−−→ p

a−−→ r1−−→ ,

1−−→ p12ε−−−→ p

12ε−−−→ p

a−−→ r1−−→ , . . .

a �−→ 1 + 12 +

14 + · · · undefined?



p r

2

6a

p ra

12 ε

12

1−−→ pa−−→ r

1−−→ ,1−−→ p

12ε−−−→ p

a−−→ r1−−→ ,

1−−→ p12ε−−−→ p

12ε−−−→ p

a−−→ r1−−→ , . . .

a �−→ 1 + 12 +

14 + · · · undefined? defined?



p r

2

6a

p r2a

1

1−−→ pa−−→ r

1−−→ ,1−−→ p

12ε−−−→ p

a−−→ r1−−→ ,

1−−→ p12ε−−−→ p

12ε−−−→ p

a−−→ r1−−→ , . . .

a �−→ 1 + 12 +

14 + · · · undefined? defined?



p r

2

6a

p r

k εa

1



p r

2

6a

p rk∗a

k∗

if k∗ =∞∑n=0

kn is defined in K



certainly not !



certainly not !

New questions

Which ε-WFAs have a well-defined behaviour?



certainly not !

New questions


How to compute the behaviour of an ε-WFA (when it is well-defined )?



certainly not !

New questions


How to compute the behaviour of an ε-WFA (when it is well-defined )?

How to decide if the behaviour of an ε-WFA is well-defined?

Behaviour of weighted automata


A = 〈K,A,Q, I ,E ,T 〉 possibly with ε-transitions



u ∈ A∗



u ∈ A∗ possibly infinitely many paths labelled by u in A



u ∈ A∗ possibly infinitely many paths labelled by u in A<A , u> sum of weights of computations labelled by u in A




if it is defined!




if it is defined!

A is defined if <A , u> is defined ∀u ∈ A∗




if it is defined!


Trivial case




if it is defined!


Trivial case

Every u in A∗ is the label of a finite number of paths




if it is defined!


Trivial case


⇑no ε-transitions in A




if it is defined!


Trivial case


no circuits of ε-transitions in A




if it is defined!


Trivial case


no circuits of ε-transitions in A

acyclic K-automata


First solution

behaviour well-defined ⇐⇒ acyclic


First solution


Legitimate, as far as the behaviours of the automata are concerned

(Kuich–Salomaa 86, Berstel–Reutenauer 84-88;11)


First solution


Legitimate, as far as the behaviours of the automata are concerned

(Kuich–Salomaa 86, Berstel–Reutenauer 84-88;11)

p ra

12 ε

12

not valid


A not acyclic ⇒ weight of u in A may be an infinite sum.



Second family of solutions

Accepting the idea of infinite sums





First point of view (algebraico-logic)

� Definition of a new operator for infinite sums∑

I

� Setting axioms on∑

I

such that the star of a matrix be meaningful







I


I


Less a definition on automata than conditions on K

for all K-automata have well-defined behaviour







I


I


Less a definition on automata than conditions on K

for all K-automata have well-defined behaviour

Works of Bloom, Esik, Kuich (90’s –)based on the axiomatisation described by Conway (72)


Second point of view (more analytical)

Infinite sums are given a meaning via a topology on K




Topology allows to define summable families in K




Topology on K defines a topology on K〈〈A∗〉〉





Third solution (Lombardy, S. 03 –)


PA set of all paths in A







PA set of all paths in AA well-defined ⇐⇒ WL

(PA

)summable







PA set of all paths in AA well-defined ⇐⇒ ∀p, q ∈ Q WL

(PA(p, q)

)summable








(PA(p, q)

)summable

� Yields a consistent theory








(PA(p, q)

)summable

� Yields a consistent theory

� Two pitfalls for effectivity� effective computation of a summable family may not be possible� effective computation may give values to non summable families

Problems in computing the behaviour of a weighted automaton

−11

−1

1

A1


−11

−1

1

A1

A1 =⟨I1,E1,T1

⟩=

⟨(1 0

),(

1 1−1 −1

),(10

)⟩

A1 = I1 · E1∗ · T1

E12 = 0 =⇒ E1

∗ =(

2 1−1 0

)=⇒ A1 = 2


A2A2

−121

2

12

−12


A2A2

(−12

)∗= 2

3

−121

2

12

−12


A2A223

13

12

−12


A2A2

(−12

)∗= 2

323

13

12

−12


A2A223

13

13


A2A223

19


A2A2

(19

)∗= 9

823

19


A2A234


A2A234

A2 =⟨I2,E2,T2

⟩=

⟨(1 0

),(−1

212

12 −1

2

),(10

)⟩

A2 = I2 · E2∗ · T2

E23 = E2 =⇒ E2

∗ undefined =⇒ A2 undefined


p

1

A3


p

1

A3 (1)∗ = undefined

N natural integers A3 not defined


p

1

A3 (1)∗ = +∞


N N ∪ +∞ compact topology A3 defined


p

+∞

A3




p

1




N∞ N ∪ +∞ discrete topology A3 not defined


p q

1+∞

1

1

A4



p q

1+∞

1

1

A4


N∞ N ∪ +∞ discrete topology A4 defined


p q

1+∞

1

1



N∞ N ∪ +∞ discrete topology A4 defined

A chicken and egg problem

automaton algorithm

A A


automaton algorithm

A A

valid ? success ?


automaton algorithm

A A

valid ? success ?

valid


automaton algorithm

A A

valid ? success ?

valid =⇒ success


automaton algorithm

A A

valid ? success ?

valid =⇒ success

success


automaton algorithm

A A

valid ? success ?

valid =⇒ success

valid?⇐= success

A new definition of validity for weighted automata


E ∗ free monoid generated by E

PA set of paths in A (local) rational subset of E ∗





DefinitionR rational family of paths of A R ∈ RatE ∗ ∧ R ⊆ PA





DefinitionR rational family of paths of A R ∈ RatE ∗ ∧ R ⊆ PA

DefinitionA is valid iff

∀R rational family of paths of A, WL(R) is summable


Validity implies the well-definition of behaviour



The notion of validity settles the previous examples



The notion of validity settles the previous examples

RemarkIf every subfamily of a summable family in K is summable,

then validity is equivalent to the well-definition of behaviour

Eg. R , C (and N , Z , N ).

If every rational subfamily of a summable family in K is summable,then validity is equivalent to the well-definition of behaviour

Eg. Q .


TheoremA is valid iff the behaviour of every covering of A is well-defined



TheoremIf A is valid, then ‘every’ removal algorithm on A is successful



TheoremIf A is valid, then ‘every’ removal algorithm on A is successful

Nota BeneWe do not know yet how to decide whether

a Q - or an R -automaton is valid.

Deciding validity

Straightforward cases

� Non starable semirings (eg. N, Z)A valid ⇐⇒ A acyclic

� Complete topological semirings (eg. N ) every A valid

� Rationally additive semirings (eg. RatA∗ ) every A valid

� Locally closed commutative semirings every A valid

Deciding validity

DefinitionK topological, ordered, positive, star-domain downward closed

(TOP SDDC)

Deciding validity


(TOP SDDC)

N, N , Q+, R+, Zmin, RatA∗,... are TOP SDDC

N∞, (binary) positive decimals,... are not TOP SDDC

Deciding validity


(TOP SDDC)

N, N , Q+, R+, Zmin, RatA∗,... are TOP SDDC

N∞, (binary) positive decimals,... are not TOP SDDC

TheoremK topological, ordered, positive, star-domain downward closedA K-automaton is valid if and only if

the ε-removal algorithm succeeds

Deciding validity

DefinitionIf A is a Q- or R-automaton,

then abs(A) is a Q+- or R+-automaton

Deciding validity

DefinitionIf A is a Q- or R-automaton,

then abs(A) is a Q+- or R+-automaton

TheoremA Q- or R-automaton A is valid if and only if abs(A) is valid.

Automata and expressions validity

‘Kleene’ theorem

Automata ⇐⇒ Expressions

A ⇐⇒ E

Weighted automata ⇐⇒ Weighted expressions


‘Kleene’ theorem

Automata ⇐⇒ Expressions

A ⇐⇒ E

Weighted automata ⇐⇒ Weighted expressions

Validity of expressions

E valid ⇐⇒ c(E) well-defined

c(E) computed by a bottom-up traversal of the syntactic tree of E


Valid A yields valid E

Valid E yields valid A with Glushkov construction

Valid E may yield non valid A with Thompson construction


Valid A yields valid E

Valid E yields valid A with Glushkov construction

Valid E may yield non valid A with Thompson construction

a |1

b |1

1 11

1 11

1 1

1

11

1

1

1

1

−1

1

The Thompson automaton of (a∗ + {−1}b∗)∗

Hidden parts

� The removal algorithm itself

� Details on the topology we put semirings

� Validity of automata and covering

� ‘Infinitary’ axioms : strong, star-strong semirings

� Links with the ‘axiomatic’ approach (Bloom–Esik–Kuich)

� References to previous work (on removal algorithms):

Conclusion

Conclusion

� Semiring structure is weak, topology does not help so much.

Conclusion


� This weakness imposes a restricted definition of validity,in order to guarantee success of validity algorithms.

Conclusion



� Axiomatic approach does not allowto deal wit most common numerical semirings: Zmin, Q

Conclusion



� Axiomatic approach does not allowto deal wit most common numerical semirings: Zmin, Q

� On ‘usual’ semirings,the new definition of validity coincides with the former one.

Conclusion (2)

Conclusion (2)

� Apart the trivial cases, and the TOP SDDC case,decision of validity is never granted, and is to be established.

Conclusion (2)


� On ‘usual’ semirings, validity is decidable.

Conclusion (2)



� The new definition of validityfills the ‘effectivity gap’ left open by the former one.

Conclusion (2)




� The algorithms implemented in Awaliare given a theoretical framework.

Conclusion (2)




� The algorithms implemented in Awaliare given a theoretical framework.

All’s well, that ends well!

Hidden parts


Hidden parts

� The removal algorithm itself:

� Termination issues (weighted versus Boolean cases)� Complexity issues

Hidden parts



(e) (f )

Boolean ε-removal procedure does not terminate

if newly created ε-transitions are stored in a stack

Hidden parts



(1)

(2)

(3)

(4)

(5)(6)

(7)(8)

weighted ε-removal procedure does not terminate

if newly created ε-transitions are stored in a queue

Hidden parts



Hidden parts



DefinitionK topological: K regular Hausdorff ⊕, ⊗ continuous

Hidden parts




Definition{ti}i∈I summable of sum t :

∀V ∈ N(t) , ∃JV finite , JV ⊂ I , ∀L finite , JV ⊆ L ⊂ I∑i∈L

ti ∈ V .

Hidden parts




Definition{ti}i∈I summable of sum t :

∀V ∈ N(t) , ∃JV finite , JV ⊂ I , ∀L finite , JV ⊆ L ⊂ I∑i∈L

ti ∈ V .

Lemma (Associativity)

{ti}i∈I summable of sum t ,I =

⋃j∈J Kj ∀j ∈ J {ti}i∈Kj

summable of sum sj ,then {sj}j∈J summable of sum t

Hidden parts




Validity of automata and covering

y

A5

S ⊂ N2×2, x =(1 00 1

)= 1S, y =

(0 11 0

), x+y = ∞S =

(1 11 1

)

S equipped with the discrete topology

0S, y , and ∞S starable x = y2 x not starable


∞S

A5

S ⊂ N2×2, x =(1 00 1

)= 1S, y =

(0 11 0

), x+y = ∞S =

(1 11 1

)




y

A5 p q B5

y

y

S ⊂ N2×2, x =(1 00 1

)= 1S, y =

(0 11 0

), x+y = ∞S =

(1 11 1

)




y

A5 p B5

∞S

x

S ⊂ N2×2, x =(1 00 1

)= 1S, y =

(0 11 0

), x+y = ∞S =

(1 11 1

)



Hidden parts





Hidden parts





DefinitionA topological semiring is a strong semiring

if the product of two summable families is a summable family

Hidden parts







TheoremK strong semiring s ∈ K〈〈A∗〉〉 starable iff s0 ∈ K starable

Hidden parts







TheoremK strong semiring s ∈ K〈〈A∗〉〉 starable iff s0 ∈ K starable

Proposition (Madore 18)

There exist (semi)rings K that are not strong

Hidden parts





DefinitionA topological semiring is a star-strong semiring ifthe star of a summable family, whose sum is starable, is summable

Hidden parts






Proposition

A strong semiring K is starable and star-strong iffevery rational family of K is summable

Hidden parts






Proposition

A strong semiring K is starable and star-strong iffevery rational family of K is summable

Conjecture

A starable strong semiring star-strong

Hidden parts






Hidden parts





� Links with the ‘axiomatic’ approach (Bloom–Esik–Kuich):

TheoremA starable star-strong semiring is an iteration semiring

Group identities

i 0

1

2

x0

1K

x1

1K

x21K

i

x0 + x1 + x2

0

1

2

0, 0

1, 0

0, 1

1, 1

2, 11, 2

2, 2

0, 2

2, 0

1K

x01K

1K

x1

x2

1K

1Kx0

1Kx2

x1

1K

1K

1K

x0

x1

x2

0

1

2

x0

x1

x2x2

x0

x1

x1

x2

x0

ε-removalε-removal

covering

Hidden parts







Hidden parts





� Links with the ‘axiomatic’ approach (Bloom–Esik–Kuich):


� locally closed srgs (Esik–Kuich), k-closed srgs (Mohri)� links with other algorithms:

shortest-distance algorithm (Mohri),state-elimination method (Hanneforth–Higueira)

The validity of weighted automata - perso.telecom-paristech.fr ·...

Documents

Transcript of The validity of weighted automata - perso.telecom-paristech.fr ·...