Minimization of Symbolic Automata Presented By: Loris D’Antoni Joint work with: Margus Veanes 01/24/14, POPL14

Page 1:

Minimization of Symbolic Automata

Presented By:

Loris D’Antoni

Joint work with:

Margus Veanes

01/24/14, POPL14

Page 2:

What is automata minimization?

Page 3:

Deterministic Finite Automaton

[Figure: a DFA with states q0 and q and transitions labeled a and b.]

A = (Q, q0, F, δ, Σ)
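As a minimal sketch of the tuple A = (Q, q0, F, δ, Σ), the following Python class stores the five components and runs a word through δ. The class name `DFA`, the method `accepts`, and the concrete two-state automaton are illustrative assumptions; the transitions of the slide's figure are not recoverable from the transcript.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    states: frozenset    # Q
    initial: str         # q0
    finals: frozenset    # F
    delta: dict          # δ as a map (state, symbol) -> state
    alphabet: frozenset  # Σ

    def accepts(self, word):
        """Follow δ from q0 and test whether the last state is final."""
        state = self.initial
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.finals


# Hypothetical automaton (not the one on the slide): accepts words ending in 'b'.
EXAMPLE = DFA(
    states=frozenset({"q0", "q"}),
    initial="q0",
    finals=frozenset({"q"}),
    delta={("q0", "a"): "q0", ("q0", "b"): "q",
           ("q", "a"): "q0", ("q", "b"): "q"},
    alphabet=frozenset({"a", "b"}),
)
```

Here `EXAMPLE.accepts("ab")` holds, while `EXAMPLE.accepts("ba")` does not.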

Page 4:

Automata Minimization

Minimization = find and collapse equivalent states

[Figure: from states p and q, the same input s leads to a non-final state and to a final state respectively, so p and q are distinguishable.]

Page 5:

[Figure: an example DFA over {a, b} (top) and its minimized form (bottom), in which states 1 and 3, states 2 and 4, and states 5 and 6 are merged.]

Page 6:

A simple Application: Random Password generation

Given constraints:

• Length is k: "^.{5,20}$"
• Contains 2 capital letters: "[A-Z].*[A-Z]"
• Contains a digit: "\d"

Generate random instances with uniform distribution that match all the above conditions.
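To make the three constraints concrete, here is a small sketch that checks whether a candidate string matches all of them using Python's `re` module. It only tests membership; it does not perform the uniform generation the slide asks for, which is the part that needs the automaton. The names `CONSTRAINTS` and `satisfies_all` are illustrative.

```python
import re

# The three regexes from the slide, compiled once.
CONSTRAINTS = [
    re.compile(r"^.{5,20}$"),     # length between 5 and 20
    re.compile(r"[A-Z].*[A-Z]"),  # two capital letters somewhere
    re.compile(r"\d"),            # at least one digit
]


def satisfies_all(candidate):
    """True when the candidate matches every constraint."""
    return all(rx.search(candidate) for rx in CONSTRAINTS)
```

For example, `satisfies_all("AbcD3f")` holds, while `"abcdef1"` fails the capital-letter constraint and `"AB1"` fails the length constraint.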

Page 7:

Key idea

^.{5,20}$

[A-Z].*[A-Z]

\d

Page 8:

Problems

• Big automaton → Minimization
• Big alphabet (2^16 characters in UTF16) → Symbolic Automata

Page 9:

Symbolic Finite Automaton (SFA)

[Figure: an SFA with states q0 and q whose transitions are guarded by the predicates λx. x mod 2 = 0 and λx. x mod 2 = 1.]

A = (Q, q0, F, δ, σ)

Input sort σ: in this case int

Separate theory for the input alphabet (SMT solver)

Page 10:

Symbolic Finite Automata (SFA)

[Figure: an SFA with states p and q whose transitions are guarded by the predicates λx. x mod 2 = 0 and λx. x mod 2 = 1.]

Execution example

Input: 1 2 5 3

Run: p p q p p

p is final: accept the input
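The execution example can be replayed with a short sketch in which each transition carries a predicate instead of a symbol. The transition table below is an assumption inferred from the run p p q p p on input 1 2 5 3; in particular, the even self-loop on q is a guess, since the run never exercises it.

```python
def is_even(x):
    return x % 2 == 0


def is_odd(x):
    return x % 2 == 1


# Assumed transition structure: state -> list of (guard, target).
TRANSITIONS = {
    "p": [(is_odd, "p"), (is_even, "q")],
    "q": [(is_odd, "p"), (is_even, "q")],
}
FINALS = {"p"}


def run(word, start="p"):
    """Return the sequence of states visited while reading the word."""
    trace = [start]
    for x in word:
        targets = [t for guard, t in TRANSITIONS[trace[-1]] if guard(x)]
        # In a deterministic, complete SFA exactly one guard holds.
        assert len(targets) == 1
        trace.append(targets[0])
    return trace


def accepts(word):
    return run(word)[-1] in FINALS
```

Under these assumptions, `run([1, 2, 5, 3])` visits p, p, q, p, p, and the input is accepted because p is final.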

Page 11:

Advantages of Symbolic Automata

• Alphabet is represented symbolically
  – UTF16 abstracted using BDDs
  – Integers using predicates over integers
• Succinctness
  – at most n^2 transitions
  – One transition captures many symbols
• BUT: do DFA algorithms generalize to SFAs?

Page 12:

An example: SFA intersection

[Figure: A1 has a transition from p1 to q1 guarded by φ1; A2 has a transition from p2 to q2 guarded by φ2; the product of A1 and A2 has a transition from (p1, p2) to (q1, q2) guarded by φ1 ∧ φ2.]

Delete the transition when φ1 ∧ φ2 is unsatisfiable.

REQUIREMENTS: Input theory must be a Boolean algebra, and decidable
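A minimal sketch of the product step, assuming predicates are modeled as finite sets of integers so that conjunction is set intersection and satisfiability is non-emptiness. This stands in for the decidable Boolean algebra the slide requires; a real implementation would call a solver instead. Only transitions are built, and the names `conj`, `is_sat`, and `intersect` are illustrative.

```python
from itertools import product


def conj(phi, psi):
    """Conjunction of two predicates modeled as finite sets."""
    return phi & psi


def is_sat(phi):
    """A set-predicate is satisfiable when it is non-empty."""
    return len(phi) > 0


def intersect(a1, a2):
    """a1, a2: lists of transitions (source, guard, target)."""
    result = []
    for (p1, phi1, q1), (p2, phi2, q2) in product(a1, a2):
        guard = conj(phi1, phi2)
        if is_sat(guard):  # drop the transition when the guard is unsatisfiable
            result.append(((p1, p2), guard, (q1, q2)))
    return result


# Hypothetical inputs: one transition in A1, two in A2.
A1 = [("p1", frozenset(range(0, 10)), "q1")]
A2 = [("p2", frozenset(range(5, 15)), "q2"),
      ("p2", frozenset(range(20, 30)), "r2")]
PRODUCT = intersect(A1, A2)
```

With these inputs only the pair of transitions with overlapping guards survives; the second transition of A2 is dropped because its guard is disjoint from the guard in A1.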

Page 13:

Moore’s algorithm

[Figure: if p’ and q’ are distinguishable, and p and q reach p’ and q’ on the same symbol a, then p and q are distinguishable.]

n^2 iterations over k symbols: O(kn^2)
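The pair-marking idea can be sketched as follows for a complete DFA given as a transition dictionary. This is a plain fixpoint version written for readability, not tuned to meet the stated bound; the function name and the four-state example automaton are illustrative assumptions.

```python
from itertools import combinations


def moore_distinguishable(states, finals, delta, alphabet):
    """Return the distinguishable pairs as a set of two-element frozensets."""
    marked = {frozenset((p, q)) for p, q in combinations(states, 2)
              if (p in finals) != (q in finals)}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in alphabet:
                succ = frozenset((delta[(p, a)], delta[(q, a)]))
                # Successors already distinguishable => p and q are too.
                if len(succ) == 2 and succ in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked


# Hypothetical DFA: states 1 and 2 behave identically.
STATES = [0, 1, 2, 3]
ALPHABET = ["a", "b"]
DELTA = {(0, "a"): 1, (0, "b"): 2}
for s in (1, 2, 3):
    for a in ALPHABET:
        DELTA[(s, a)] = 3
MARKED = moore_distinguishable(STATES, {3}, DELTA, ALPHABET)
```

Every pair except {1, 2} ends up marked, so states 1 and 2 can be collapsed.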

Page 14:

Symbolic Moore’s algorithm

Initially D = F × (Q\F) ∪ (Q\F) × F
for each (p’, q’) in D, (p, q) not in D
    let φ, ψ be the guards of δ(p, p’), δ(q, q’)
    if (isSat(φ ∧ ψ))
        add (p, q) to D

[Figure: p reaches p’ on guard φ and q reaches q’ on guard ψ; if p’ and q’ are distinguishable and φ ∧ ψ is satisfiable, then p and q are distinguishable.]

m transitions: O(m^2 f(k))

k = size of biggest predicate in SFA
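A sketch of the symbolic variant, assuming guards are modeled as finite sets of integers so that isSat(φ ∧ ψ) becomes a non-empty intersection. This is a stand-in for a solver call. The automaton is assumed deterministic and complete, and the example transitions are hypothetical.

```python
def symbolic_moore(states, finals, transitions):
    """transitions: list of (source, guard, target); returns the set D."""
    D = {(p, q) for p in states for q in states
         if (p in finals) != (q in finals)}
    changed = True
    while changed:
        changed = False
        for (p, phi, p2) in transitions:
            for (q, psi, q2) in transitions:
                # (p2, q2) distinguishable and the guards overlap.
                if (p2, q2) in D and (p, q) not in D and (phi & psi):
                    D.add((p, q))
                    D.add((q, p))
                    changed = True
    return D


UNIVERSE = frozenset(range(10))  # finite stand-in for the input sort
EVEN = frozenset(x for x in UNIVERSE if x % 2 == 0)
ODD = UNIVERSE - EVEN
TRANSITIONS = [
    (0, EVEN, 1), (0, ODD, 2),
    (1, UNIVERSE, 3), (2, UNIVERSE, 3), (3, UNIVERSE, 3),
]
D = symbolic_moore([0, 1, 2, 3], {3}, TRANSITIONS)
```

Here the pair (1, 2) never enters D, so states 1 and 2 are equivalent, while state 0 is separated from both.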

Page 15:

Sometimes Moore is Less

From: Rani Abdellatif
Sent: Tuesday, November 13, 2012 12:55 PM
To: Margus Veanes
Cc: Patrick McFalls
Subject: RE: Password generation help

Margus, I tested the perf of the sample you sent me with password lengths from 8 to 15 chars and here are the results:

Chars   Time (ms)
8       171
9       406
10      1061
11      2044
12      3698
13      6271
14      11591
15      18362

This time is the time it takes to run sfa.Determinize(rex.Solver).Minimize(rex.Solver). The time required to create the SFA or generate samples once it’s created is quite small in comparison. We are expecting 15 characters to be on the shorter end of password we’ll generate, going up to 128 characters.

[Slide annotations: "18 sec for 15 characters!"; "the culprit"; "should scale up to 128 characters!"]

Page 16:

Hopcroft’s algorithm: intuition

[Figure: the state set split into two blocks, F and Q\F.]

Page 17:

Hopcroft’s algorithm: intuition

[Figure: blocks R and A, and the set S of states with an a-transition into R.]

Page 18:

Hopcroft’s algorithm: intuition

[Figure: blocks P1, P2, P3, P4 and R, with b-transitions.]

Keep partitioning with respect to W for every input symbol

Pages 19-23:

Hopcroft’s algorithm: intuition

[Figure, built up over these pages: a block already split into P1 and P2 according to R; later steps add a-transitions and a block labeled Q.]

Let’s assume I already split according to R

Do I need to consider both P1 and P2 for future splitting?

NO, I ONLY NEED ONE!

Page 24:

Hopcroft’s algorithm

P := {F, Q\F}
W := {if |F| < |Q\F| then F else Q\F}
while W ≠ {}
    R := pickFrom(W)
    foreach a in Σ
        S := δ^-1(R, a)
        while ∃ T ∈ P. T∩S ≠ {} ∧ T\S ≠ {}
            P, W := split(P, T∩S, T\S)
return partitioned DFA

log n iterations: O(kn log n)
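The loop above can be sketched in Python for a complete DFA. This version recomputes S by scanning all states, so it illustrates the control flow rather than the stated bound; the function name and the four-state example are illustrative assumptions.

```python
def hopcroft(states, finals, delta, alphabet):
    """Return the partition of states into equivalence classes."""
    finals = frozenset(finals)
    nonfinals = frozenset(states) - finals
    partition = {block for block in (finals, nonfinals) if block}
    work = {min(partition, key=len)}  # start with the smaller block
    while work:
        R = work.pop()
        for a in alphabet:
            # S: states with an a-transition into R.
            S = {s for s in states if delta[(s, a)] in R}
            for T in list(partition):
                inside, outside = T & S, T - S
                if not inside or not outside:
                    continue
                partition.remove(T)
                partition |= {inside, outside}
                if T in work:
                    work.remove(T)
                    work |= {inside, outside}
                else:
                    # Only the smaller half needs future processing.
                    work.add(min(inside, outside, key=len))
    return partition


# Hypothetical DFA: states 1 and 2 behave identically.
STATES = [0, 1, 2, 3]
ALPHABET = ["a", "b"]
DELTA = {(0, "a"): 1, (0, "b"): 2}
for s in (1, 2, 3):
    for a in ALPHABET:
        DELTA[(s, a)] = 3
BLOCKS = hopcroft(STATES, {3}, DELTA, ALPHABET)
```

On this automaton the result has three blocks: {0}, {1, 2}, and {3}.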

Page 25:

Hopcroft’s algorithm example

[Figure: the example DFA from page 5 with its states grouped into blocks P1 and P2; R marks the block being processed.]

PARTITION: {P1, P2}

TO ANALYZE: {P2}

Page 26:

Hopcroft’s algorithm example

[Figure: the same DFA after P1 is split into P11 and P12; R marks the block being processed.]

PARTITION: {P11, P12, P2}

TO ANALYZE: {P2, P12}

Page 27:

Hopcroft’s algorithm example

[Figure: the same DFA with blocks P11, P12 and P2; R marks the block being processed.]

PARTITION: {P11, P12, P2}

TO ANALYZE: {P12}

Page 28:

Hopcroft’s algorithm example

[Figure: the example DFA (top) and the resulting minimized DFA (bottom), in which states 1 and 3, states 2 and 4, and states 5 and 6 are merged.]

Page 29:

Symbolic Hopcroft’s algorithm

P := {F, Q\F}
W := {if |F| < |Q\F| then F else Q\F}
while W ≠ {}
    R := pickFrom(W)
    foreach a in Σ
        S := δ^-1(R, a)
        while ∃ T ∈ P. T∩S ≠ {} ∧ T\S ≠ {}
            P, W := split(P, T∩S, T\S)
return partitioned DFA

Problem with "foreach a in Σ": the alphabet might not be finite

Page 30:

Finitize the alphabet

[Figure: a Venn diagram of three predicates φ1, φ2, φ3 whose regions φ’1 to φ’8 are the minterms.]

Predicates: {x>5, x<10, x=3}

Minterms: {x=3, x≤5 ∧ x≠3, 5<x<10, x≥10}
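Minterm generation can be sketched by repeatedly splitting every region by each predicate and its complement, keeping only the non-empty parts. The sketch below assumes predicates are modeled as sets over a small finite range of integers, which stands in for the real integer theory; `minterms`, `UNIVERSE`, and `PREDICATES` are illustrative names.

```python
def minterms(predicates, universe):
    """Return the non-empty regions of the Venn diagram of the predicates."""
    regions = [frozenset(universe)]
    for phi in predicates:
        refined = []
        for region in regions:
            # Split each region by phi and by its complement.
            for part in (region & phi, region - phi):
                if part:
                    refined.append(part)
        regions = refined
    return regions


UNIVERSE = range(0, 15)  # assumption: finite stand-in for the integers
PREDICATES = [
    frozenset(x for x in UNIVERSE if x > 5),
    frozenset(x for x in UNIVERSE if x < 10),
    frozenset(x for x in UNIVERSE if x == 3),
]
REGIONS = minterms(PREDICATES, UNIVERSE)
```

On this range the three predicates yield four regions, corresponding to x=3, x≤5 with x≠3, 5<x<10, and x≥10. In the worst case the number of regions doubles with every predicate.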

Page 31:

Symbolic Hopcroft’s algorithm

P := {F, Q\F}
W := {if |F| < |Q\F| then F else Q\F}
while W ≠ {}
    R := pickFrom(W)
    foreach φ in Minterms(A)
        S := δ^-1(R, φ)
        while ∃ T ∈ P. T∩S ≠ {} ∧ T\S ≠ {}
            P, W := split(P, T∩S, T\S)
return partitioned DFA

log n iterations: O(2^m n log n + 2^m f(mk))

We need something better

Page 32:

New Algorithm: Intuition

[Figure: states p and q in block A have transitions into R guarded by φ and ψ respectively; A is split into P1 and P2, using φ\ψ.]

What if φ ≠ ψ?

Page 33:

Example 1/2

[Figure: the example automaton as an SFA with guards x<0, x≥0, -2<x<5, -5<x<3, and true; the states are partitioned into F and Q\F, and R marks the block being processed.]

false ≠ -5<x<3

Page 34:

Example 1/2

[Figure: the same SFA at the next step; R marks the block being processed.]

Pages 35-36:

Example 2/2

[Figure: states p and q with guards x<2, x≥2, x<5, x≥5, and true; both p and q have a transition into r, which lies in the block R being processed.]

Both p and q go to r, but…

Is x≥2 equivalent to x≥5? NO

Then p is distinguishable from q
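The check in this example amounts to finding an input on which the two guards disagree. A rough sketch, assuming predicates are Python functions and that scanning a finite sample range is an acceptable stand-in for a solver query; the function name `witness` and the sample range are illustrative.

```python
def witness(phi, psi, samples):
    """Return some x on which phi and psi disagree, or None if none is found."""
    for x in samples:
        if phi(x) != psi(x):
            return x
    return None


def ge2(x):
    return x >= 2


def ge5(x):
    return x >= 5


# Assumption: a small integer range replaces a real satisfiability check.
SAMPLES = range(-10, 11)
```

Here `witness(ge2, ge5, SAMPLES)` returns 2: on input 2, p takes its x≥2 transition into r while q takes its x<5 transition, so that single input can be used to separate p from q without enumerating minterms.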

Page 37:

New Algorithm

P := {F, Q\F}
W := {if |F| < |Q\F| then F else Q\F}
while W ≠ {}
    R := pickFrom(W); S := δ^-1(R, true)
    while ∃ A ∈ P. A∩S ≠ {} ∧ ∃ p1, p2. δ^-1(p1) ≠ δ^-1(p2)
        P, W := split(P, P∩S, P\S, witness(δ^-1(p1) ≠ δ^-1(p2)))
return partitioned DFA

log n iterations: O(n^2 log n f(nk))

Page 38:

Experiments

1. Randomly generated DFAs: SFAs using BDDs (sort = bitvec 7 bits)

2. SFAs generated from regexes: SFAs using BDDs (sort = bitvec 16 bits)

3. A corner case of Minterm generation: SFAs using BDDs (sort = bitvec 20 bits)

4. Randomly generated SFAs over string × int: SFAs using Z3 (sort = string × int)

5. Monadic second order logic to DFA transformation: SFAs using BDDs (sort = bitvec 40 bits)

Page 39:

1) Randomly generated DFAs

5 billion DFAs: 10 to 100 states, 2 to 50 symbols

From [Almeida, Moreira, Reis, TR05]

Page 40:

2) SFAs generated from regexes (regexplib.com)

3000 regexes over UTF16 alphabet (2^16 elems)

From [regexplib.com]

Both axes are log scale

More States => Moore Worse

Page 41:

3) A corner case of Minterm generation

This SFA has 2^k minterms!!

brics.automata.dk: uses intervals instead of BDDs

Log scale

Page 42:

4) Randomly generated SFAs over string × int

Randomly generated 10 SFAs over string × int and minimized all the intersections, complements, differences, and unions of such SFAs

Random generation causes many predicate overlaps, hence many minterms

Page 43:

5) MSO logic to DFA transformation

[IJFCS05]: state of the art for MSO

Page 44:

Conclusion

Results
• Adapted classical minimization algorithms to the symbolic setting
• New minimization algorithm for symbolic automata (faster than previous ones)

Future work
• Extend to tree automata
• Extend classical automata problems to SFAs
  – Edit distance?
  – Regex for symbolic automata?