Minimization of Symbolic Automata Presented By: Loris D’Antoni Joint work with: Margus Veanes...
Minimization of Symbolic Automata
Presented By:
Loris D’Antoni
Joint work with:
Margus Veanes
01/24/14, POPL14
What is automata minimization?
Deterministic Finite Automaton
[Figure: DFA with states q0 and q and transitions labeled a and b]
A = (Q,q0,F,δ,Σ)
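Automata Minimization
Minimization = find and collapse equivalent states
[Figure: p and q are distinguishable when some string s leads one of them to a final state and the other to a non-final state]
[Figure: example DFA with states 0 to 6 over {a, b}; collapsing equivalent states gives the minimal DFA with states 0, {1,3}, {2,4}, {5,6}]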
A simple Application: Random Password generation
Given constraints:
• Length is k (here between 5 and 20): "^.{5,20}$"
• Contains 2 capital letters: "[A-Z].*[A-Z]"
• Contains a digit: "\d"
Generate random instances with uniform distribution that match all the above conditions.
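A minimal sketch of what "match all the above conditions" means, using Python's standard `re` module: a candidate is valid exactly when it lies in the intersection of the three regex languages. The helper name `satisfies_all` is hypothetical. This only checks membership; it does not perform the uniform generation the slide asks for.

```python
import re

# The three constraints from the slide, as Python regexes.
CONSTRAINTS = [r"^.{5,20}$", r"[A-Z].*[A-Z]", r"\d"]

def satisfies_all(candidate: str) -> bool:
    """True when the candidate matches every constraint (hypothetical helper)."""
    return all(re.search(pattern, candidate) is not None for pattern in CONSTRAINTS)

print(satisfies_all("AbcD3f"))   # two capitals, one digit, length 6
print(satisfies_all("abcde1"))   # no capital letters
```

Checking candidates one by one does not scale, which is why the talk builds an automaton for the intersection instead.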
Key idea
[Figure: one automaton per constraint, combined by intersection]
^.{5,20}$ ∩ [A-Z].*[A-Z] ∩ \d
Problems
• Big automaton → Minimization
• Big alphabet (2^16 characters in UTF16) → Symbolic Automata
Symbolic Finite Automaton (SFA)
[Figure: SFA with states q0 and q; transitions are guarded by the predicates λx. x mod 2=0 and λx. x mod 2=1]
A = (Q, q0, F, δ, σ)
Input sort σ: in this case int
Separate theory for the input alphabet (SMT solver)
Symbolic Finite Automata (SFA)
[Figure: SFA with states p and q; transitions are guarded by λx. x mod 2=0 and λx. x mod 2=1]
Execution example
Input:  1 2 5 3
States: p p q p p
p is final: accept the input
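The execution example can be replayed with a small sketch in which guards are plain Python predicates. The transition structure below is an assumption, chosen so that the input 1 2 5 3 yields the state sequence p p q p p shown on the slide; the names `DELTA`, `FINAL`, and `run` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Predicate = Callable[[int], bool]

def is_even(x: int) -> bool:
    return x % 2 == 0

def is_odd(x: int) -> bool:
    return x % 2 == 1

# Assumed transitions (reconstructed from the trace, not given explicitly):
# odd symbols lead to p, even symbols lead to q.
DELTA: Dict[str, List[Tuple[Predicate, str]]] = {
    "p": [(is_odd, "p"), (is_even, "q")],
    "q": [(is_odd, "p"), (is_even, "q")],
}
FINAL = {"p"}

def run(word: List[int], start: str = "p") -> Tuple[List[str], bool]:
    """Return the visited states and whether the word is accepted."""
    state = start
    trace = [state]
    for symbol in word:
        # Deterministic SFA: exactly one outgoing guard holds for each symbol.
        state = next(target for guard, target in DELTA[state] if guard(symbol))
        trace.append(state)
    return trace, state in FINAL

trace, accepted = run([1, 2, 5, 3])
print(trace, accepted)  # ['p', 'p', 'q', 'p', 'p'] True
```

Two transitions per state cover every integer, which is the succinctness argument of the next slide.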
Advantages of Symbolic Automata
• Alphabet is represented symbolically
– UTF16 abstracted using BDDs
– Integer using predicates over integers
• Succinctness
– at most n^2 transitions
– One transition captures many symbols
• BUT: do DFA algorithms generalize to SFAs?
An example: SFA intersection
A1: p1 --φ1--> q1
A2: p2 --φ2--> q2
A1 × A2: (p1,p2) --φ1 ∧ φ2--> (q1,q2); delete the transition when φ1 ∧ φ2 is unsatisfiable
REQUIREMENTS: input theory must be a Boolean algebra, and decidable
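A minimal sketch of the product construction, assuming a toy Boolean algebra in which a predicate is a `frozenset` over a small finite universe: conjunction is set intersection and satisfiability is non-emptiness. A real implementation would delegate both to BDDs or an SMT solver. The dictionary layout and the names `intersect` and `is_sat` are hypothetical.

```python
from itertools import product

def is_sat(pred: frozenset) -> bool:
    """Toy satisfiability check: a predicate is the set of values it accepts."""
    return len(pred) > 0

def intersect(a1: dict, a2: dict) -> dict:
    """Product of two SFAs given as {'init', 'final', 'trans': [(src, guard, dst)]}."""
    init = (a1["init"], a2["init"])
    seen, work, trans = {init}, [init], []
    while work:
        p1, p2 = work.pop()
        for (s1, g1, d1), (s2, g2, d2) in product(a1["trans"], a2["trans"]):
            if s1 != p1 or s2 != p2:
                continue
            guard = g1 & g2              # conjunction of the two guards
            if not is_sat(guard):
                continue                 # delete unsatisfiable transitions
            dst = (d1, d2)
            trans.append(((p1, p2), guard, dst))
            if dst not in seen:
                seen.add(dst)
                work.append(dst)
    final = {s for s in seen if s[0] in a1["final"] and s[1] in a2["final"]}
    return {"init": init, "final": final, "trans": trans, "states": seen}

EVEN = frozenset({0, 2, 4, 6, 8})
LT5 = frozenset(range(5))
NINE = frozenset({9})
A1 = {"init": "p1", "final": {"q1"}, "trans": [("p1", EVEN, "q1")]}
A2 = {"init": "p2", "final": {"q2"},
      "trans": [("p2", LT5, "q2"), ("p2", NINE, "r2")]}
prod = intersect(A1, A2)
print(prod["trans"])
```

The pair (q1, r2) never appears, because the guard "even and equal to 9" is unsatisfiable and its transition is dropped.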
Moore’s algorithm
[Figure: p and q both read a, reaching p’ and q’; if p’ and q’ are distinguishable (by a string s), then p and q are distinguishable]
n^2 iterations over k symbols: O(kn^2)
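The propagation rule can be sketched as a fixed-point loop over state pairs. The small DFA used here is a made-up example, not the one from the slides, and `moore_distinguishable` is a hypothetical name.

```python
from itertools import combinations

def moore_distinguishable(states, alphabet, delta, final):
    """Return the set of distinguishable pairs of a complete DFA."""
    # Base case: a final and a non-final state are distinguishable.
    D = {frozenset((p, q)) for p, q in combinations(states, 2)
         if (p in final) != (q in final)}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in D:
                continue
            for a in alphabet:
                succ = frozenset((delta[p, a], delta[q, a]))
                # Successors distinguishable => p and q distinguishable.
                if len(succ) == 2 and succ in D:
                    D.add(pair)
                    changed = True
                    break
    return D

# Made-up DFA: states 1 and 2 behave identically.
states = [0, 1, 2, 3]
alphabet = ["a", "b"]
delta = {(0, "a"): 1, (0, "b"): 2,
         (1, "a"): 3, (1, "b"): 3,
         (2, "a"): 3, (2, "b"): 3,
         (3, "a"): 3, (3, "b"): 3}
D = moore_distinguishable(states, alphabet, delta, final={3})
print(frozenset((1, 2)) in D)  # False: 1 and 2 can be collapsed
```

Every pair left out of D is a pair of equivalent states that minimization can merge.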
Symbolic Moore’s algorithm
Initially D = (F × (Q\F)) ∪ ((Q\F) × F)
for each (p’,q’) in D, (p,q) not in D
    let φ, ψ be the guards of δ(p,p’), δ(q,q’)
    if (isSat(φ ∧ ψ))
        add (p,q) to D
[Figure: p reaches p’ on φ and q reaches q’ on ψ; if p’ and q’ are distinguishable and φ ∧ ψ is satisfiable, then p and q are distinguishable]
m transitions: O(m^2 f(k))
k = size of biggest predicate in SFA
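A sketch of the symbolic variant, again assuming the toy algebra where a guard is a `frozenset` of accepted values and `is_sat` is non-emptiness (a stand-in for a solver call). The example SFA is made up, and the function name is hypothetical.

```python
from itertools import combinations

def is_sat(pred: frozenset) -> bool:
    return len(pred) > 0

def symbolic_moore(states, trans, final):
    """trans is a list of (src, guard, dst); returns the distinguishable pairs."""
    D = {frozenset((p, q)) for p, q in combinations(states, 2)
         if (p in final) != (q in final)}
    changed = True
    while changed:
        changed = False
        # Iterate over pairs of transitions instead of pairs of symbols.
        for (p, phi, p2), (q, psi, q2) in combinations(trans, 2):
            pair = frozenset((p, q))
            if p == q or pair in D:
                continue
            # Targets distinguishable and some symbol enables both guards.
            if frozenset((p2, q2)) in D and is_sat(phi & psi):
                D.add(pair)
                changed = True
    return D

UNIVERSE = frozenset(range(10))
EVEN = frozenset(x for x in UNIVERSE if x % 2 == 0)
ODD = UNIVERSE - EVEN
trans = [(0, EVEN, 1), (0, ODD, 2),
         (1, UNIVERSE, 3), (2, UNIVERSE, 3), (3, UNIVERSE, 3)]
D = symbolic_moore([0, 1, 2, 3], trans, final={3})
print(frozenset((1, 2)) in D)  # False: 1 and 2 are equivalent
```

The loop never enumerates the alphabet; it only asks whether φ ∧ ψ is satisfiable, which is where the f(k) factor comes from.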
Sometimes Moore is Less
From: Rani Abdellatif
Sent: Tuesday, November 13, 2012 12:55 PM
To: Margus Veanes
Cc: Patrick McFalls
Subject: RE: Password generation help

Margus, I tested the perf of the sample you sent me with password lengths from 8 to 15 chars and here are the results:

Chars   Time (ms)
8       171
9       406
10      1061
11      2044
12      3698
13      6271
14      11591
15      18362

This time is the time it takes to run sfa.Determinize(rex.Solver).Minimize(rex.Solver). The time required to create the SFA or generate samples once it’s created is quite small in comparison. We are expecting 15 characters to be on the shorter end of password we’ll generate, going up to 128 characters.

18 sec for 15 characters!
the culprit
should scale up to 128 characters!
Hopcroft’s algorithm: intuition
[Figure: initial partition into F and Q\F]
[Figure: block A is split by S, the set of states with an a-transition into block R]
Keep partitioning with respect to W for every input symbol
[Figure: blocks P1, P2, P3, P4 split with respect to R on symbol b]
[Figure: block R is split into P1 and P2; a-transitions lead into R]
Let’s assume I already split according to R
Do I need to consider both P1 and P2 for future splitting?
NO, I ONLY NEED ONE!
Hopcroft’s algorithm
P := {F, Q\F}
W := {if |F| < |Q\F| then F else Q\F}
while W ≠ {}
    R := pickFrom(W)
    foreach a in Σ
        S := δ^-1(R, a)
        while ∃ T ∈ P. T∩S ≠ {} ∧ T\S ≠ {}
            P, W := split(P, T∩S, T\S)
return partitioned DFA
log n iterations: O(kn log n)
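The pseudocode can be turned into a short runnable sketch. This is a straightforward set-based version for a complete DFA and does not use the data structures needed to actually reach the O(kn log n) bound; the example DFA is made up and the function name is hypothetical.

```python
def hopcroft(states, alphabet, delta, final):
    """Return the partition of a complete DFA into equivalence blocks."""
    final = frozenset(final)
    nonfinal = frozenset(states) - final
    P = {block for block in (final, nonfinal) if block}
    W = {min(P, key=len)}                 # start with the smaller block
    while W:
        R = W.pop()
        for a in alphabet:
            # S = states that reach R on symbol a
            S = {s for s in states if delta[s, a] in R}
            for T in list(P):
                inter, diff = T & S, T - S
                if not inter or not diff:
                    continue              # S does not split T
                P.remove(T)
                P |= {inter, diff}
                if T in W:
                    W.remove(T)
                    W |= {inter, diff}
                else:
                    # Only the smaller half is needed as a future splitter.
                    W.add(min(inter, diff, key=len))
    return P

states = [0, 1, 2, 3]
alphabet = ["a", "b"]
delta = {(0, "a"): 1, (0, "b"): 2,
         (1, "a"): 3, (1, "b"): 3,
         (2, "a"): 3, (2, "b"): 3,
         (3, "a"): 3, (3, "b"): 3}
P = hopcroft(states, alphabet, delta, final={3})
print(sorted(sorted(block) for block in P))  # [[0], [1, 2], [3]]
```

Adding only the smaller half to W is the "I only need one" observation from the intuition slides.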
Hopcroft’s algorithm example
[Figure: the example DFA with states 0 to 6 over {a, b}, partitioned step by step; R marks the block used as splitter]
Step 1: PARTITION: {P1, P2}          TO ANALYZE: {P2}
Step 2: PARTITION: {P11, P12, P2}    TO ANALYZE: {P2, P12}
Step 3: PARTITION: {P11, P12, P2}    TO ANALYZE: {P12}
Result: minimal DFA with states 0, {1,3}, {2,4}, {5,6}
Symbolic Hopcroft’s algorithm
P := {F, Q\F}
W := {if |F| < |Q\F| then F else Q\F}
while W ≠ {}
    R := pickFrom(W)
    foreach a in Σ
        S := δ^-1(R, a)
        while ∃ T ∈ P. T∩S ≠ {} ∧ T\S ≠ {}
            P, W := split(P, T∩S, T\S)
return partitioned DFA
Problem: the alphabet Σ might not be finite
Finitize the alphabet
[Figure: overlapping predicates φ1, φ2, φ3 are refined into pairwise disjoint minterms φ’1, ..., φ’8]
Predicates: {x>5, x<10, x=3}
Minterms: {x=3, x≤5 ∧ x≠3, 5<x<10, x≥10}
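Minterm generation can be sketched by refining the universe with one predicate at a time and dropping unsatisfiable regions. The sketch assumes predicates are `frozenset`s over a small finite universe (here the integers 0 to 14) as a stand-in for a symbolic algebra; `minterms` is a hypothetical name.

```python
def minterms(predicates, universe):
    """Split the universe into the satisfiable Boolean combinations of the predicates."""
    regions = [frozenset(universe)]
    for phi in predicates:
        refined = []
        for region in regions:
            # Each region splits into "satisfies phi" and "does not satisfy phi".
            for part in (region & phi, region - phi):
                if part:                  # keep only satisfiable combinations
                    refined.append(part)
        regions = refined
    return regions

UNIVERSE = range(15)
GT5 = frozenset(x for x in UNIVERSE if x > 5)
LT10 = frozenset(x for x in UNIVERSE if x < 10)
EQ3 = frozenset({3})

result = minterms([GT5, LT10, EQ3], UNIVERSE)
for m in result:
    print(sorted(m))
```

Three predicates could give up to 2^3 = 8 regions, but only four are satisfiable here, matching the four minterms listed on the slide.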
Symbolic Hopcroft’s algorithm
P := {F, Q\F}
W := {if |F| < |Q\F| then F else Q\F}
while W ≠ {}
    R := pickFrom(W)
    foreach φ in Minterms(A)
        S := δ^-1(R, φ)
        while ∃ T ∈ P. T∩S ≠ {} ∧ T\S ≠ {}
            P, W := split(P, T∩S, T\S)
return partitioned DFA
log n iterations: O(2^m n log n + 2^m f(mk))
We need something better
New Algorithm: Intuition
[Figure: states p and q in block A reach R with guards φ and ψ; A is split into P1 and P2]
What if φ ≠ ψ?
φ\ψ
Example 1/2
[Figure: SFA with states 0 to 6 and guards x<0, x≥0, -2<x<5, -5<x<3, true; initial partition into F and Q\F, with splitter R]
false ≠ -5<x<3
Example 2/2
[Figure: states p and q with outgoing guards x<2, x≥2 and x<5, x≥5; state r in splitter R]
Both p and q go to r, but…
Is x≥2 equivalent to x≥5? NO
Then p is distinguishable from q
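The check in this example can be sketched as a search for a witness: a value that satisfies exactly one of the two guards. The brute-force scan over a small integer range is an assumption standing in for a solver query on (φ ∧ ¬ψ) ∨ (¬φ ∧ ψ); the name `witness` is hypothetical.

```python
def witness(phi, psi, candidates):
    """Return a value satisfying exactly one of phi and psi, or None if none is found."""
    for x in candidates:
        if phi(x) != psi(x):
            return x
    return None

def guard_p(x):
    return x >= 2   # guard from p into r

def guard_q(x):
    return x >= 5   # guard from q into r

w = witness(guard_p, guard_q, range(-10, 11))
print(w)  # 2: enables the guard of p but not the guard of q
```

On input 2, p moves into R while q does not, so the single symbol 2 is enough to separate p from q without computing any minterms.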
New Algorithm
P := {F, Q\F}
W := {if |F| < |Q\F| then F else Q\F}
while W ≠ {}
    R := pickFrom(W); S := δ^-1(R, true)
    while ∃ A ∈ P. A∩S ≠ {} ∧ ∃ p1, p2. δ^-1(p1) ≠ δ^-1(p2)
        P, W := split(P, A∩S, A\S, witness(δ^-1(p1) ≠ δ^-1(p2)))
return partitioned DFA
log n iterations: O(n^2 log n f(nk))
Experiments
1. Randomly generated DFAs
   – SFAs using BDDs (sort = bitvec 7 bits)
2. SFAs generated from regexes
   – SFAs using BDDs (sort = bitvec 16 bits)
3. A corner case of Minterm generation
   – SFAs using BDDs (sort = bitvec 20 bits)
4. Randomly generated SFAs over string x int
   – SFAs using Z3 (sort = string x int)
5. Monadic second order logic to DFA transformation
   – SFAs using BDDs (sort = bitvec 40 bits)
1) Randomly generated DFAs
5 billion DFAs: 10 to 100 states, 2 to 50 symbols
From [Almeida, Moreira, Reis, TR05]
2) SFAs generated from regexes (regexplib.com)
3000 regexes over UTF16 alphabet (2^16 elems)
From [regexplib.com]
Both axes log scale
More states => Moore worse
3) A corner case of Minterm generation
This SFA has 2^k minterms!!
brics.automata.dk: uses intervals instead of BDDs
Log scale
4) Randomly generated SFAs over string x int
Randomly generated 10 SFAs over string x int and minimized all the intersections, complements, differences, and unions of such SFAs
Random generation causes many predicate overlaps, hence many minterms
5) MSO logic to DFA transformation
[IJFCS05]: state of the art for MSO
Conclusion
Results
• Adapted classical minimization algorithms to the symbolic setting
• New minimization algorithm for symbolic automata (faster than previous ones)
Future work
• Extend to tree automata
• Extend classical automata problems to SFAs
– Edit distance?
– Regex for symbolic automata?