Approximate Satisfiability and Equivalence
Michel de Rougemont
University Paris II & LRI
Joint work with E. Fischer (Technion) and F. Magniez (LRI)
LICS 2006
Plan
1. Tester on a class K: approximation of decision problems
• Equality between two strings (trees): (ε², ε)-tolerant tester, additive approximation of the Edit Distance with Moves.
• Membership: is w in L?
3. Equivalence tester between two regular properties: polynomial-time algorithm (exact equivalence is PSPACE-complete)
4. Generalizations: regular trees, context-free languages, infinite words
5. Current research: probabilistic systems
Motivations
• Decisions on noisy inputs (distance to a language)
• Model Checking: can we approximate hard problems? Bounded MC, Abstraction, ...
• Black-Box Checking: does B satisfy P?
[Figure: a noisy input AGCTAGGA.....ACT and a language L; a black box B with input x and output y]
Learn B'
Decide B' ≈ε B and B' ⊆ε P
1. Testers on a class K
Let F be a property on a class K of structures U.
An ε-tester for F is a probabilistic algorithm A such that:
• If U |= F, A accepts
• If U is ε-far from F, A rejects with high probability
F is testable if there is a probabilistic algorithm A such that
• A is an ε-tester for all ε
• Time(A) is independent of n = size(U).
Robust characterizations of polynomials, R. Rubinfeld, M. Sudan, 1994.
Property Testing and its connection to Learning and Approximation, O. Goldreich, S. Goldwasser, D. Ron, 1996.
A tester usually implies a linear-time corrector. (ε1, ε2)-Tolerant Tester.
Approximate Satisfiability and Equivalence
1. Satisfiability: T |= F
2. Approximate Satisfiability: T |=ε F
3. Approximate Equivalence: F ≡ε G
Image on a class K of trees
[Figure: inside K, the regions F, ¬F and "ε-far from F"; a second property G with F ≡ε G]
Edit Distances with Moves
1. Classical Edit Distance: Insertions, Deletions, Modifications
2. Edit Distance with Moves: dist(w,w')
0111000011110011001
0111011110000011001
3. Edit Distance with Moves generalizes to Ordered Trees
$\mathrm{dist}(W, L) = \min_{W' \in L} \{ \mathrm{dist}(W, W') \}$
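A small check of the example above, as a sketch: the helper `move` and its index convention are assumptions made for illustration, not part of the talk. It shows that the second word is obtained from the first by a single move, while the classical edit distance (computed by the usual dynamic program) needs at least two operations.

```python
def move(w: str, i: int, j: int, t: int) -> str:
    """Cut the block w[i:j] and re-insert it at index t of the remaining word."""
    block, rest = w[i:j], w[:i] + w[j:]
    return rest[:t] + block + rest[t:]


def classical_edit_distance(a: str, b: str) -> int:
    """Insertions, deletions and modifications only (no moves)."""
    prev = list(range(len(b) + 1))
    for x, ca in enumerate(a, start=1):
        cur = [x]
        for y, cb in enumerate(b, start=1):
            cur.append(min(prev[y] + 1,                # delete ca
                           cur[y - 1] + 1,             # insert cb
                           prev[y - 1] + (ca != cb)))  # keep or modify
        prev = cur
    return prev[-1]


w1 = "0111000011110011001"
w2 = "0111011110000011001"
one_move = move(w1, 8, 12, 5)   # move the block "1111" in front of "000"
print(one_move == w2, classical_edit_distance(w1, w2))
```

The two words have the same length and differ in several positions, so no single insertion, deletion or modification can relate them; one move suffices.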
Tester for equality: Block and Uniform statistics
W = 001010101110, of length n.
b.stat: consecutive subwords of length k, n/k blocks.
u.stat: any subwords of length k, n-k+1 blocks (shingles).
$\mathrm{b.stat}(W) = \frac{1}{n/k} \begin{pmatrix} n_1 \\ n_2 \\ \vdots \\ n_{2^k} \end{pmatrix}$, with $n_1$ = # of "00...0", $n_2$ = # of "00...1", ..., $n_{2^k}$ = # of "11...1"
For k=2, n/k=6:
$\mathrm{b.stat}(W) = \frac{1}{6} \begin{pmatrix} 1 \\ 0 \\ 4 \\ 1 \end{pmatrix}$, $\mathrm{u.stat}(W) = \frac{1}{11} \begin{pmatrix} 1 \\ 4 \\ 4 \\ 2 \end{pmatrix}$
Main study: $|\mathrm{u.stat}(W) - \mathrm{u.stat}(W')|_1$, with $k = 1/\varepsilon$
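The two statistics can be computed directly, as in the following sketch. The function names and the lexicographic ordering of coordinates (00, 01, 10, 11 for k=2) are choices made here for illustration; a trailing partial block is ignored when n is not a multiple of k.

```python
from fractions import Fraction
from itertools import product


def _frequencies(subwords, k, alphabet):
    # One coordinate per word of length k, in lexicographic order.
    keys = ["".join(p) for p in product(alphabet, repeat=k)]
    return [Fraction(subwords.count(key), len(subwords)) for key in keys]


def b_stat(w: str, k: int, alphabet: str = "01") -> list:
    """Block statistics: frequencies of the n/k consecutive blocks of length k."""
    blocks = [w[i:i + k] for i in range(0, len(w) - k + 1, k)]
    return _frequencies(blocks, k, alphabet)


def u_stat(w: str, k: int, alphabet: str = "01") -> list:
    """Uniform statistics: frequencies of all n-k+1 subwords of length k."""
    shingles = [w[i:i + k] for i in range(len(w) - k + 1)]
    return _frequencies(shingles, k, alphabet)


W = "001010101110"
print(b_stat(W, 2))   # counts (1, 0, 4, 1) over 6 blocks
print(u_stat(W, 2))   # counts (1, 4, 4, 2) over 11 subwords
```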
Tester for equality
Edit distance with moves. NP-complete problem, but approximable in constant time with additive error.
Uniform statistics ($k = 1/\varepsilon$): W=001010101110, $\mathrm{u.stat}(W) = \frac{1}{11}(1,4,4,2)^T$
Theorem 1. |u.stat(w)-u.stat(w')| approximates dist(w,w')/n.
Sample N subwords of length k, compute Y(w) and Y(w'):
$Y(w) = \frac{1}{N}\sum_{i=1,\dots,N} X_i$, $Y(w') = \frac{1}{N}\sum_{i=1,\dots,N} X_i$, where $X_i = (0,\dots,0,1,0,\dots,0)^T$ is the unit vector of the i-th sampled subword (drawn from w, respectively from w')
Lemma (Chernoff). Y(w) approximates u.stat(w).
Corollary. |Y(w)-Y(w')| approximates dist(w,w')/n.
Tester 1: If |Y(w)-Y(w')| < ε, accept, else reject.
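Theorem 1: Soundness and Robustness
Soundness: ε-close strings have close statistics ($\mathrm{dist}(w,w') \le \varepsilon n$).
Robustness: ε-far strings have far statistics ($\mathrm{dist}(w,w') \ge \varepsilon n$).
We prove:
1. b.stat is robust: $\mathrm{dist}(w,w') \le \left(\tfrac{1}{2}\,|\mathrm{b.stat}(w) - \mathrm{b.stat}(w')| + \varepsilon\right) n$
2. u.stat is sound: $\mathrm{dist}(w,w') \le \varepsilon^2 n \Rightarrow |\mathrm{u.stat}(w) - \mathrm{u.stat}(w')| \le 6\varepsilon$
3. u.stat is robust (harder)
$d\,\varepsilon\,|\mathrm{u.stat}(w) - \mathrm{u.stat}(w')| \le \frac{\mathrm{dist}(w,w')}{n} \le c\,|\mathrm{u.stat}(w) - \mathrm{u.stat}(w')|$ (hard)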
Robustness of b.stat
Robustness of b-stat: $\mathrm{dist}(w,w') \le \left(\tfrac{1}{2}\,|\mathrm{b.stat}(w) - \mathrm{b.stat}(w')| + \varepsilon\right) n$
If $\mathrm{b.stat}(w) = \mathrm{b.stat}(w')$ then $\mathrm{dist}(w,w') \le \varepsilon n$.
If $\mathrm{b.stat}(w) \ne \mathrm{b.stat}(w')$ then construct w'' s.t. $\mathrm{b.stat}(w'') = \mathrm{b.stat}(w')$ after at most $\frac{n}{2}\,|\mathrm{b.stat}(w) - \mathrm{b.stat}(w')|$ substitutions on w.
Example:
$\mathrm{b.stat}(W) = \frac{1}{6}(1,0,4,1)^T$, $\mathrm{b.stat}(W') = \frac{1}{6}(2,0,3,1)^T$
# "00" = 1 in W and 2 in W', but # "10" = 4 in W and 3 in W'
W'': take one block "10" in W and change it into "00".
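Soundness of u.stat
Soundness of u-stat: $\mathrm{dist}(w,w') \le \varepsilon^2 n \Rightarrow |\mathrm{u.stat}(w) - \mathrm{u.stat}(w')| \le 6\varepsilon$
Simple edit: $|\mathrm{u.stat}(w) - \mathrm{u.stat}(w')| \le \frac{2k}{n-k+1} \le \frac{2}{\varepsilon n}$
Move w=A.B.C.D, w'=A.C.B.D: $|\mathrm{u.stat}(w) - \mathrm{u.stat}(w')| \le \frac{2 \cdot 3(k-1)}{n-k+1} \le \frac{6}{\varepsilon n}$
Hence, for $\varepsilon^2 n$ operations, $|\mathrm{u.stat}(w) - \mathrm{u.stat}(w')| \le 6\varepsilon$
Remark: b.stat is not sound.
Problem: robustness of u.stat? Harder! We need an auxiliary distribution and two key lemmas.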
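A numeric check of the bound for one move, as a sketch. It reuses the two words of the Edit Distance with Moves example; the choice k=3 and the helper names are assumptions for illustration. A move changes three block boundaries and each boundary touches at most k-1 subwords, which gives the bound 2·3(k-1)/(n-k+1) on the L1 gap.

```python
from collections import Counter
from fractions import Fraction


def u_stat(w: str, k: int) -> Counter:
    """Frequencies of all n-k+1 subwords of length k."""
    n_sub = len(w) - k + 1
    counts = Counter(w[i:i + k] for i in range(n_sub))
    return Counter({u: Fraction(c, n_sub) for u, c in counts.items()})


def l1(p: Counter, q: Counter) -> Fraction:
    return sum((abs(p[u] - q[u]) for u in set(p) | set(q)), Fraction(0))


w1 = "0111000011110011001"
w2 = "0111011110000011001"   # w1 after one move of the block "1111"
k = 3
n = len(w1)
gap = l1(u_stat(w1, k), u_stat(w2, k))
move_bound = Fraction(2 * 3 * (k - 1), n - k + 1)
print(gap, "<=", move_bound)
```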
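Statistics on words
[Figure: a word cut into blocks, with labels k, k, K, t, k-t and pieces v_1, ..., v_i]
Block statistics: b.stat
Uniform statistics: u.stat
Block Uniform statistics: bu.stat
$k = 1/\varepsilon$
$X_1 = \mathrm{b.stat}(v_1)$, ..., $X_i = \mathrm{b.stat}(v_i)$
$\mathrm{bu.stat}(w) = \frac{K}{n} \sum_{i=1,\dots,n/K} E_t(\mathrm{b.stat}(v_i)) = E_{t,i}(\mathrm{b.stat}(v_i))$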
. 2kcK=
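Uniform Statistics
# subwords of length k missed by bu: $(k-1)\,(n/K - 1) = |B - A|$
Lemma: Let $A \subseteq B$ and $\mu_A$, $\mu_B$ two uniform distributions on A, B. Then $|\mu_A - \mu_B| = 2\,\frac{|B - A|}{|B|}$.
Apply the previous lemma with $|B| = n-k+1$, $K \approx \varepsilon^3 n / |\Sigma|^{2/\varepsilon}$:
$|\mathrm{bu.stat}(w) - \mathrm{u.stat}(w)| = O\!\left(\frac{|\Sigma|^{2/\varepsilon}}{\varepsilon^4 n}\right)$
Lemma 2: $\forall w\ \ |\mathrm{bu.stat}(w) - \mathrm{u.stat}(w)| \le \frac{|\Sigma|^{2/\varepsilon}}{\varepsilon^4 n}$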
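Block Uniform Statistics
$\mathrm{bu.stat}(w) = \frac{K}{n} \sum_{i=1,\dots,n/K} E_t(\mathrm{b.stat}(v_i)) = E_{t,i}(\mathrm{b.stat}(v_i))$
$X_i = \mathrm{b.stat}(v_i)$, $X_i[u] = \mathrm{b.stat}(v_i)[u]$, $0 \le X_i[u] \le 1$
Each $X_i[u]$ is independent. Average on i is bu.stat(w)[u].
Chernoff Bound: $\Pr[\,|\mathrm{b.stat}(v)[u] - \mathrm{bu.stat}(w)[u]| \ge t \times \mathrm{bu.stat}(w)[u]\,] \le 2e^{-t^2 n/(8K)}$
Union Bound: $\Pr[\,|\mathrm{b.stat}(v) - \mathrm{bu.stat}(w)| \ge t \times \mathrm{bu.stat}(w)\,] \le 2\,|\Sigma|^k\,e^{-t^2 n/(8K)}$
For n large enough and $t = \frac{\varepsilon}{2|\Sigma|^k}$: $\Rightarrow \Pr[\,|\mathrm{b.stat}(v) - \mathrm{bu.stat}(w)| \le \varepsilon/2\,] > 0$
Lemma 1: $\forall w\ \exists v\ \ |\mathrm{bu.stat}(w) - \mathrm{b.stat}(v)| \le \varepsilon/2$ and $\mathrm{dist}(v,w) \le c\,\varepsilon$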
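Robustness of the uniform Statistics
$\mathrm{dist}(w,w') \ge 5\varepsilon n \Rightarrow |\mathrm{u.stat}(w) - \mathrm{u.stat}(w')| \ge 6.5\,\varepsilon$
Lemma 1: $\forall w\ \exists v\ \ |\mathrm{bu.stat}(w) - \mathrm{b.stat}(v)| \le \varepsilon/2$
Lemma 2: $\forall w\ \ |\mathrm{bu.stat}(w) - \mathrm{u.stat}(w)| \le \frac{|\Sigma|^{2/\varepsilon}}{\varepsilon^4 n}$
Get v, v' close to w, w'.
Robustness of b-stat implies robustness of u-stat.
Robustness of bstat: $\left(\tfrac{1}{2}\,|\mathrm{b.stat}(w) - \mathrm{b.stat}(w')| + \varepsilon\right) n \ge \mathrm{dist}(w,w') \ge 5\varepsilon n$
Theorem: for two words w and w' large enough, the tester:
1. Accepts if w=w' with probability 1
2. Accepts if w,w' are ε²-close with probability 2/3
3. Rejects if w,w' are ε-far with probability 2/3
Tolerant tester: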
.5)'( )( ifAccept ),O(cN εε ≤−= wYwY
(Probabilistic method)
3. Testers for Membership and Equivalence
1. Membership: decide if $w \in L$ or w is ε-far from L
2. Inclusion and Equivalence
$L_1 \subseteq_\varepsilon L_2$ if $\forall v$ (except finitely many) $v \in L_1 \Rightarrow v$ is ε-close to $L_2$
$L_1 \equiv_\varepsilon L_2$ if $L_1 \subseteq_\varepsilon L_2$ and $L_2 \subseteq_\varepsilon L_1$
Equivalence tester: $r_1, r_2$ finite representations of $L_1, L_2$
If $L_1 = L_2$ then $A(r_1, r_2)$ accepts
If $\neg(L_1 \equiv_\varepsilon L_2)$ then $A(r_1, r_2)$ rejects with probability $\ge 2/3$
Automata for Regular languages
A: automaton with m states on Σ, $A^k$ automaton with m states on $\Sigma^k$.
Basic property: $w \in L \Rightarrow$ w is ε-close to $u.v_1.\dots.v_l$, where $|u|, |v_i| \le m$ and $\{v_1, \dots, v_l\}$ is a multi-set of $A^k$-compatible loops
Let $H = \bigcup_{t \ge 0,\ v_1,\dots,v_t} \text{Convex-Hull}(\mathrm{b.stat}(v_1), \dots, \mathrm{b.stat}(v_t))$
Caratheodory's theorem: in dimension d, the convex hull of N points can be decomposed into the union of convex hulls of d+1 points.
Large loops can be decomposed. Small loops (less than m=|A|) suffice.
Proposition: $H = \bigcup_{t = |\Sigma|^k + 1,\ v_1,\dots,v_t,\ |v_i| \le m} \text{Convex-Hull}(\mathrm{b.stat}(v_1), \dots, \mathrm{b.stat}(v_t))$
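Approximate Parikh mapping
Lemma: If $w \in L$, then $\mathrm{b.stat}(w) \in H$
Lemma: For every X in H, w of size n s. t. $|X - \mathrm{b.stat}(w)| \le \delta$:
$\mathrm{dist}(w, L) \le \left(\frac{\delta}{2} + \varepsilon + O\!\left(\frac{m\,|\Sigma|^{1/\varepsilon}}{\varepsilon\, n}\right)\right) n$
[Figure: a point X of H next to b-stat(w) for a word w]
H is a fair representation of L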
Example
$r = (010)^*0^*1^* + (01)^*(10)^*1^*$, k=2, ε=0.5
H={stat(w) : w in r } is a union of polytopes.
2 Polytopes for r.
[Figure: H with the point Y(w) and the vertices (1/3, 1/3, 1/3, 0), (1, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 1, 0, 0)]
Membership Tester:
Compute Y(w). Accept if $\mathrm{dist}(Y(w), H) \le \varepsilon$, else reject.
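The membership test on this example can be sketched as below. Several points are assumptions made for illustration: the coordinates are ordered (00, 01, 10, 11), the five vertices are grouped into two polytopes following the two terms of r, the distance is the L1 distance, and the exact distance to a convex hull is replaced by a search over a grid of convex combinations.

```python
from itertools import product

# Vertices read off the example, coordinates ordered as (00, 01, 10, 11).
# Grouping them into two polytopes is an assumption of this sketch.
H1 = [(1/3, 1/3, 1/3, 0), (1, 0, 0, 0), (0, 0, 0, 1)]   # (010)*0*1*
H2 = [(0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]         # (01)*(10)*1*


def grid_points(vertices, steps):
    """Convex combinations of the vertices with weights in {0, 1/steps, ..., 1}."""
    for ws in product(range(steps + 1), repeat=len(vertices) - 1):
        last = steps - sum(ws)
        if last < 0:
            continue
        weights = [x / steps for x in ws] + [last / steps]
        yield tuple(sum(wt * v[c] for wt, v in zip(weights, vertices))
                    for c in range(len(vertices[0])))


def dist_to_H(y, polytopes, steps=20):
    """Approximate L1 distance from y to the union of the polytopes."""
    return min(sum(abs(a - b) for a, b in zip(y, p))
               for poly in polytopes for p in grid_points(poly, steps))


def membership_tester(y, eps=0.5):
    """Accept iff the (approximate) distance from y to H is at most eps."""
    return dist_to_H(y, [H1, H2]) <= eps


print(membership_tester((0, 0.5, 0.5, 0)))               # statistics of (01)^n (10)^n
print(membership_tester((0.5, 0.5, 0, 0), eps=0.25))     # half "00", half "01"
```

The grid search only over-estimates the distance, so a rejection at a given ε can come from the coarseness of the grid; a finer `steps` reduces that effect.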
Construction of H in polynomial time
Enumeration of all m loops: the number of b-stat of words of length m on $\Sigma^k$ is less than $m^{|\Sigma|^k}$. Some loops have the same b-stat: ABBA and BBAA.
Construct H by matrix iteration:
$P_t = \big( \{ \mathrm{b\text{-}stat}(i \to_t j) \} \big)$, $P_{t+1} = P_t \circ P_1$
$i \to_t j$: word of length t between i and j, for t = 1,...,m
Consider convex hulls of $|\Sigma|^k + 1$ b.stat of $A^k$-compatible loops of size less than m.
We discretize with step $\varepsilon / |\Sigma|^k$ and then $H_\varepsilon$ has size $2^{|\Sigma|^{O(k)}}$.
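The matrix iteration can be sketched on a toy automaton. The entries of P_t are sets of block-count vectors, the composition adds the vectors along a path, and loops are read on the diagonal. The automaton, the names and the representation by Python sets are assumptions made for this illustration.

```python
def compose(P, Q):
    """(P o Q)[i][j] = { u + v : u in P[i][s], v in Q[s][j] } on block-count vectors."""
    m = len(P)
    return [[{tuple(a + b for a, b in zip(u, v))
              for s in range(m) for u in P[i][s] for v in Q[s][j]}
             for j in range(m)] for i in range(m)]


def loop_stats(P1, m):
    """b.stat of all loops of length t = 1..m, found on the diagonals of P_t."""
    stats, Pt = set(), P1
    for t in range(1, m + 1):
        for i in range(m):
            stats |= {tuple(c / t for c in u) for u in Pt[i][i]}
        Pt = compose(Pt, P1)
    return stats


# Hypothetical 2-state automaton over the blocks (00, 01, 10, 11):
# state 0 loops on "01", state 0 goes to state 1 on "11", state 1 loops on "11".
E01, E11 = (0, 1, 0, 0), (0, 0, 0, 1)
P1 = [[{E01}, {E11}],
      [set(), {E11}]]

print(loop_stats(P1, 2))
```

Storing count vectors rather than words is what keeps the sets small: loops such as ABBA and BBAA collapse to the same entry.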
Membership tester
Membership Tester for w in L (regular):
1. Construction of the tester: Precompute $H_\varepsilon$
2. Tester: Compute Y(w) (approx. b.stat(w)). Accept iff Y(w) is at distance less than ε to $H_\varepsilon$
Construction: Time is $m^{|\Sigma|^{O(k)}}$
Tester: query complexity in $|\Sigma|^{O(k)}$, time complexity in $2^{|\Sigma|^{O(k)}}$
Remark 1: Time complexity of previous testers was exponential in m.
Remark 2: The same method works for L context-free.
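Equivalence Tester for regular properties
Tester for inclusion $r_1 \subseteq r_2$: $H_{1,\varepsilon} \subseteq H_{2,\varepsilon}$ ?
Equivalence Tester for $r_1 \equiv_\varepsilon r_2$: $H_{1,\varepsilon} \subseteq H_{2,\varepsilon}$ and $H_{2,\varepsilon} \subseteq H_{1,\varepsilon}$ ?
[Figure: the two discretized sets $H_{1,\varepsilon}$ and $H_{2,\varepsilon}$]
Time polynomial in m=Max(|A|, |B|): $m^{|\Sigma|^{O(k)}}$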
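Once both sets are discretized into finitely many points, the two inclusion checks reduce to comparisons between finite sets. The sketch below assumes each set is given as a list of points and allows a tolerance `eps` in L1 distance; this tolerance and the function names are assumptions of the illustration (with `eps = 0` it is plain set inclusion).

```python
def l1(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def included(H1, H2, eps):
    """Every point of the finite set H1 is within L1 distance eps of a point of H2."""
    return all(any(l1(p, q) <= eps for q in H2) for p in H1)


def approx_equivalent(H1, H2, eps):
    """Inclusion in both directions."""
    return included(H1, H2, eps) and included(H2, H1, eps)


Ha = [(1.0, 0.0), (0.5, 0.5)]
Hb = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
print(included(Ha, Hb, 0.1), included(Hb, Ha, 0.1), approx_equivalent(Ha, Hb, 0.1))
```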
3. Generalization: Trees
(1,(1,(1,.),1),.)=c
(1,.)=c
T: Ordered (extended) Tree of rank 2
T': skeleton
W: word with labels. Apply u.stat on W and define u.stat(T).
Infinite words
Buchi Automata. Distance on infinite words: two words w, w' are ε-close if $\limsup_{n \to \infty} \mathrm{dist}(w(n), w'(n)) \le \varepsilon$
A word w is ε-close to a language L if there exists w' in L s. t. w and w' are ε-close.
Statistics: set of accumulation points of $\mathrm{b.stat}(w(n))$
H: compatible loops of connected components of accepting states
Tester for Buchi Automata:
• Compute HA and HB
• Reject if HA and HB are different.
Approximate Model-Checking
Other Logics
Equivalence of Context-free grammars is undecidable; Approximate Equivalence is decidable in exponential time.
Consider formulas in different Logics (LFP, μ-calculus, ...). Can Equivalence and Implication be approximated on a definable class K with a distance?
Definability and approximation: are first-order definable classes of trees testable with the Edit distance?
4. Probabilistic Systems
Probabilistic Automata: $M_a$ is a stochastic matrix for letter a. If $w = a_1 a_2 \dots a_n$ then $M_w = M_{a_1} \dots M_{a_n}$
PM, Probabilistic Membership: Is $u^t.M_w.v > \lambda$ ?
APM, Approximate Probabilistic Membership: Let P be the property $u^t.M_w.v > \lambda$
• Decide if w satisfies P or if w is ε-far from P.
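The exact quantity behind Probabilistic Membership is a product of matrices applied to a row vector. The sketch below uses a hypothetical 2-state automaton; the matrices, the vectors u and v, and the function names are illustrative assumptions.

```python
def vec_mat(row, M):
    """Row vector times matrix."""
    return [sum(row[i] * M[i][j] for i in range(len(row))) for j in range(len(M[0]))]


def acceptance_probability(u, matrices, w, v):
    """u^t . M_w . v with M_w = M_{a1} ... M_{an}, computed left to right."""
    row = list(u)
    for a in w:
        row = vec_mat(row, matrices[a])
    return sum(x * y for x, y in zip(row, v))


# Hypothetical automaton: start in state 0, accept in state 1.
M = {"a": [[0.5, 0.5], [0.0, 1.0]],
     "b": [[1.0, 0.0], [0.0, 1.0]]}
u, v = [1.0, 0.0], [0.0, 1.0]

p = acceptance_probability(u, M, "aba", v)
print(p, p > 0.6)   # PM with threshold lambda = 0.6
```

This exact computation reads the whole word; the approximate version asks instead whether w satisfies the property or is ε-far from it.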
Approximations for Probabilistic Automata
1. Approximate probabilities
• Introduce ε around λ
• Approximate membership
• Approximate Equivalence (Tzeng 92) is harder than Equivalence.
2. Approximate distances between states
• Generalization of bisimulation
• Desharnais et al., Van Breugel-Worrell
3. Our approach: Approximation on the input
http://www.lri.fr/~mdr/verap
Basic Decompositions in H
[Figure: polytopes H1 and H2 with vertices s1=abc, s2=ba, s3=bc, s4=aa, s3=ccc]
W=aabaaaaababcabcabcabcabcabcbc close to W'=(aa)^3(ba)^2(abc)^6
N samples approximate ustat(W), close to: λ1.ustat((aa)*) + λ2.ustat((ba)*) + λ3.ustat((abc)*)
Basic Decompositions in H1
[Figure: H1 with vertices s1=abc, s2=ab, s3, s4=aa]
For each vertex s (a basic loop in A), let h(s,n) = probability to follow s after n iterations of s.
Analyze all loops that are multiples of s: $h(s,n) = r^n$ for n large enough.
Analyze all possible decompositions of ustat(w) in H.
Claim
Hypothesis: all simple loops are distinct.
Input W of length n
Claim: Upper bound for $u^t.M_{w'}.v$ for W' close to W.
λ1.ustat((aa)*) + λ2.ustat((ba)*) + λ3.ustat((abc)*) indicates densities λ1, λ2, λ3 to follow aa, ba, abc on Hi.
$B_i = r_{aa}^{\lambda_1 n} \cdot r_{ba}^{\lambda_2 n} \cdot r_{abc}^{\lambda_3 n}$
We need to connect loops aa, ba, abc by some inputs: there are finitely many possibilities. Let Ci be the best probability.
Tester for APM in O(1)
Input W of length n
Tester(W,k)
• Sample W with N(k) samples,
• Select H such that ustat is close,
• Decompose ustat on possible subpolytopes Hi with at most d+1 vertices, and obtain a bound Bi,
• Consider all possible links on Hi, let Ci be the optimal bound,
• Let D = Max_i Ci.Di
If λ < D, Accept else Reject.
Non determinism and Probabilistic
Can we combine both non determinism and probabilistic behaviors?
Stationary distributions for a given scheduler are distributions on the states, for which there is also a polytope representation. Classical results exist about positional schedulers.
Problem: does the projection of these distributions on ustat vectors keep the distances? Thesis of Mathieu Tracol.
For large scale systems, evolutionary games also provide a statistical representation of the states. Can we predict approximate properties of the Equilibria?
Conclusion
1. Tolerant Tester for Equality on strings under the Edit Distance with Moves
• Additive approximation in O(1) of the EDM
2. Equivalence tester for automata
• Polynomial time approximate algorithm (exact problem is PSPACE-complete)
• Generalization to Buchi automata: approximate Model-Checking
• Context-Free Languages: exponential algorithm (exact problem is undecidable)
3. Generalization to trees, infinite words
4. Probabilistic systems.