EFFECTIVE QUANTIFIER ELIMINATION OVER REAL CLOSED FIELDSmasnnv/Talks/tut.pdf · Cylindrical cell...

EFFECTIVE QUANTIFIER

ELIMINATION OVER REAL

CLOSED FIELDS

N. Vorobjov

1

Tarski (1931, 1951):

Elementary algebra and geometry is decidable.

More generally, there is an algorithm for quan-

tifier elimination in the first order theory of the

reals.

Boolean formula F (x1, . . . ,xν) with atoms of

the kind f > 0, where f ∈ R[x0,x1, . . . ,xν],

xi = (xi,1, . . . , xi,ni).

Φ(x0) := Q1x1Q2x2 · · ·QνxνF (x0,x1, . . . ,xν),

where Qi ∈ {∃, ∀}, Qi 6= Qi+1.

Theorem 1. There is a quantifier-free formula

Ψ(x0) such that Ψ(x0) ⇔ Φ(x0) over R,

i.e., {x0 ∈ Rn0|Ψ(x0)} = {x0 ∈ Rn0|Φ(x0)}.

Theorem remains true if R is replaced by any

real closed field R, e.g., real algebraic numbers,

Puiseux series over R in an infinitesimal.

2

Theorem 2. If atoms f ∈ Z[x0,x1, . . . ,xν], then

there is an algorithm which eliminates quanti-

fiers from Φ(x0), i.e., for a given Φ(x0) pro-

duces Ψ(x0).

Corollary 3. The first order theory of the reals

is decidable, i.e., there is an algorithm which

for a given closed formula

Q1x1Q2x2 · · ·QνxνF (x1, . . . ,xν)

decides whether it’s true or false.

3

Integer coefficients are mentioned here to be

able to handle the problem on a Turing ma-

chine. If we are less picky about the model

of computation, for example allow exact arith-

metic operations and comparisons in a real

closed field R, then Theorem ?? and Corol-

lary ?? are also true for formulae over R. In

what follows we will use exactly this approach.

3-1

Let R be a real closed field.

Definition 4. A set X ⊂ Rn is called

semialgebraic if X is representable in a form

X = {x ∈ Rn|F (x)}, where F (x) is a quantifier-

free formula.

Corollary 5. Let X = {x0 ∈ Rn0|Φ(x0)}, where

Φ(x0) := Q1x1Q2x2 · · ·QνxνF (x0,x1, . . . ,xν).

Then X is semialgebraic.

4

Examples:

• Feasibility of a system of polynomial equa-

tions and inequalities

(= is a semialgebraic set empty?)

• Representation of the closure (or interior,

or singular locus,

or tubular neighbourhood, etc.) of a semi-

algebraic set by a Boolean combination of

polynomial inequalities

• Polynomial optimization: find the set of

points of local minima of a polynomial

function on a semialgebraic set.

5

There is no need to explain to logicians the

usefulness of quantifier elimination. Instead I

list a few straightforward examples of how it

can be used in “ordinary” mathematics.

Semialgebraic set in all the examples initially

can be naturally represented by prenex formu-

lae with quantifiers (sometimes involving ε/δ

language.

5-1

Complexity:

Tarski’s quantifier elimination algorithm:

“non-elementary” (can’t be bounded from

above by any tower of exponentials of a finite

height).

Problem: construct an efficient algorithm

Possible approach: cylindrical (algebraic) cell

decomposition (mid-70s, Wuthrich, Collins)

6

Definition 6. Cylindrical cell is defined by in-duction as follows.

1. Cylindrical 0-cell in Rn is an isolated point.Cylindrical 1-cell in R is an open interval(a, b) ⊂ R.

2. For n ≥ 2 and 0 ≤ k < n a cylindrical(k + 1)-cell B in Rn is either a graph ofa continuous bounded function f : C → R,where C is a cylindrical a cylindrical (k+1)-cell in Rn−1, or else a set of the form

{(x1, . . . , xn) ∈ Rn| (x1, . . . , xn−1) ∈ C

andf(x1, . . . , xn−1) < xn < g(x1, . . . , xn−1)},where C is a cylindrical k-cell in Rn−1, andf, g : C → R are continuous bounded func-tions such that

f(x1, . . . , xn−1) < g(x1, . . . , xn−1)

for all points (x1, . . . , xn−1) ∈ C.

7

Cylindrical cell decomposition is a very use-

ful tool when we study o-minimal structures.

Since I’ll need this concept also in my other

lecture, let me give a definition. We first de-

fine a cylindrical cell. One can notice that any

cylindrical cell is homeomorphic to an open ball

of the matching dimension (semialgebraically

homeomorphic in the case of an arbitrary R).

We will give the definition only for the bounded

case.

7-1

Definition 7. Cylindrical cell decomposition Dof a subset A ⊂ Rn is defined by induction as

follows.

1. If n = 1, then D is a finite family of pair-

wise disjoint cylindrical cells (i.e., isolated

points and intervals) whose union is A.

2. If n ≥ 2, then D is a finite family of pair-

wise disjoint cylindrical cells in Rn whose

union is A and there is a cylindrical cell de-

composition D′ of π(A) such that π(C) is

its cell for each C ∈ D, where π : Rn →Rn−1 is the projection map onto the co-

ordinate subspace of x1, . . . , xn−1. We say

that D′ is induced by D.

Definition 8. Let B ⊂ A ⊂ Rn and D be a CCD

of A. Then D is compatible with B if for any

C ∈ D we have either C ⊂ B or C ∩B = ∅ (i.e.,

some subset D′ ⊂ D is a CCD of B).

8

Example:

9

Theorem 9. (Collins, Wuthrich). If X ⊂ Rn is

a semialgebraic set, then there is an algorithm

for computing CCD of Rn compatible with X.

Moreover, CCD is semialgebraic, i.e., each cell

is described by a system of polynomial equa-

tions and inequalities.

Technical tools: resultants and sub-resultants.

Compare with classical elimination theory

(van-der-Waerden, Modern Algebra) and

Chevalley’s Theorem.

Corollary 10. There is an algorithm for quan-

tifier elimination.

10

The corollary is almost obvious from the defini-

tion of the CCD. One reasons by induction on

quantifiers starting from Qν and moving out-

ward. On each step we use the relation (de-

scribed in the definition) between the decom-

position of the space of all current variables

and the decomposition of the subspace of cur-

rent free variables.

10-1

Complexity:

Let Φ(x0) involve s atoms of the kind f > 0,

where f ∈ R[x0, . . . ,xν], n := n0 + · · ·+nν, and

deg(f) < d.

Then the complexity of constructing CCD and

the corresponding quantifier elimination has an

upper bound

(sd)O(1)n.

Similar upper bound on the number of cells,

degrees and quantities of polynomials describ-

ing CCD and quantifier-free formula equivalent

to Φ.

If coefficients of polynomials f are integer of

bit-sizes less than M , then the (Turing ma-

chine) complexity does not exceed

MO(1)(sd)O(1)n.

11

Davenport & Heintz (1988) (similar results by

Weispfenning (1988)) found a (parametric)

example of Φ of degrees ≤ 4, with two free

variables and

n1 ≤ 2, . . . , nν ≤ 2

(i.e., each quantifier is responsible for at most

two variables) defining in R2 a semialgebraic

set consisting of

22n

isolated points.

⇒ Any CCD for Φ can’t contain lesser num-

ber of cells.

⇒ Lower complexity bound for any

CCD-based algorithm is doubly exponential.

Fisher & Rabin (1974): lower bound for deci-

sion methods is exponential in the number of

quantifier alternations.

12

Simplest case:

∃x ∈ R (f(x) = 0),

where f ∈ R[x], deg(f) < d.

Sturm’s Theorem (1835):

Euclidean algorithm for the division of f by itsderivative f ′.

Let g1 = f, g2 = f ′, and g3, . . . , gt be the se-quence of the negatives of the remainders inthe Euclidean algorithm. For any x ∈ R letV (x) be the number of sign changes ing1(x), g2(x), . . . , gt(x).

Theorem 11. The number of distinct realroots of f in [a, b] ⊂ R is

S(f, f ′) := V (a)− V (b).

Note:sign(f(∞)) =sign(leading coefficient a0 of f)sign(f(−∞)) =sign(a0xd at −1).

13

Note that over an algebraically closed field the

answer is always “Yes”.

Sturm’s theorem provides an algorithm to de-

cide the existential formula over any real closed

field.

Now we want to generalize this algorithm to be

able to handle systems of many equations and

inequalities (in one variable, for time being).

13-1

Ben-Or/Kozen/Reif algorithm (1986):

Definition 12. Let f1, . . . , fk ∈ R[x],deg(fi) < d. Consistent sign assignment is astring σ = (σ1, . . . , σk) where σi ∈ {>, <,=},such that the system

f1σ10, . . . , fkσk0

has a solution in R.

Input:f1, . . . , fk ∈ R[x].Output:The list of all consistent sign assignments.

Prepare the input:Make all fi squarefree and relatively prime.Then sets of roots of polynomials fi aredisjoint, in particular any consistent signassignment has at most one =. Moreover, wecan add a new polynomial f so that all con-sistent assignments for f1, . . . , fk can be recon-structed from all consistent assignments of <

and > for f1, . . . , fk at roots of f .

14

Case k = 0: f = 0 Sturm’s Theorem.

Case k = 1: f, f1

C1 := {x| f = 0 ∧ f1 > 0}

C1 := {x| f = 0 ∧ f1 < 0}Obviously, the total number of roots of f is

S(f, f ′) = |C1|+ |C1|.(Here and in the sequel S corresponds to the

interval [−∞,∞].)

Lemma 13. S(f, f ′f1) = |C1| − |C1|.

Sturm + Lemma ⇒(1 11 −1

) (|C1||C1|

)=

(S(f, f ′)

S(f, f ′f1)

)

15

Case k = 2: f, f1, f2

C2 := {x|f = 0 ∧ f2 > 0}

C2 := {x|f = 0 ∧ f2 < 0}

1 1 1 11 −1 1 −11 1 −1 −11 −1 −1 1

|C1 ∩ C2||C1 ∩ C2||C1 ∩ C2||C1 ∩ C2|

=

=

S(f, f ′)S(f, f ′f1)S(f, f ′f2)

S(f, f ′f1f2)

16

In case k we get a matrix Ak which is a tensor

product of k copies of A1 and is non-singular.

Its unique solution describes all consistent sign

assignments for f = 0, f1, . . . , fk. Solving the

system by any standard method (e.g., Gaus-

sian elimination), we list all consistent assign-

ments. However this naive approach leads to

the complexity exponential in k.

Polynomial-time algorithm:

deg(fi) < d ⇒ the number of all roots in all fi

is O(kd) ⇒ the number of all consistent sign

assignments is O(kd).

“Divide-and-conquer” then leads to complexity

polynomial in kd.

17

PLAN:

X = {x0 ∈ Rn0|Φ(x0)}, whereΦ(x0) := Q1x1Q2x2 · · ·QνxνF (x0,x1, . . . ,xν).

1. The algorithm eliminates quantifiersby induction starting from Qν outwards.

2. It’s sufficient to consider just the case of

Φ(y) := ∃x F (x,y),

where x ∈ Rn, y ∈ Rm.

3. We will first construct an algorithm for thedecidability problem

∃xF (x)

and then “parameterize” the algorithmwith vector y.

(3) → (2) → (1)

18

∃xF (x), x ∈ Rn, F (x) is a Boolean formula

with s atoms of the kind f ∗ 0, f ∈ R[x],

∗ ∈ {=, >, <,≥,≤}, deg(f) < d.

In case n = 1: Ben-Or/Kozen/Reif applied

to the set of all atomic polynomials. The list of

all consistent sign assignments allows to decide

whether ∃xF (x) is true.

Arbitrary n: Similar strategy: list all

consistent sign assignments.

How: Find a finite set Λ ⊂ Rn such that any

consistent assignment defines a system of

equations and strict inequalities having a solu-

tion in this set. Knowing Λ it’s easy to find

the list of consistent assignments.

19

Points from Λ are (isolated) solutions of sys-

tems of polynomials equations based on atoms

of F (x).

Solving systems of equations:

Let a system of polynomial equations have at

most finite number of solutions over algebraic

closure R.

Possible approaches:

• Grobner bases

• Effective Hilbert’s Nullstellensatz

• u-resultants

We will use u-resultants.

20

Homogeneous polynomials:A polynomial in n variables is an expression:

∑

(i1,...,in)

ai1,...,inxi11 · · ·xin

n .

A polynomial is homogeneous or a form ifi1 + · · ·+ in =const for all (i1, . . . , in) withai1,...,in 6= 0.

Examples:Homogeneous: x2 + xy

Non-homogeneous: x2 − 1

0 is a root of any non-constant homogeneouspolynomial.

If x = (x1, . . . , xn) ∈ Rn is a root of a homo-geneous polynomial h, then for any a ∈ R thepoint ax = (ax1, . . . , axn) is also a root of h.

⇒ If x 6= 0, then there is a straight line ofroots of h passing through 0 and x. We cansay that the line itself is a root of h.

21

Projective space:The space of all straight lines through the ori-gin is called (n− 1)-dimensional projectivespace Pn−1(R).

An element of Pn−1(R), which is a line passingthrough 0 and x = (x1, . . . , xn) 6= 0, is denotedby (x1 : x2 : · · · : xn).

Any polynomial h ∈ R[x] can be homogenized:Let deg(h) = d. Introduce a new variable x0and a homogeneous polynomialhom(h) := xd

0h(x1/x0, . . . , xn/x0).

If (x1, . . . , xn) is a root of h, then(1 : x1 : · · · : xn) is a root of hom(h). If(y0 : y1 : · · · : yn) ∈ Pn−1(R) is a root of hom(h)and y0 6= 0, then (y1/y0, . . . , yn/y0) is a root ofh.Of course, hom(h) can have a root of the form(0 : z1 : · · · : zn) which does not correspond inthis sense to any root of h, it’s called a “rootat infinity”.

22

Similar relation is true for systems of polyno-

mial equations: homogenized system has not

less solutions than the original system.

Example:

original system: x1 − x2 = x1 − x2 + 1 = 0

homogenization: x1 − x2 = x1 − x2 + x0 = 0.

Original is inconsistent, the homogenized has

a unique solution (0 : 1 : 1) (at infinity).

23

u-resultant:

Consider a system of n homogeneous equa-

tions in n + 1 variables: h1 = · · · = hn = 0.

The system is always consistent in Pn(R).

It can have either finite or infinite number of

solutions.

Introduce new variables u0, u1, . . . , un.

Proposition 14. There is a homogeneous

polynomial Rh1,...,hn ∈ R[u0, . . . , un], called

u-resultant, such that Rh1,...,hn 6≡ 0 iff the

system has finite number of solutions.

If Rh1,...,hn 6≡ 0, then

Rh1,...,hn =∏

1≤i≤r

(χ(i)0 u0 + · · ·+ χ

(i)n un),

where (χ(i)0 : · · · : χ

(i)n ), 1 ≤ i ≤ r are all solu-

tions in Pn(R) of the system.

24

Assume that in ∃xF (x) formula F (x) is just a

system of inequalities

f1 ≥ 0 ∧ · · · ∧ fs ≥ 0,

and that {x| F (x)} ⊂ Rn is bounded.

Let deg(fi) < d for all 1 ≤ i ≤ s and D be

minimal even integer ≥ sd + 1.

Introduce a new variable ε and consider

g(ε) :=∏

1≤i≤s

(fi + ε)− εs+1 ∑

1≤j≤n

xDj .

Lemma 15. If the system f1 ≥ 0 ∧ · · · ∧ fs ≥ 0

is consistent, then there is a solution which is

a limit of a solution of

∂g(ε)

∂x1= · · · = ∂g(ε)

∂xn= 0

as ε → 0.

25

We consider just the case of a system of in-

equalities to simplify the explanations. The

case of an arbitrary formula is slightly more

geometrically complicated, but very similar in

spirit.

If {x| F (x)} is not bounded, then it can be

reduced to the bounded cased by intersecting

this set with a ball of sufficiently large radius.

25-1

Picture:

26

Lemma 16. For all sufficiently small ε > 0, if

f1 ≥ 0 ∧ · · · ∧ fs ≥ 0

is consistent, then the homogenization of

∂g(ε)

∂x1= · · · = ∂g(ε)

∂xn= 0

has a finite number of solutions.

Proof. Grobner basis or u-resultant techniques.

Denote the homogenization by

G1(ε) = · · · = Gn(ε) = 0.

By Proposition, every solution of this system

is a coefficient vector in a divisor of the

u-resultant RG1(ε),...,Gn(ε).

27

The lemma of course means that the system it-

self also has a finite number of solutions, since

the homogenization always has more solutions.

27-1

Represent RG1(ε),...,Gn(ε) in the form

RG1(ε),...,Gn(ε) = Rmεm +∑

j>m

Rjεj,

where m is the lowerst degree w.r.t. ε.

Lemma 17.

Rm =∏

1≤i≤r

(η(i)0 u0 + · · ·+ η

(i)n un),

where (η(i)0 , . . . , η

(i)n ) is the limit of

(χ(i)0 , . . . , χ

(i)n ) as ε → 0.

Corollary 18. If f1 ≥ 0 ∧ · · · ∧ fs ≥ 0 is

consistent, then its solution is

(η(i)1 /η

(i)0 , . . . , η

(i)n /η

(i)0 ),

where η(i)0 6= 0 and (η(i)

0 u0 + · · ·+ η(i)n un)

is a divisor of Rm.

28

If we were able to factorize Rm over R, then

the algorithm for consistency of

f1 ≥ 0∧· · ·∧fs ≥ 0 could be roughly as follows:

• Construct the u-resultant RG1(ε),...,Gn(ε);

• Find the lowest term Rm w.r.t. ε in

RG1(ε),...,Gn(ε);

• Factorize Rm finding the limits

(η(i)0 , . . . , η

(i)n ) (when ε → 0);

• Check whether

(η(i)1 /η

(i)0 , . . . , η

(i)n /η

(i)0 )

satisfies f1 ≥ 0 ∧ · · · ∧ fs ≥ 0.

If yes, then this system is is consistent, else

it’s inconsistent.

29

Unfortunately, we are unable to factorize a

polynomial in general case. Even in special

cases, when effective multivariate factorization

algorithms are known, they are very involved.

Therefore, our scheme will be different.

29-1

Consider in the affine space Rn+1 the hyper-surface

{(u0, . . . , un) ∈ Rn+1|Rm = 0}.

Since Rm factors into linear forms, the hy-persurface is a union of hyperplanes passingthrough 0.

η(i)0 u0+ · · ·+η

(i)n un = 0 is the equation defining

hyperplane number i, 1 ≤ i ≤ r;⇒ vector (η(i)

0 , . . . , η(i)n ) is orthogonal to the

hyperplane i;⇒ vector (η(i)

0 , . . . , η(i)n ) is collinear to the gra-

dient of Rm at a non-singular point x of thehyperplane i, i.e., to

(∂Rm

∂u0(x), . . . ,

∂Rm

∂un(x)

).

To find a finite set of non-singular points onall hyperplanes, intersect {Rm = 0} with a“generic” straight line in Rn+1 defined withcoefficients from R (actually, from Z).

30

Algorithm for deciding ∃x F (x):

• Write the homogeneous systemG1(ε) = · · · = Gn(ε) = 0.

• Construct u-resultant RG1(ε),...,Gn(ε)(how – later).

• Find the lowest term Rm w.r.t. ε inRG1(ε),...,Gn(ε).

• Find a generic straight line in the paramet-ric form αt + β, where α, β ∈ Rn+1, t is asingle variable. The roots of the univariateequation Rm(αt+β) = 0 correspond to thepoints of intersection of the line with hy-perplanes. If f1 ≥ 0 ∧ · · · ∧ fs ≥ 0 is consis-tent, then for a root t′ of Rm(αt + β) = 0the following point is its solution:(

∂Rm/∂u1

∂Rm/∂u0(αt′+β), . . . ,

∂Rm/∂un

∂Rm/∂u0(αt′+β)

).

31

• Denoting

yj(αt + β) :=∂Rm/∂uj

∂Rm/∂u0(αt + β),

we get:

The system f1 ≥ 0∧· · ·∧fs ≥ 0 is consistent

iff there exists t′ ∈ R such that

Rm(αt′ + β) = 0 and

fi(y1(αt′ + β) ≥ 0 ∧ · · · ∧ yn(αt′ + β)) ≥ 0

for all 1 ≤ i ≤ s.

• Check consistency of the latter system of

univariate polynomials inequalities using

Ben-Or/Kozen/Reif.

32

Technical details we left out:

• Computing u-resultant (polynomial of

degree (sd)O(n)).

• Finding generic straight line in Rn+1.

• Case of vanishing gradient ⇔η(i)0 u0+· · ·+η

(i)n un occurs in Rm with power

> 1 ⇔root t′ of Rm(αt + β) = 0 is multiple.

• Case of an arbitrary Boolean formula as

F (x).

Complexity:

Straightforward calculation: (sd)O(n).

33

Accomplished item (3) of the PLAN ↑:deciding ∃xF (x).

Now, item (2): quantifier elimination in

∃x F (x,y).

Main idea:

Consider variables y as parameters and try to

execute the decidability algorithm for a para-

metric formula (i.e., with variable coefficients).

Decidability algorithm uses only +, ∗ and com-

parisons over elements of R.

Arithmetic can be easily performed parametri-

cally, while comparisons require branching.

The parametric algorithm is represented by

algebraic decision tree.

34

Picture:

35

Input of the tree is y. Each vertex of the tree

is associated with a polynomial in y, which is

the composition of arithmetic operations per-

formed along the branch leading from the root

to this vertex.

Upon a specialization of input y, the polyno-

mial at the root is evaluated, and the value is

compared to zero. Depending on the result of

comparison, one of the three branches is cho-

sen, thereby determining the next vertex and

the polynomial evaluation to be made, and so

on until a leaf is reached.

Each leaf is assigned the value “True” or

“False”. All specializations of y, arriving at

a leaf marked “True”, correspond to closed

formulae that are true. Similar with “False”.

35-1

An elimination algorithm:

• Use the decision algorithm parametrically

to obtain a decision tree with the variable

input y.

• Determine all branches from the root to

leaves marked True.

• Take conjunction of the inequalities along

each such branch:∧

i∈branch

hi(y)σi0,

where σi ∈ {<, >,=}.

•∃x F (x,y) ⇔

∨

branches

∧

i∈branch

hi(y)σi0

36

Difficulty:

Too many branches in the tree:

Height = complexity of decidability = (sd)O(n)

⇒ 3(sd)O(n)branches (leaves)

But:

Most branches are not followed by any spec-

ification of y, i.e., most leaves correspond to

Boolean formulae defining ∅.

Because:

37

Lemma 19. Let in the finite set g1, . . . , gk ∈R[x1, . . . , xn] all degrees deg(gi) < D. Then

the number of all consistent sign assignments

for this set is less than (kD)O(n).

Carefully examining the decidability algorithm

we see that the number of distinct polynomials

hi in the tree is essentially the same as in one

branch, i.e., (sd)O(n). From the complexity es-

timate we know that their degrees are (sd)O(n).

Then, by Lemma, the number of distinct 6= ∅sets associated with leaves is (sd)O(n2).

⇒Complexity of ∃ elimination:

(sd)O(n2)

∀ elimination:

∀xF (x) ⇔ ¬∃x¬F (x)

38

General case (item (1) of the PLAN ↑):

Q1x1Q2x2 · · ·QνxνF (x0,x1, . . . ,xν),

where xi ∈ Rni, n := n0 + · · ·+ nν.

Complexity:

(sd)nO(ν)

Compare with CCD algorithm:

(sd)O(1)n

Using finer technical tools one can construct

quantifier elimination algorithm having com-

plexity

(sd)∏

0≤i≤ν O(ni)

39

References:

1. J Renegar, Recent progress on the

complexity of the decision problem for the

reals, DIMACS Series in Discrete Maths.

and Theoretical Comp. Sci., 6, 1991,

287–308.

2. S Basu, R Pollack, M-F Roy, Algorithms in

Real Algebraic Geometry, Springer, Berlin-

Heidelberg, 2003.

40

EFFECTIVE QUANTIFIER ELIMINATION OVER REAL CLOSED FIELDSmasnnv/Talks/tut.pdf · Cylindrical cell...

Documents

Transcript of EFFECTIVE QUANTIFIER ELIMINATION OVER REAL CLOSED FIELDSmasnnv/Talks/tut.pdf · Cylindrical cell...