[Lecture Notes in Computer Science] Foundations of Computation Theory Volume 158 || A tight...


A TIGHT $\Omega(\log\log n)$-BOUND ON THE TIME FOR PARALLEL RAM'S TO COMPUTE NONDEGENERATED BOOLEAN FUNCTIONS

Hans-Ulrich Simon

Institut für angewandte Mathematik und Informatik

der Universität des Saarlandes

D-6600 Saarbrücken

W.-Germany

Abstract:

A function $f:\{0,1\}^n \to \{0,1\}$ is said to depend on dimension $i$ iff there exists an input vector $x$ such that $f(x)$ differs from $f(x^i)$, where $x^i$ agrees with $x$ in every dimension except $i$. In this case $x$ is said to be critical for $f$ with respect to $i$. $f$ is called nondegenerated iff it depends on all $n$ dimensions.

The main result of this paper is that for each nondegenerated function $f:\{0,1\}^n \to \{0,1\}$ there exists an input vector $x$ which is critical with respect to at least $\Omega(\log n)$ dimensions. A function achieving this bound is presented.

Together with earlier results from Cook, Dwork [2] and Reischuk [3] we can conclude that a parallel RAM requires at least $\Omega(\log\log n)$ steps to compute $f$.

1. Notations and Main Theorem

Let us define a PRAM (= Parallel RAM) to consist of a collection of processors which compute synchronously in parallel and which communicate with a common global random access memory. At each step each processor can read from one global memory cell, do some computing and write into one global memory cell. Any number of processors can read a given global memory cell at once, but we allow at most one processor to attempt to write into a given memory cell in one step. At the beginning of the computation of a function $f(x_1,\ldots,x_n)$ the values $x_1,\ldots,x_n$ are stored in the global memory cells $C_1,\ldots,C_n$. At the end of the computation $f(x_1,\ldots,x_n)$ has to be stored in $C_1$ (compare the definitions in [1], [2] and [3]).

Let $B = \{0,1\}$. For each Boolean function $f: B^n \to B$ and for each input vector $x$ let $c(f,x)$ denote the number of dimensions $i$ such that $x$ is critical for $f$ with respect to $i$. Let $c(f) := \max\{c(f,x) \mid x \in B^n\}$. In [2] and [3] it is shown that a PRAM requires at least $\Omega(\log(c(f)))$ steps to compute $f$.
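For small $n$ the quantities $c(f,x)$ and $c(f)$ can be evaluated by brute force. The following Python sketch is an added illustration, not part of the paper; the names `flip`, `c_at`, `c_max` and `depends_on_all` are hypothetical, dimensions are numbered from 0, and input vectors are assumed to be tuples of 0/1.

```python
from itertools import product

def flip(x, i):
    """Return x^i: the vector that agrees with x except in dimension i (0-based)."""
    return x[:i] + (1 - x[i],) + x[i + 1:]

def c_at(f, x):
    """c(f, x): number of dimensions i for which x is critical for f."""
    return sum(1 for i in range(len(x)) if f(x) != f(flip(x, i)))

def c_max(f, n):
    """c(f): maximum of c(f, x) over all x in B^n (exponential in n)."""
    return max(c_at(f, x) for x in product((0, 1), repeat=n))

def depends_on_all(f, n):
    """True iff f is nondegenerated, i.e. depends on every dimension."""
    return all(
        any(f(x) != f(flip(x, i)) for x in product((0, 1), repeat=n))
        for i in range(n)
    )

def or_n(x):
    """n-ary OR, used here only as a test function."""
    return int(any(x))

def parity(x):
    """Parity of the input bits, used here only as a test function."""
    return sum(x) % 2
```

For the $n$-ary OR the all-zero vector is critical in every dimension, so `c_max(or_n, n)` equals `n`; the same holds for parity at every input.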

$F_n$ denotes the set of nondegenerated Boolean functions of $n$ variables. Let $c_n := \min\{c(f) \mid f \in F_n\}$.

Theorem: Let $n \ge 2$.

$\frac{1}{2}\log(n) - \frac{1}{2}\log\log(n) + \frac{1}{2} < c_n < \log(n) + [\ldots]$

Corollary:

Let $f \in F_n$. Then a PRAM requires at least $\Omega(\log\log n)$ steps to compute $f$.

2. Proof of the Theorem

The upper bound for $c_n$ is shown by the following

Example 1:

For $x=(x_1,\ldots,x_n) \in B^n$ let $v(x) := \sum_{i=1}^{n} x_i 2^{n-i}$. For $n=m+2^m$ let

$f_n(x_1,\ldots,x_m,y_0,\ldots,y_{2^m-1}) := y_{v(x)}$

$f_n$ has the following properties:

(i) $f_n \in F_n$

(ii) $c(f_n) = m+1$

(iii) A PRAM (with $m$ processors only) can compute $f_n$ in $\lceil \log m \rceil + 2$ steps.
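Properties (i) and (ii) can be confirmed exhaustively for small $m$. The sketch below is an added illustration under stated assumptions: $v$ is applied to the $m$ address bits $x_1,\ldots,x_m$ (so the most significant bit comes first), inputs are tuples of 0/1, and the helper names `make_f`, `flip`, `c_at` and `check` are hypothetical.

```python
from itertools import product

def v(x):
    """v(x): the bits of x read as a binary number, most significant bit first."""
    value = 0
    for bit in x:
        value = 2 * value + bit
    return value

def make_f(m):
    """f_n for n = m + 2^m: the first m inputs select one of the 2^m data bits."""
    def f(z):
        x, y = z[:m], z[m:]
        return y[v(x)]
    return f

def flip(z, i):
    """Return z with coordinate i (0-based) inverted."""
    return z[:i] + (1 - z[i],) + z[i + 1:]

def c_at(f, z):
    """Number of dimensions in which z is critical for f."""
    return sum(1 for i in range(len(z)) if f(z) != f(flip(z, i)))

def check(m):
    """Return (n, f_n is nondegenerated, c(f_n)) by exhaustive search."""
    n = m + 2 ** m
    f = make_f(m)
    inputs = list(product((0, 1), repeat=n))
    nondegenerated = all(
        any(f(z) != f(flip(z, i)) for z in inputs) for i in range(n)
    )
    return n, nondegenerated, max(c_at(f, z) for z in inputs)
```

The maximum $m+1$ is attained, for instance, when the addressed data bit is 1 and all other data bits are 0: then every address bit and the addressed data bit are critical.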

To prove the lower bound for $c_n$ some additional notations are required. Let $G = (V,E)$ be an undirected graph. The minimum degree $md(G)$ is defined by $md(G) := \min\{\mathrm{degree}(v) \mid v \in V\}$. Any nonempty, finite sequence $p=(v_1,\ldots,v_r)$ of vertices such that

$\forall i \in [1:r-1]: \{v_i,v_{i+1}\} \in E$

is called a path in $G$. $p$ is called a cycle iff in addition $v_1=v_r$. For each function $f: E \to B$, labeling the edges of $G$ by 0 or 1, the weight $w_f(p)$ of $p$ with respect to $f$ is defined by

$w_f(p) := \sum_{i=1}^{r-1} f(\{v_i,v_{i+1}\})$, where $\sum$ denotes the integer sum.


Let $E_n := \{\{x,y\} \mid x,y \in B^n \text{ and } x,y \text{ differ in exactly one dimension}\}$. The undirected graph $C_n := (B^n,E_n)$ is called the $n$-dimensional cube. For every function $f: B^n \to B$ let $\Delta f$ denote a function from $E_n$ to $B$ defined by

$\Delta f(\{x,y\}) = \begin{cases} 1 & \text{if } f(x) \neq f(y) \\ 0 & \text{if } f(x) = f(y) \end{cases}$

Example 1 (continued):

$f_3(x_1,y_0,y_1) = \begin{cases} y_0 & \text{if } x_1 = 0 \\ y_1 & \text{if } x_1 = 1 \end{cases}$

$f_3$ and $\Delta f_3$, regarded as functions which label the vertices and the edges of $C_3$, can be drawn as

[Figure: the cube $C_3$ with its vertices labeled by $f_3$ and its edges labeled by $\Delta f_3$.]

The following Lemmata are obvious but important.

Lemma 1:

Let $G=(V,E)$ be a nonempty partial subgraph of $C_n$, i.e. $\emptyset \neq V \subseteq B^n$, $E \subseteq E_n$. Then $|V| \ge 2^{md(G)}$.
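Lemma 1 can be checked exhaustively for very small $n$. The sketch below is an added illustration, not taken from the paper. It only enumerates induced subgraphs, which suffices because deleting edges cannot raise the minimum degree. Vertices of $C_n$ are assumed to be encoded as integers $0,\ldots,2^n-1$, and the function names are hypothetical.

```python
def min_degree_induced(vertices, n):
    """Minimum degree of the subgraph of the n-cube induced by `vertices`.

    Two vertices are adjacent iff their binary encodings differ in exactly
    one of the n bit positions.
    """
    return min(
        sum(1 for j in range(n) if (v ^ (1 << j)) in vertices)
        for v in vertices
    )

def lemma1_holds(n):
    """Check |V| >= 2^md(G) for every nonempty induced subgraph G of C_n."""
    size = 1 << n
    for mask in range(1, 1 << size):
        # Decode the bitmask into the vertex set it represents.
        vertices = {v for v in range(size) if (mask >> v) & 1}
        if len(vertices) < 2 ** min_degree_induced(vertices, n):
            return False
    return True
```

The search visits all $2^{2^n}-1$ nonempty vertex sets, so it is only practical for $n \le 4$.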

Lemma 2:

Let $f$ be a function from $B^n$ to $B$ and let $c$ be a cycle in $C_n$. Then the weight of $c$ with respect to $\Delta f$ is even.
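Lemma 2 can be spot-checked on random cycles. The sketch below is an added illustration; it relies on the definition above, under which a cycle is any path with $v_1 = v_r$, so closed walks with repeated vertices qualify. Vertices are assumed to be encoded as integers, $f$ is drawn at random as a lookup table, and all function names are hypothetical.

```python
import random

def delta(f, x, y):
    """Delta f({x, y}): 1 if f(x) != f(y), else 0."""
    return int(f(x) != f(y))

def random_cycle(n, half_length, rng):
    """A path (v_1, ..., v_r) in the n-cube with v_1 = v_r.

    Every dimension is flipped an even number of times, so the walk
    returns to its starting vertex.
    """
    dims = [rng.randrange(n) for _ in range(half_length)] * 2
    rng.shuffle(dims)
    v = rng.randrange(1 << n)
    walk = [v]
    for j in dims:
        v ^= 1 << j
        walk.append(v)
    return walk

def weight(f, walk):
    """w_{Delta f}(p): integer sum of the edge labels along the path."""
    return sum(delta(f, a, b) for a, b in zip(walk, walk[1:]))

def lemma2_holds(n, trials=200, seed=0):
    """Test the parity claim for one random f and `trials` random cycles."""
    rng = random.Random(seed)
    table = [rng.randrange(2) for _ in range(1 << n)]  # a random f: B^n -> B
    f = table.__getitem__
    for _ in range(trials):
        walk = random_cycle(n, rng.randrange(1, 2 * n), rng)
        if weight(f, walk) % 2 != 0:
            return False
    return True
```

The parity is forced because the label sum along a path counts how often $f$ changes value, and a path that returns to its start must change the value an even number of times.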

Let $f \in F_n$ be arbitrary but fixed. From now on we regard the vertices and the edges of $C_n$ as labeled by $f$ and $\Delta f$. Let $c := c(f)$.


Observation:

For each vertex $x$ of $C_n$ there are $c(f,x) \le c$ edges labeled 1 incident on $x$.

For every vertex $x \in B^n$ let $x_i$ denote the $i$'th component of $x$. Let $i \in [1:n]$ be an arbitrary but fixed dimension. Since $f$ is nondegenerated there exists an edge $\{x,x^i\}$ labeled 1. W.l.o.g. $x_i = 0$ and $x^i_i = 1$. For $j=0,1$ let $V_j := \{y \in B^n \mid y_i = j\}$. $C_n^j$ denotes the $(n-1)$-dimensional subcube induced by $V_j$. We define subsets $U_j \subseteq V_j$ in the following way:

a) $y \in U_0$ iff there exist paths $p_j$ in $C_n^j$ such that

(i) $p_0$ starts in $x$ and ends in $y$.

(ii) $p_1$ starts in $x^i$ and ends in $y^i$.

(iii) $w_{\Delta f}(p_0) = w_{\Delta f}(p_1) = 0$.

b) $y \in U_1$ iff $y^i \in U_0$.

$G_j$ denotes the subgraph of $C_n^j$ induced by $U_j$. This construction and the following claim are illustrated in Figure 1.

Figure 1: [Illustration of the subcubes $C_n^0$ and $C_n^1$ with the vertices $x$, $y$, $x^i$, $y^i$ and the degree bound $\ge n-2c+1$.]

Claim:

1) $x \in U_0$ and $x^i \in U_1$.

2) $\forall y \in U_0$: $\{y,y^i\}$ is labeled 1.

3) $md(G_0) = md(G_1) \ge n-2c+1$

Proof of the claim:

1) The paths $p_0=(x)$ and $p_1=(x^i)$ fulfil the conditions (i), (ii) and (iii) above.

2) Let $y \in U_0$, $y^i \in U_1$, $p_0$ a path weighted 0 from $x$ to $y$ and $p_1$ a path weighted 0 from $x^i$ to $y^i$. Let us assume for the sake of contradiction that $\{y,y^i\}$ is labeled 0. Then

$p_0 \cdot (y,y^i) \cdot p_1^{-1} \cdot (x^i,x)$,

where "$\cdot$" denotes the composition and "$^{-1}$" the reversal of paths, is a cycle weighted 1 in contradiction to Lemma 2. Thus $\{y,y^i\}$ is labeled 1.

3) Let $y \in U_0$ and $D_i := [1:n] \setminus \{i\}$. From the observation above and from the fact that $\{y,y^i\}$ is labeled 1 it follows:

(1) There are at most $c-1$ dimensions $j \in D_i$ such that $\{y,y^j\}$ is labeled 1. There are at most $c-1$ dimensions $j \in D_i$ such that $\{y^i,y^{ij}\}$ is labeled 1, where $y^{ij}$ agrees with $y$ in every dimension except $i$ and $j$.

(2) There are at most $2(c-1)$ dimensions $j \in D_i$ such that $\{y,y^j\}$ is labeled 1 or $\{y^i,y^{ij}\}$ is labeled 1.

(3) There are at least $n-1-2(c-1)=n-2c+1$ dimensions $j \in D_i$ such that $\{y,y^j\}$ is labeled 0 and $\{y^i,y^{ij}\}$ is labeled 0.

Thus $md(G_0) = md(G_1) \ge n-2c+1$.
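The construction of $U_0$ and the three parts of the claim can be replayed on a concrete function. The sketch below is an added illustration under stated assumptions: dimensions are 0-based, a path of weight 0 is found by a search that only follows edges labeled 0, and `address` is the function $f_6$ of Example 1 (with $m=2$, hence $c=3$). All function names are hypothetical.

```python
from itertools import product

def flip(y, j):
    """Return y with coordinate j (0-based) inverted."""
    return y[:j] + (1 - y[j],) + y[j + 1:]

def zero_reachable(f, start, i):
    """Vertices reachable from `start` by paths of weight 0 that never
    change coordinate i, i.e. that stay in the subcube containing `start`."""
    seen, stack = {start}, [start]
    while stack:
        y = stack.pop()
        for j in range(len(y)):
            z = flip(y, j)
            if j != i and f(z) == f(y) and z not in seen:
                seen.add(z)
                stack.append(z)
    return seen

def build_u0(f, x, i):
    """U_0 for an edge {x, x^i} labeled 1 with x_i = 0."""
    assert x[i] == 0 and f(x) != f(flip(x, i))
    r0 = zero_reachable(f, x, i)           # end points of admissible p_0
    r1 = zero_reachable(f, flip(x, i), i)  # end points of admissible p_1
    return {y for y in r0 if flip(y, i) in r1}

def check_claim(f, n, c, x, i):
    """Return the truth values of parts 1) to 3) and of |U_0| >= 2^(n-2c+1)."""
    u0 = build_u0(f, x, i)
    part1 = x in u0
    part2 = all(f(y) != f(flip(y, i)) for y in u0)
    md = min(
        sum(1 for j in range(n) if j != i and flip(y, j) in u0) for y in u0
    )
    return part1, part2, md >= n - 2 * c + 1, len(u0) >= 2 ** (n - 2 * c + 1)

def address(z):
    """f_6 from Example 1: two address bits select one of four data bits."""
    return z[2 + 2 * z[0] + z[1]]
```

For example, `check_claim(address, 6, 3, (0, 0, 0, 0, 0, 0), 2)` examines the edge in the dimension of the data bit $y_0$ at the all-zero vector.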

By the claim and Lemma 1: $|U_0| = |U_1| \ge 2^{n-2c+1}$. Thus there are at least $2^{n-2c+1}$ edges labeled 1 in dimension $i$. Summing over all dimensions we observe that at least $n 2^{n-2c+1}$ edges are labeled 1. On the other hand this number cannot exceed $c 2^{n-1}$. Setting

$n 2^{n-2c+1} \le c 2^{n-1}$

a straightforward computation shows that

$c > \frac{1}{2}\log(n) - \frac{1}{2}\log\log(n) + \frac{1}{2}$,

provided that $n \ge 2$. This proves the Theorem.
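The text does not spell out the straightforward computation. One possible way to carry it out is sketched below. It is an added derivation, not the author's; it assumes that $\log$ is the logarithm to base 2, and it abbreviates the claimed bound by $b$, a symbol not used in the paper.

```latex
% Assumption: \log is the binary logarithm; b is a hypothetical abbreviation.
\begin{align*}
n\,2^{n-2c+1} \le c\,2^{n-1}
  &\iff 4n \le c\,4^{c}, \\
b := \tfrac12\log n - \tfrac12\log\log n + \tfrac12
  &\implies 4^{b} = 2^{\log n - \log\log n + 1} = \frac{2n}{\log n}, \\
b\,4^{b} = \frac{2n\,b}{\log n} &\le 2n < 4n
  \qquad\text{since } 0 < b \le \tfrac12\log n + \tfrac12 \le \log n
  \text{ for } n \ge 2 .
\end{align*}
% The map t \mapsto t\,4^{t} is strictly increasing for t > 0, so
% c\,4^{c} \ge 4n > b\,4^{b} forces c > b.
```

The step $b \le \frac{1}{2}\log n + \frac{1}{2}$ uses $\log\log n \ge 0$, which is where the condition $n \ge 2$ enters.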

3. Conclusions

What are the corresponding results in the case of nondegenerated functions $f: S_1 \times \ldots \times S_n \to S$, where $S_1,\ldots,S_n$ are arbitrary finite sets with at least two elements? The following example shows that there exist nondegenerated functions which are computable by a single processor in constant time.

Example 2:

For every $n \in \mathbb{N}$ let $f'_n$ be the function from $[1:n] \times B^n$ to $B$ given by

$f'_n(i,x_1,\ldots,x_n) := x_i$

With a straightforward modification of this example one can show that for every $\varepsilon > 0$ there exists a family $f_n^{\varepsilon}$ of functions of $n$ variables $x_1,\ldots,x_n$ such that

(1) For every $i \in [1:n]$ at most $O(n^{\varepsilon})$ values can be assigned to $x_i$.

(2) A single processor can compute $f_n^{\varepsilon}$ in constant time.

On the other hand one can show by the main theorem of this paper and an easy binary coding argument the

Corollary:

Let $f_n$ be a family of nondegenerated functions of $n$ variables $x_1,\ldots,x_n$ such that at most $O(\log n)$ values can be assigned to $x_i$ for every $i \in [1:n]$. Then a PRAM requires at least $\Omega(\log\log n)$ steps to compute $f_n$.

Acknowledgements:

I am grateful to Rüdiger Reischuk for many helpful discussions and for pointing me to the family $f_n$ in Example 1, and to Helmut Alt and Kurt Mehlhorn for suggestions to simplify the original proof.

References:

[1] A. Borodin, J. Hopcroft. Routing and Merging on Parallel Models of Computation. Proc. 14th annual ACM, 5/1982, pp. 338-344.

[2] S. Cook, C. Dwork. Bounds on the Time for Parallel RAM's to Compute Simple Functions. Proc. 14th annual ACM, 5/1982, pp. 231-233.

[3] R. Reischuk. A Lower Time Bound for Parallel RAM's without Simultaneous Writes. IBM Research Report RJ3431 (40917), 3/1982.