Embedding Metric Spaces in Their Intrinsic Dimension

Ittai Abraham, Yair Bartal*, Ofer Neiman
The Hebrew University (* also Caltech)

Embedding Metric Spaces
Metric spaces $(X,d_X)$, $(Y,d_Y)$.
An embedding is a function $f : X \to Y$.
The distortion is the minimal $\alpha$ such that $d_X(x,y) \le d_Y(f(x),f(y)) \le \alpha\, d_X(x,y)$ for all $x,y \in X$.
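As a concrete reading of the distortion definition on a finite point set, here is a minimal sketch. It assumes the map may be rescaled so that it becomes non-contractive, in which case the minimal stretch factor equals the largest expansion times the largest contraction. The function name `distortion` and the 4-cycle example are illustrative and not taken from the slides.

```python
import itertools
import math

def distortion(points, d_x, d_y, f):
    """Distortion of f on a finite point set: max expansion times max contraction."""
    expansion = 0.0
    contraction = 0.0
    for x, y in itertools.combinations(points, 2):
        dx = d_x(x, y)
        dy = d_y(f(x), f(y))
        if dx == 0 or dy == 0:
            raise ValueError("distinct points must be at positive distance")
        expansion = max(expansion, dy / dx)
        contraction = max(contraction, dx / dy)
    return expansion * contraction

# Hypothetical example: the 4-cycle (shortest-path metric) mapped to the
# corners of the unit square in the Euclidean plane.
cycle = [0, 1, 2, 3]
corners = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}

def d_cycle(a, b):
    k = abs(a - b)
    return min(k, 4 - k)

alpha = distortion(cycle, d_cycle, math.dist, corners.__getitem__)
print(alpha)  # sqrt(2): opposite corners are at cycle distance 2 but Euclidean distance sqrt(2)
```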

Intrinsic Dimension
Doubling constant: the minimal $\lambda$ such that any ball of radius $r>0$ can be covered by $\lambda$ balls of radius $r/2$.
Doubling dimension: $\dim(X) = \log_2 \lambda$.
The problem: the relation between metric dimension and intrinsic dimension.
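To make the doubling constant tangible, the following sketch computes a greedy upper bound on it for a small finite metric space. It is an illustration only: finding the exact minimum cover is a set-cover problem, and the sketch assumes that cover centres are taken from the ball itself. The names `greedy_cover_size` and `doubling_constant_estimate` are hypothetical.

```python
import math

def greedy_cover_size(ball, d, radius):
    """Greedily cover `ball` by balls of the given radius centred at its own points."""
    uncovered = set(ball)
    count = 0
    while uncovered:
        # Pick the centre that covers the most still-uncovered points.
        best = max(ball, key=lambda c: sum(1 for p in uncovered if d(c, p) <= radius))
        uncovered -= {p for p in uncovered if d(best, p) <= radius}
        count += 1
    return count

def doubling_constant_estimate(points, d):
    """Upper bound on the doubling constant, checking every ball B(x, r)
    whose radius r is one of the pairwise distances."""
    radii = {d(x, y) for x in points for y in points if x != y}
    worst = 1
    for x in points:
        for r in radii:
            ball = [p for p in points if d(x, p) <= r]
            worst = max(worst, greedy_cover_size(ball, d, r / 2))
    return worst

# Hypothetical example: eight equally spaced points on a line.
line = list(range(8))
lam = doubling_constant_estimate(line, lambda a, b: abs(a - b))
dim = math.log2(lam)  # the corresponding estimate of the doubling dimension
```

Only radii equal to pairwise distances are checked because a ball's point set changes only at those values, and a larger radius only makes covering by half-radius balls easier.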
Previous Results
Given a doubling finite metric space $(X,d)$ and 0

Rephrasing the Question
Is there a low-distortion embedding of a finite metric space in its intrinsic dimension?
Main result: Yes.

Main Results
Any finite metric space $(X,d)$ embeds into $L_p$:
With distortion $O(\log^{1+\theta} n)$ and dimension $O(\dim(X)/\theta)$, for any $\theta>0$.
With constant average distortion and dimension $O(\dim(X)\log(\dim(X)))$.

Additional Result
Any finite metric space $(X,d)$ embeds into $L_p$:
With distortion [formula missing] and dimension [formula missing], for all $D \le (\log n)/\dim(X)$.
In particular, distortion and dimension on the order of $\log^{2/3} n$ into $L_2$.
Matches the best known distortion result [KLMN 03] for $D=(\log n)/\dim(X)$, with dimension $O(\log n \log(\dim(X)))$.

Distance Oracles
Compact data structure that approximately answers distance queries.
For general $n$-point metrics:
[TZ 01] $O(k)$ stretch with $O(kn^{1/k})$ bits per label.
For a finite doubling metric:
$O(1)$ average stretch with about $\log\lambda$ bits per label.
$O(k)$ stretch with about $\lambda^{1/k}$ bits per label.
Follows from a variation on the snowflake embedding (Assouad).
First Result
Thm: For any finite doubling metric space $(X,d)$ on $n$ points and any 0

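Probabilistic Partitions
$P=\{S_1,S_2,\dots,S_t\}$ is a partition of $X$; $P(x)$ is the cluster containing $x$.
$P$ is $\Delta$-bounded if $\mathrm{diam}(S_i) \le \Delta$ for all $i$.
A probabilistic partition $\hat{P}$ is a distribution over a set of partitions.
A $\Delta$-bounded $\hat{P}$ is $\eta$-padded if for all $x\in X$ the ball $B(x,\eta\Delta)$ lies inside $P(x)$ with at least constant probability.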

$\eta$-padded Partitions
The parameter $\eta$ determines the quality of the embedding.
[Bartal 96]: $\eta=\Omega(1/\log n)$ for any metric space.
[CKR01+FRT03]: Improved partitions with $\eta(x)=1/\log(\rho(x,\Delta))$.
[GKL 03]: $\eta=\Omega(1/\log \lambda)$ for doubling metrics.
[KLMN 03]: Used to embed general + doubling metrics into $L_p$: distortion $O((\log \lambda)^{1-1/p}(\log n)^{1/p})$, dimension $O(\log^2 n)$.
The local growth rate $\rho(x,r)$ of $x$ at radius $r$ is: [formula missing]
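As an illustration of what one sample from a $\Delta$-bounded probabilistic partition can look like, here is a minimal sketch in the style of the [CKR01] construction cited above: a random order on the points plus one random radius. It is not the uniform local padding construction of these slides. The radius range $[\Delta/4, \Delta/2]$ and all names are assumptions made for the example.

```python
import random

def ckr_partition(points, d, delta, rng):
    """One sample of a delta-bounded random partition (CKR-style sketch).

    Returns a dict mapping each point to the centre of its cluster.
    """
    radius = rng.uniform(delta / 4, delta / 2)  # assumed radius range
    order = list(points)
    rng.shuffle(order)                          # random priority order on centres
    cluster_of = {}
    for centre in order:
        for p in points:
            # Each point joins the first centre in the order that is within `radius`.
            if p not in cluster_of and d(centre, p) <= radius:
                cluster_of[p] = centre
    return cluster_of

# Hypothetical example: twenty points on a line, scale delta = 4.
line_points = list(range(20))

def line_dist(a, b):
    return abs(a - b)

sample = ckr_partition(line_points, line_dist, 4.0, random.Random(0))
```

Every cluster sits inside a ball of radius at most $\Delta/2$ around its centre, so its diameter is at most $\Delta$; sampling repeatedly gives a distribution over such partitions.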

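Uniform Local Padding Lemma
A local padding: the padding probability for $x$ is independent of the partition outside $B(x,\Delta)$.
A uniform padding: the padding parameter $\eta(x)$ is equal for all points in the same cluster.
There exists a $\Delta$-bounded probabilistic partition with local uniform padding parameter $\eta(x)$:
$\eta(x) \ge \Omega(1/\log \lambda)$
$\eta(x) \ge \Omega(1/\log(\rho(x,\Delta)))$
[Figure: points $v_1, v_2, v_3$ in clusters $C_1, C_2$, with padding parameters $\eta(v_3)$ and $\eta(v_1)$]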

Plan:
A simpler result of:
Distortion $O(\log n)$.
Dimension $O(\log\log n\cdot\log \lambda)$.
Obtaining lower dimension of $O(\log \lambda)$.
Brief overview of:
Constant average distortion.
Distortion-dimension tradeoff.

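Embedding into One Dimension
For each scale $i\in\mathbb{Z}$, create a uniformly padded local probabilistic $8^i$-bounded partition $P_i$.
For each cluster $S$ choose $\sigma_i(S)\sim\mathrm{Ber}(1/2)$ i.i.d.
$f_i(x)=\sigma_i(P_i(x))\cdot\min\{\eta_i^{-1}(x)\cdot d(x,X\setminus P_i(x)),\ 8^i\}$, and $f=\sum_i f_i$.
Deterministic upper bound: $|f(x)-f(y)| \le O(\log n\cdot d(x,y))$, using [formula missing].
[Figure: a point $x$ in its cluster $P_i(x)$ and its distance $d(x,X\setminus P_i(x))$ to the rest of the space]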
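The single-scale coordinate on this slide can be sketched as follows, reading the formula as $f_i(x)=\sigma_i(P_i(x))\cdot\min\{d(x,X\setminus P_i(x))/\eta_i(x),\ 8^i\}$. The partition, the coin flips `sigma` and the padding function `eta` are supplied by the caller; how the slides actually construct them is not reproduced here, and all names are hypothetical.

```python
def coordinate(points, d, cluster_of, sigma, eta, scale):
    """One-scale coordinate: sigma(P(x)) * min(d(x, X minus P(x)) / eta(x), scale)."""
    values = {}
    for x in points:
        # Distance from x to the nearest point outside its own cluster.
        outside = [d(x, y) for y in points if cluster_of[y] != cluster_of[x]]
        boundary = min(outside) if outside else float("inf")
        values[x] = sigma[cluster_of[x]] * min(boundary / eta(x), scale)
    return values

# Hypothetical example: six points on a line split into two clusters A and B,
# with the coin flip set to 1 on A and 0 on B, eta = 1/2 and scale 8.
pts = [0, 1, 2, 3, 4, 5]
clusters = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
coins = {"A": 1, "B": 0}
f_i = coordinate(pts, lambda a, b: abs(a - b), clusters, coins, lambda x: 0.5, 8)
print(f_i[0], f_i[2], f_i[3])  # 6.0 2.0 0.0
```

Points deep inside a selected cluster get a large value, points near the cluster boundary get a small one, and clusters whose coin is 0 contribute nothing.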

Lower Bound: Overview
Create an $r_i$-net for all integers $i$.
Define a success event for a pair $(u,v)$ in the $r_i$-net with $d(u,v)\approx 8^i$: having contribution $>8^i/4$ for many coordinates.
In every coordinate, there is a constant probability of having contribution for a net pair $(u,v)$.
Use the Lovász Local Lemma.
Show the lower bound for other pairs.
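The nets used in this overview can be built greedily. The sketch below is the standard greedy $r$-net construction, given only as an illustration: the slides do not say how their nets are built, and the function name is hypothetical.

```python
def greedy_net(points, d, r):
    """Greedy r-net: net points are pairwise at distance >= r,
    and every input point is within distance < r of some net point."""
    net = []
    for p in points:
        if all(d(p, q) >= r for q in net):
            net.append(p)
    return net

# Hypothetical example: ten points on a line with r = 3.
net = greedy_net(list(range(10)), lambda a, b: abs(a - b), 3)
print(net)  # [0, 3, 6, 9]
```

A point is skipped only when some earlier net point is closer than $r$, which is exactly the covering property; the separation property holds by construction.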
Lower Bound: Other Pairs?
$x,y$ some pair, $d(x,y)\approx 8^i$.
$u,v$ the nearest points in the $r_i$-net to $x,y$.
Suppose that $|f(u)-f(v)| > 8^i/4$.
We want to choose the net such that $|f(u)-f(x)|$

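Lower Bound
[Figure: a net pair $u$, $v$]
$r_i$-net pair $(u,v)$. Can assume that $8^i \approx d(u,v)/4$.
It must be that $P_i(u)\ne P_i(v)$.
With constant probability (padding): $d(u,X\setminus P_i(u)) \ge \eta_i 8^i$.
With probability $1/4$: $\sigma_i(P_i(u))=1$ and $\sigma_i(P_i(v))=0$.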
Lower Bound: Net Pairs
$d(u,v)\approx 8^i$. Consider If R

Local Lemma
Lemma (Lovász): Let $A_1,\dots,A_n$ be "bad" events. Let $G=(V,E)$ be a directed graph with vertices corresponding to events and out-degree at most $d$. Let $c:V\to\mathbb{N}$ be a "rating" function of events such that if $(A_i,A_j)\in E$ then $c(A_i)\ge c(A_j)$. If [two conditions missing] then [conclusion missing].
Rating = radius of scale.

Lower Bound: Net Pairs
A success event $E(u,v)$ for a net pair $u,v$: there is contribution from at least $1/16$ of the coordinates.
Locality of the partition: the net pair depends only on "nearby" points, at distance $< 8^i$.
Doubling constant $\lambda$ and $r_i\approx 8^i/\log n$: there are at most $\lambda^{\log\log n}$ such points, so $d=\lambda^{\log\log n}$.
Taking $D=O(\log \lambda\cdot\log\log n)$ coordinates will give roughly $e^{-D}= \lambda^{-\log\log n}$ failure probability.
By the local lemma, there exists an embedding such that $E(u,v)$ holds for all net pairs.
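The arithmetic on this slide can be checked numerically. The sketch below sets every constant hidden in the $O(\cdot)$ notation to 1 and uses base-2 logarithms for $\log\log n$; both are assumptions made for illustration. It shows that the dependency degree $d$ and the failure probability $e^{-D}$ are reciprocal at this choice of $D$.

```python
import math

def lll_parameters(lam, n):
    """Degree bound d, number of coordinates D and failure probability e^{-D},
    with all hidden constants assumed to be 1."""
    loglog_n = math.log2(math.log2(n))
    degree = lam ** loglog_n          # d = lambda^{log log n} nearby net points
    coords = math.log(lam) * loglog_n # D = ln(lambda) * log log n
    failure = math.exp(-coords)       # e^{-D} = lambda^{-log log n}
    return degree, coords, failure

# Hypothetical values: doubling constant 4 and n = 2^16 points.
degree, coords, failure = lll_parameters(4, 2 ** 16)
print(degree, failure)  # 256.0 and roughly 1/256
```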

Obtaining Lower Dimension
To use the LLL, the probability of failing in more than $15/16$ of the coordinates must be $< \lambda^{-\log\log n}$.
Instead of taking more coordinates, increase the success probability in each coordinate.
If the probability of obtaining contribution in each coordinate is $>1-1/\log n$, it is enough to take $O(\log \lambda)$ coordinates.
Similarly, if the failure probability in each coordinate is $< \log^{-\theta} n$, it is enough to take $O((\log \lambda)/\theta)$ coordinates.
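The effect of lowering the per-coordinate failure probability can be seen from an exact binomial tail. The sketch assumes the coordinates fail independently with the same probability `q`, which is a simplification of the setting on the slide; the function name is hypothetical.

```python
import math

def prob_too_many_failures(num_coords, q, fraction=15 / 16):
    """Pr[Binomial(num_coords, q) > fraction * num_coords]: the probability that
    more than 15/16 of the coordinates fail, assuming independent coordinates."""
    threshold = math.floor(fraction * num_coords)
    return sum(
        math.comb(num_coords, k) * q ** k * (1 - q) ** (num_coords - k)
        for k in range(threshold + 1, num_coords + 1)
    )

# Hypothetical comparison with 16 coordinates and n = 2^16 (so log n = 16):
constant_failure = prob_too_many_failures(16, 0.5)     # failure probability 1/2
boosted_failure = prob_too_many_failures(16, 1 / 16)   # failure probability 1/log n
```

With the same number of coordinates, the boosted per-coordinate success probability makes the tail far smaller, which is why fewer coordinates suffice to reach the threshold the local lemma needs.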

Using Several Scales
Create nets only every $\theta\log\log n$ scales.
A pair $(x,y)$ in scale $i'$ (i.e. $d(x,y)\approx 8^{i'}$) will find a close net pair in the nearest smaller scale $i$.
[Figure: scale axis marking $8^i$, $i-\theta\log\log n$ and $i+\theta\log\log n$]

Using Several Scales
Take $u,v$ in the net with $d(u,v)\approx 8^i$.
A success in one of these scales will give contribution $> 8^{i-\theta\log\log n} = 8^i/\log^{\theta} n$.
The success for $u,v$ in each scale is:
Unaffected by higher-scale events.
Independent of events "far away" in the same scale.
Oblivious to events in lower scales.
Probability that all scales failed: [formula missing]

Constant Average Distortion
Scaling distortion for every 0 [text missing] 0 and $\epsilon=\exp\{-8^j\}$.
Every pair $(x,y)$ needs contribution that depends on:
$d(x,y)$.
The $\epsilon$-value of $x,y$.
Sieve the nets to avoid dependencies between different scales and different values of $\epsilon$.
Show that if a net pair succeeded, the points near it will also succeed.

Constant Average Distortion
Lower bound, continued.
The local lemma graph depends on $\epsilon$; use the general case of the local lemma.
For a net pair $(u,v)$ in scale $8^i$ consider scales $8^{i-\log\log(1/\epsilon)},\dots,8^{i-\log\log(1/\epsilon)/2}$.
Requires dimension $O(\log \lambda\cdot\log\log \lambda)$.
The net depends on $\lambda$.

Distortion-Dimension Tradeoff
Distortion: [formula missing]
Dimension: [formula missing]
Instead of assigning all scales to a single coordinate:
For each point $x$: divide the scales into $D$ bunches of coordinates, in each [formula missing]
Create a hierarchical partition.
$D \le (\log n)/\log \lambda$.
The upper bound needs the $x,y$ scales to be in the same coordinates.

Conclusion
Main result: embedding metrics into their intrinsic dimension.
Open problem: the best distortion in dimension $O(\log \lambda)$.
Dimension reduction in $L_2$: for a doubling subset of $L_2$, is there an embedding into $L_2$ with $O(1)$ distortion and dimension $O(\dim(X))$?
For $p>2$ there is a doubling metric space requiring dimension at least $\Omega(\log n)$ for embedding into $L_p$ with distortion $O(\log^{1/p} n)$.