Embedding Metric Spaces in Their Intrinsic Dimension
-
Embedding Metric Spaces in Their Intrinsic Dimension
Ittai Abraham, Yair Bartal*, Ofer Neiman
The Hebrew University
* also Caltech
-
Embedding Metric Spaces
Metric spaces (X,d_X), (Y,d_Y).
An embedding is a function f : X→Y.
The distortion is the minimal α such that d_X(x,y) ≤ d_Y(f(x),f(y)) ≤ α·d_X(x,y) for all x,y∈X.
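As a small illustration of the distortion definition, the sketch below computes the distortion of a map on a finite point set. The slide normalizes the embedding to be non-contracting; the code uses the scale-invariant form (largest expansion times largest contraction), which gives the same α after rescaling f. The function name `distortion` and the 4-cycle example are assumptions made for illustration, not part of the talk.

```python
import itertools
import math

def distortion(points, d_X, f, d_Y):
    """Distortion of f on a finite point set: max expansion times max contraction."""
    expansion = 0.0    # largest d_Y(f(x), f(y)) / d_X(x, y)
    contraction = 0.0  # largest d_X(x, y) / d_Y(f(x), f(y))
    for x, y in itertools.combinations(points, 2):
        dx = d_X(x, y)
        dy = d_Y(f(x), f(y))
        if dy == 0:
            return math.inf  # f collapses two distinct points
        expansion = max(expansion, dy / dx)
        contraction = max(contraction, dx / dy)
    return expansion * contraction

# Hypothetical example: the 4-cycle with shortest-path distance,
# mapped to the corners of the unit square in the Euclidean plane.
cycle = [0, 1, 2, 3]

def d_cycle(a, b):
    k = abs(a - b) % 4
    return min(k, 4 - k)

corners = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}

def d_euclid(p, q):
    return math.dist(p, q)

alpha = distortion(cycle, d_cycle, corners.get, d_euclid)
print(alpha)
```

In this example adjacent vertices keep distance 1, while opposite vertices go from distance 2 to √2, so the distortion is √2.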
-
Intrinsic Dimension
Doubling constant: the minimal λ such that any ball of radius r>0 can be covered by λ balls of radius r/2.
Doubling dimension: dim(X) = log_2 λ.
The problem: the relation between metric dimension and intrinsic dimension.
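To make the doubling constant concrete, here is a minimal sketch that estimates λ for a finite metric space by greedily covering every closed ball B(x,r) with balls of radius r/2 centered at points of the space. The greedy cover is not guaranteed to be minimal, so the result is an upper estimate of λ, not the exact value. The name `greedy_doubling_estimate` is hypothetical.

```python
import math

def greedy_doubling_estimate(points, d):
    """Upper estimate of the doubling constant via greedy covers of closed balls."""
    dists = {d(x, y) for x in points for y in points if x != y}
    # Ball contents only change at pairwise distances (for radius r)
    # and at twice those distances (for the half-radius balls).
    radii = sorted(dists | {2 * t for t in dists})
    worst = 1
    for x in points:
        for r in radii:
            uncovered = [y for y in points if d(x, y) <= r]
            count = 0
            while uncovered:
                c = uncovered[0]  # pick any uncovered point as a new center
                uncovered = [y for y in uncovered if d(c, y) > r / 2]
                count += 1
            worst = max(worst, count)
    return worst

# Hypothetical example: the uniform metric on 5 points (all distances equal 1).
uniform = list(range(5))

def d_uniform(a, b):
    return 0 if a == b else 1

lam = greedy_doubling_estimate(uniform, d_uniform)
dim_estimate = math.log2(lam)
print(lam, dim_estimate)
```

For the uniform metric a ball of radius 1 contains every point while balls of radius 1/2 are singletons, so the estimate equals the number of points.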
-
Previous Results
Given a λ-doubling finite metric space (X,d) and 0
-
Rephrasing the Question
Is there a low-distortion embedding for a finite metric space in its intrinsic dimension?
Main result : Yes.
-
Main Results
Any finite metric space (X,d) embeds into L_p:
With distortion O(log^{1+θ} n) and dimension O(dim(X)/θ), for any θ>0.
With constant average distortion and dimension O(dim(X)·log(dim(X))).
-
Additional Result
Any finite metric space (X,d) embeds into L_p:
With distortion and dimension . (For all D ≤ (log n)/dim(X).)
In particular, Õ(log^{2/3} n) distortion and dimension into L_2.
Matches the best known distortion result [KLMN 03] for D=(log n)/dim(X), with dimension O(log n · log(dim(X))).
-
Distance Oracles
A compact data structure that approximately answers distance queries.
For general n-point metrics:
[TZ 01] O(k) stretch with O(k·n^{1/k}) bits per label.
For a finite λ-doubling metric:
O(1) average stretch with Õ(log λ) bits per label.
O(k) stretch with Õ(λ^{1/k}) bits per label.
Follows from a variation on the snowflake embedding (Assouad).
-
First Result
Thm: For any finite λ-doubling metric space (X,d) on n points and any 0
-
Probabilistic Partitions
P={S_1,S_2,…,S_t} is a partition of X if the clusters S_i are pairwise disjoint and their union is X.
P(x) is the cluster containing x.
P is Δ-bounded if diam(S_i) ≤ Δ for all i.
A probabilistic partition P is a distribution over a set of partitions.
A Δ-bounded P is η-padded if for all x∈X: Pr[B(x,ηΔ) ⊆ P(x)] ≥ 1/2.
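The definitions above can be tried out on a small example. The sketch below samples a Δ-bounded random partition using the random-permutation, random-radius scheme in the style of [CKR01] (a swapped-in standard construction, not necessarily the partition used in the talk) and estimates how often the ball B(x,ηΔ) stays inside the cluster of x. The names `ckr_partition` and `is_padded`, the line metric, and the parameter values are assumptions for illustration.

```python
import random

def ckr_partition(points, d, delta, rng):
    """Sample a delta-bounded partition: random center order, random radius."""
    order = list(points)
    rng.shuffle(order)
    radius = rng.uniform(delta / 4, delta / 2)
    cluster_of = {}
    for x in points:
        # x joins the first center (in the random order) within the radius;
        # x itself is a candidate center, so a cluster is always found.
        for center in order:
            if d(x, center) <= radius:
                cluster_of[x] = center
                break
    return cluster_of

def is_padded(x, points, d, cluster_of, eta, delta):
    """True if the closed ball B(x, eta*delta) lies inside the cluster of x."""
    return all(cluster_of[y] == cluster_of[x]
               for y in points if d(x, y) <= eta * delta)

# Hypothetical example: 64 points on a line.
rng = random.Random(0)
points = list(range(64))

def d_line(a, b):
    return abs(a - b)

delta, eta, trials = 16.0, 1 / 16, 200
padded = {x: 0 for x in points}
for _ in range(trials):
    part = ckr_partition(points, d_line, delta, rng)
    for x in points:
        if is_padded(x, points, d_line, part, eta, delta):
            padded[x] += 1
min_freq = min(padded.values()) / trials
print(min_freq)  # empirical padding frequency of the worst point
```

Every cluster lies within distance Δ/2 of its center, so its diameter is at most Δ; the padding frequency is only an empirical estimate and depends on the seed.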
-
η-padded Partitions
The parameter η determines the quality of the embedding.
[Bartal 96]: η=Ω(1/log n) for any metric space.
[CKR01+FRT03]: Improved partitions with η(x)=1/log(ρ(x,Δ)).
[GKL 03]: η=Ω(1/log λ) for λ-doubling metrics.
[KLMN 03]: Used to embed general + doubling metrics into L_p: distortion O((log λ)^{1-1/p}·(log n)^{1/p}), dimension O(log^2 n).
The local growth rate of x at radius r is:
-
Uniform Local Padding Lemma
A local padding: the padding probability for x is independent of the partition outside B(x,Δ).
A uniform padding: the padding parameter η(x) is equal for all points in the same cluster.
There exists a Δ-bounded prob. partition with local uniform padding parameter η(x):
η(x) > Ω(1/log λ)
η(x) > Ω(1/log(ρ(x,Δ)))
[Figure: points v1, v2, v3 in clusters C1, C2, labeled with η(v3) and η(v1)]
-
Plan:
A simpler result of:
Distortion O(log n).
Dimension O(log log n · log λ).
Obtaining lower dimension of O(log λ).
Brief overview of:
Constant average distortion.
Distortion-dimension tradeoff.
-
Embedding into one dimension
For each scale i∈Z, create a uniformly padded local probabilistic 8^i-bounded partition P_i.
For each cluster S choose σ_i(S)~Ber(½) i.i.d.
f_i(x) = σ_i(P_i(x))·min{η_i^{-1}(x)·d(x,X\P_i(x)), 8^i}
Deterministic upper bound: |f(x)-f(y)| ≤ O(log n · d(x,y)).
[Figure: a point x inside its cluster P_i, with the distance d(x,X\P_i(x)) marked]
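The per-scale coordinate can be sketched in a few lines. The code below evaluates f_i(x) = σ_i(P_i(x))·min{d(x,X\P_i(x))/η, 8^i} for a given partition and given cluster bits. It assumes a single constant padding parameter η for all points (the talk uses a per-cluster η_i(x)) and takes the partition as input; the name `one_coordinate` and the example values are hypothetical.

```python
import math
import random

def one_coordinate(points, d, cluster_of, sigma, eta, scale):
    """f_i(x) = sigma(P_i(x)) * min(d(x, X minus P_i(x)) / eta, scale)."""
    f = {}
    for x in points:
        # Distances from x to all points outside its own cluster.
        outside = [d(x, y) for y in points if cluster_of[y] != cluster_of[x]]
        boundary = min(outside) if outside else math.inf  # one cluster covers X
        f[x] = sigma[cluster_of[x]] * min(boundary / eta, scale)
    return f

# Hypothetical example: 8 points on a line, two clusters {0..3} and {4..7},
# scale 8^1 = 8 (each cluster has diameter 3, so the partition is 8-bounded).
points = list(range(8))

def d_line(a, b):
    return abs(a - b)

cluster_of = {x: x // 4 for x in points}
rng = random.Random(1)
sigma = {c: rng.randint(0, 1) for c in (0, 1)}  # one Ber(1/2) bit per cluster
f = one_coordinate(points, d_line, cluster_of, sigma, eta=0.5, scale=8.0)
print(f)
```

A cluster whose bit is 0 is mapped entirely to 0, and inside a cluster whose bit is 1 the value grows with the distance to the cluster boundary until it is capped at the scale.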
-
Lower Bound - Overview
Create an r_i-net for all integers i.
Define a success event for a pair (u,v) in the r_i-net with d(u,v)≈8^i: having contribution >8^i/4 for many coordinates.
In every coordinate, there is a constant probability of having contribution for a net pair (u,v).
Use the Lovász Local Lemma.
Show the lower bound for other pairs.
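The first step of the overview builds an r-net. A minimal sketch of the standard greedy construction follows: scan the points and keep a point only if it is farther than r from every point kept so far. The result is r-separated and every point of the space is within distance r of some net point. The name `greedy_net` and the line example are assumptions for illustration.

```python
def greedy_net(points, d, r):
    """Greedy r-net: net points are pairwise more than r apart,
    and every point is within distance r of some net point."""
    net = []
    for x in points:
        if all(d(x, c) > r for c in net):
            net.append(x)
    return net

# Hypothetical example: 20 points on a line with r = 3.
points = list(range(20))

def d_line(a, b):
    return abs(a - b)

net = greedy_net(points, d_line, 3)
print(net)
```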
-
Lower Bound Other Pairs?
x,y some pair, d(x,y)≈8^i. u,v the nearest in the r_i-net to x,y.
Suppose that |f(u)-f(v)|>8^i/4.
We want to choose the net such that |f(u)-f(x)|
-
Lower Bound
[Figure: net points u and v]
r_i-net pair (u,v). Can assume that 8^i ≈ d(u,v)/4.
It must be that P_i(u)≠P_i(v).
With probability ½: d(u,X\P_i(u)) ≥ η_i·8^i.
With probability ¼: σ_i(P_i(u))=1 and σ_i(P_i(v))=0.
-
Lower Bound Net Pairs
d(u,v)≈8^i. Consider If R
-
Local Lemma
Lemma (Lovász): Let A_1,…,A_n be bad events. G=(V,E) is a directed graph with vertices corresponding to events, with out-degree at most d. Let c:V→N be a rating function of events such that if (A_i,A_j)∈E then c(A_i)≥c(A_j). If
and then
Rating = radius of scale.
-
Lower Bound Net Pairs
A success event E(u,v) for a net pair u,v: there is contribution from at least 1/16 of the coordinates.
Locality of the partition: the events for a net pair depend only on nearby points, at distance < 8^i.
Doubling constant λ and r_i≈8^i/log n: there are at most λ^{log log n} such points, so d=λ^{log log n}.
Taking D=O(log λ · log log n) coordinates will give roughly e^{-D}=λ^{-log log n} failure probability.
By the local lemma, there exists an embedding such that E(u,v) holds for all net pairs.
-
Obtaining Lower Dimension
To use the LLL, the probability of failing in more than 15/16 of the coordinates must be < λ^{-log log n}.
Instead of taking more coordinates, increase the success probability in each coordinate.
If the probability of obtaining contribution in each coordinate is >1-1/log n, it is enough to take O(log λ) coordinates.
Similarly, if the failure probability in each coordinate is < log^{-θ} n, it is enough to take O((log λ)/θ) coordinates.
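The trade between the number of coordinates and the per-coordinate failure probability can be checked numerically. The sketch below computes the exact binomial probability that more than 15/16 of D coordinates fail, under the simplifying assumption that coordinates fail independently with the same probability q (an assumption of this sketch, not a claim about the talk's construction). The function name `prob_too_many_failures` and the sample values are hypothetical.

```python
import math

def prob_too_many_failures(D, q):
    """P[more than 15/16 of D independent coordinates fail],
    where each coordinate fails with probability q."""
    k_min = (15 * D) // 16 + 1  # smallest failure count exceeding 15D/16
    return sum(math.comb(D, k) * q**k * (1 - q)**(D - k)
               for k in range(k_min, D + 1))

# Many coordinates with constant failure probability ...
many_coords = prob_too_many_failures(64, 0.5)
# ... versus few coordinates with a small failure probability.
few_coords = prob_too_many_failures(16, 0.05)
print(many_coords, few_coords)
```

In this toy comparison, 16 coordinates with failure probability 0.05 give a smaller overall failure probability than 64 coordinates with failure probability 0.5, which is the effect the slide relies on.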
-
Using Several Scales
Create nets only every log log n scales.
A pair (x,y) in scale i (i.e. d(x,y)≈8^i) will find a close net pair in the nearest smaller scale that has a net.
[Figure: scale axis marking 8^i, i-log log n and i+log log n]
-
Using Several Scales
Take u,v in the net with d(u,v)≈8^i.
A success in one of these scales will give contribution >8^{i-log log n}=8^i/log n.
The success for u,v in each scale is:
Unaffected by events in higher scales.
Independent of events far away in the same scale.
Oblivious to events in lower scales.
Probability that all scales failed
-
Constant Average Distortion
Scaling distortion for every 0 … 0 and ε=exp{-8^j}.
Every pair (x,y) needs contribution that depends on:
d(x,y).
The ε-value of x,y.
Sieve the nets to avoid dependencies between different scales and different values of ε.
Show that if a net pair succeeded, the points near it will also succeed.
-
Constant Average Distortion
Lower bound, continued.
The local lemma graph depends on ε; use the general case of the local lemma.
For a net pair (u,v) in scale 8^i consider scales: 8^{i-log log(1/ε)},…,8^{i-log log(1/ε)/2}.
Requires dimension O(log λ · log log λ).
The net depends on λ.
-
Distortion-Dimension Tradeoff
Distortion:
Dimension:
Instead of assigning all scales to a single coordinate:
For each point x: divide the scales into D bunches of coordinates, in each
Create a hierarchical partition.
D ≤ (log n)/log λ
Upper bound needs the x,y scales to be in the same coordinates.
-
Conclusion
Main result: embedding metrics into their intrinsic dimension.
Open problem: best distortion in dimension O(log λ).
Dimension reduction in L_2: for a doubling subset of L_2, is there an embedding into L_2 with O(1) distortion and dimension O(dim(X))?
For p>2 there is a doubling metric space requiring dimension at least Ω(log n) for embedding into L_p with distortion O(log^{1/p} n).