Embedding Metric Spaces in Their Intrinsic Dimension


Ittai Abraham, Yair Bartal*, Ofer Neiman

The Hebrew University
* also Caltech

Embedding Metric Spaces

► Metric spaces (X,d_X), (Y,d_Y).
► An embedding is a function f : X→Y.
► The distortion is the minimal α such that
  d_X(x,y) ≤ d_Y(f(x),f(y)) ≤ α·d_X(x,y) for all x,y∈X.
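The definition can be checked mechanically on a small example. The following is a minimal Python sketch, assuming a finite point set and distance functions passed as callables; the names `distortion`, `d_X`, `d_Y` and the three-point example are illustrative and do not come from the talk.

```python
from itertools import combinations

def distortion(points, d_X, d_Y, f):
    """Return the least alpha with d_X(x,y) <= d_Y(f(x),f(y)) <= alpha*d_X(x,y).

    Assumes distinct points are at positive distance, and raises ValueError
    if f contracts a pair, since the definition requires a non-contracting f.
    """
    alpha = 1.0
    for x, y in combinations(points, 2):
        dx, dy = d_X(x, y), d_Y(f(x), f(y))
        if dy < dx:
            raise ValueError(f"pair ({x}, {y}) is contracted")
        alpha = max(alpha, dy / dx)
    return alpha

# Illustrative example: the uniform metric on three points, mapped to the line.
points = ["a", "b", "c"]
position = {"a": 0.0, "b": 1.0, "c": 2.0}
alpha = distortion(
    points,
    d_X=lambda x, y: 0.0 if x == y else 1.0,
    d_Y=lambda s, t: abs(s - t),
    f=position.get,
)
print(alpha)
```

In this example no pair shrinks and the pair (a, c) is stretched by a factor of 2, so the distortion is 2.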

Intrinsic Dimension

► Doubling constant: the minimal λ such that any ball of radius r>0 can be covered by λ balls of radius r/2.
► Doubling dimension: dim(X) = log_2 λ.
► The problem: the relation between metric dimension and intrinsic dimension.
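For a finite metric the doubling constant can be estimated directly from the definition. The sketch below is only an upper estimate, not an exact computation: it assumes closed balls, restricts the centres of the covering balls to points of the space, and uses a greedy set cover, which can overshoot the optimal cover. The function names and the distance-matrix representation are illustrative assumptions.

```python
import math

def doubling_constant_estimate(dist):
    """Greedy upper estimate of the doubling constant of a finite metric.

    dist is a symmetric n x n matrix of pairwise distances.
    """
    n = len(dist)
    worst = 1
    for x in range(n):
        # The ball B(x, r) only changes when r passes a distance from x,
        # and covering is hardest at the smallest r giving a fixed ball.
        for r in sorted({dist[x][y] for y in range(n) if y != x}):
            uncovered = {y for y in range(n) if dist[x][y] <= r}
            used = 0
            while uncovered:
                def gain(c):
                    return sum(1 for y in uncovered if dist[c][y] <= r / 2)
                # Greedy step: take the half-radius ball covering the most points.
                best = max(range(n), key=gain)
                uncovered = {y for y in uncovered if dist[best][y] > r / 2}
                used += 1
            worst = max(worst, used)
    return worst

def doubling_dimension_estimate(dist):
    """log_2 of the estimated doubling constant."""
    return math.log2(doubling_constant_estimate(dist))

# Illustrative examples: two points at distance 1, and 8 points on a line.
two_points = [[0, 1], [1, 0]]
line = [[abs(a - b) for b in range(8)] for a in range(8)]
print(doubling_constant_estimate(two_points), doubling_constant_estimate(line))
```

The loop always terminates because every uncovered point is covered by the half-radius ball centred at itself.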

Previous Results

► Given a λ-doubling finite metric space (X,d) and 0<γ<1, its snowflake version (X,d^γ) can be embedded into L_p with distortion and dimension depending only on λ [Assouad 83].
► Conjecture (Assouad): This holds for γ=1.
► Disproved by Semmes.
► A lower bound of Ω(√(log n)) on the distortion for L_2, with a matching upper bound [GKL 03].

Rephrasing the Question

► Is there a low-distortion embedding for a finite metric space in its intrinsic dimension?

Main result: Yes.

Main Results

► Any finite metric space (X,d) embeds into L_p:
  ► With distortion O(log^{1+θ} n) and dimension O(dim(X)/θ), for any θ>0.
  ► With constant average distortion and dimension O(dim(X)·log(dim(X))).

Additional Result

► Any finite metric space (X,d) embeds into L_p:
  ► With distortion O(log^{1/p} n·((log n)/D)^{1−1/p}) and dimension Õ(D·dim(X)·log((log n)/D)), for all D ≤ (log n)/dim(X).
  ► In particular, Õ(log^{2/3} n) distortion and dimension into L_2.
  ► Matches the best known distortion result [KLMN 03] for D=(log n)/dim(X), with dimension O(log n·log(dim(X))).

Distance Oracles

► Compact data structure that approximately answers distance queries.
► For general n-point metrics:
  ► [TZ 01]: O(k) stretch with O(k·n^{1/k}) bits per label.
► For a finite λ-doubling metric:
  ► O(1) average stretch with Õ(log λ) bits per label.
  ► O(k) stretch with Õ(λ^{1/k}) bits per label.

Follows from a variation on the “snow-flake” embedding (Assouad).

First Result

► Thm: For any finite λ-doubling metric space (X,d) on n points and any 0<θ<1 there exists an embedding of (X,d) into L_p with distortion O(log^{1+θ} n) and dimension O((log λ)/θ).

Probabilistic Partitions

► P={S_1,S_2,…,S_t} is a partition of X if ∪_i S_i = X and S_i ∩ S_j = ∅ for all i≠j.
► P(x) is the cluster containing x.
► P is Δ-bounded if diam(S_i) ≤ Δ for all i.
► A probabilistic partition P is a distribution over a set of partitions.
► A Δ-bounded P is η-padded if for all x∈X:
  Pr[B(x,η·Δ) ⊆ P(x)] ≥ 1/2.

η-padded Partitions

► The parameter η determines the quality of the embedding.
► [Bartal 96]: η=Ω(1/log n) for any metric space.
► [CKR01+FRT03]: Improved partitions with η(x)=1/log(ρ(x,Δ)).
► [GKL 03]: η=Ω(1/log λ) for λ-doubling metrics.
► [KLMN 03]: Used to embed general + doubling metrics into L_p: distortion O((log λ)^{1−1/p}·(log n)^{1/p}), dimension O(log^2 n).
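To make the notions of a Δ-bounded probabilistic partition and of padding concrete, here is a hedged Python sketch of one standard way to sample such a partition, in the style of the random-permutation, random-radius construction associated with [CKR01] and [FRT03]. It is a stand-in for illustration, not the uniformly padded local partition constructed in this talk. The function names, the radius range [Δ/4, Δ/2] and the Monte Carlo estimator are assumptions of the sketch.

```python
import random

def ckr_partition(dist, delta, rng):
    """Sample one delta-bounded partition; returns cluster[x] = centre of P(x).

    dist: symmetric n x n distance matrix; rng: a random.Random instance.
    """
    n = len(dist)
    order = list(range(n))
    rng.shuffle(order)                          # random priority on centres
    radius = rng.uniform(delta / 4, delta / 2)  # one random radius for all
    # Each point joins the first centre in the random order within the radius.
    # The point itself is always a candidate, so a centre is always found.
    return [next(c for c in order if dist[c][x] <= radius) for x in range(n)]

def padding_probability(dist, delta, eta, x, trials=2000, seed=0):
    """Monte Carlo estimate of Pr[B(x, eta*delta) is contained in P(x)]."""
    rng = random.Random(seed)
    ball = [y for y in range(len(dist)) if dist[x][y] <= eta * delta]
    hits = 0
    for _ in range(trials):
        cluster = ckr_partition(dist, delta, rng)
        hits += all(cluster[y] == cluster[x] for y in ball)
    return hits / trials

# Illustrative example: 16 equally spaced points on a line, delta = 4.
n = 16
dist = [[abs(a - b) for b in range(n)] for a in range(n)]
sample = ckr_partition(dist, 4.0, random.Random(0))
print(sample, padding_probability(dist, 4.0, 0.125, x=5, trials=200))
```

Every cluster has diameter at most Δ because all its points lie within Δ/2 of a common centre.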

The local growth rate of x at radius r is:
  ρ(x,r) = |B(x,64r)| / |B(x,r/64)|
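The growth rate is easy to compute for a finite metric, and summing log ρ(x,8^i) over scales telescopes, because 64 = 8^2 makes each ball size appear in only a few consecutive terms. A minimal Python sketch, assuming closed balls and a distance-matrix representation (both assumptions of the sketch, as are the function names):

```python
import math

def ball_size(dist, x, r):
    """Number of points y with d(x, y) <= r (closed ball)."""
    return sum(1 for d in dist[x] if d <= r)

def local_growth_rate(dist, x, r):
    """rho(x, r) = |B(x, 64 r)| / |B(x, r / 64)|."""
    return ball_size(dist, x, 64 * r) / ball_size(dist, x, r / 64)

# Illustrative example: 100 equally spaced points on a line.
n = 100
dist = [[abs(a - b) for b in range(n)] for a in range(n)]
x = 0
total = sum(math.log2(local_growth_rate(dist, x, 8.0 ** i)) for i in range(-4, 8))
print(total, 4 * math.log2(n))
```

Each term is log|B(x,8^{i+2})| − log|B(x,8^{i−2})|, so the sum over any range of scales is at most 4·log n.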

Uniform Local Padding Lemma

► A local padding: the padding probability for x is independent of the partition outside B(x,Δ).
► A uniform padding: the padding parameter η(x) is equal for all points in the same cluster.
► There exists a Δ-bounded prob. partition with local uniform padding parameter η(x):
  ► η(x) > Ω(1/log λ)
  ► η(x) > Ω(1/log(ρ(x,Δ)))

[Figure: clusters C1 and C2 containing points v1, v2, v3, with padding parameters η(v1) and η(v3) marked.]

Plan:

► A simpler result of:
  ► Distortion O(log n).
  ► Dimension O(loglog n·log λ).
► Obtaining lower dimension of O(log λ).
► Brief overview of:
  ► Constant average distortion.
  ► Distortion-dimension tradeoff.

Embedding into one dimension

► For each scale i∈Z, create a uniformly padded local probabilistic 8^i-bounded partition P_i.
► For each cluster choose σ_i(S)~Ber(½) i.i.d.
► f_i(x) = σ_i(P_i(x))·min{η_i^{-1}(x)·d(x,X\P_i(x)), 8^i}, and f(x) = Σ_i f_i(x).
► Deterministic upper bound:
  |f(x)−f(y)| ≤ O(log n·d(x,y)),
  using Σ_i η_i^{-1}(x) ≤ O(Σ_i log ρ(x,8^i)) = O(log n).

[Figure: a point x inside its cluster P_i(x), with the distance d(x,X\P_i(x)) to the rest of the space marked.]
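A single scale of this one-dimensional map can be sketched in a few lines. The Python below assumes the partition and the padding parameters are already given (it does not construct the uniformly padded partition). The names `coordinate`, `cluster`, `eta` and `scale` are illustrative. It also checks the per-scale Lipschitz bound |f_i(x)−f_i(y)| ≤ d(x,y)/η that follows when η is the same for all points of a cluster, which is the role of uniform padding in the upper bound.

```python
import random

def coordinate(dist, cluster, eta, scale, rng):
    """f_i(x) = sigma_i(P_i(x)) * min(d(x, X minus P_i(x)) / eta_i(x), 8^i).

    dist: n x n distance matrix; cluster[x]: label of P_i(x);
    eta[x]: padding parameter of x; scale: the diameter bound 8^i.
    """
    n = len(dist)
    # One fair coin per cluster, i.e. sigma_i(S) ~ Ber(1/2).
    sigma = {c: rng.randint(0, 1) for c in sorted(set(cluster))}
    f = []
    for x in range(n):
        outside = [dist[x][y] for y in range(n) if cluster[y] != cluster[x]]
        # If the cluster is all of X, the distance to its complement is infinite.
        boundary = min(outside) if outside else float("inf")
        f.append(sigma[cluster[x]] * min(boundary / eta[x], scale))
    return f

# Illustrative example: 8 points on a line split into two clusters.
n = 8
dist = [[abs(a - b) for b in range(n)] for a in range(n)]
cluster = [0, 0, 0, 0, 1, 1, 1, 1]
eta = [0.5] * n
f = coordinate(dist, cluster, eta, 8.0, random.Random(1))
print(f)
```

Within a cluster the map is a truncated, rescaled distance to the complement, and across clusters each value is at most d(x,y)/η, so the bound holds for every pair whatever the coin flips are.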

Lower Bound - Overview

► Create an r_i-net for all integers i.
► Define a success event for a pair (u,v) in the r_i-net, d(u,v)≈8^i: as having contribution > 8^i/4, for many coordinates.
► In every coordinate, a constant probability of having contribution for a net pair (u,v).
► Use the Lovász Local Lemma.
► Show the lower bound for other pairs.
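The nets used in the first step can be built greedily. A minimal Python sketch, assuming the common convention that an r-net is a set of points pairwise more than r apart such that every point of the space is within distance r of it; the function name and the line example are illustrative:

```python
def greedy_net(dist, r):
    """Return an r-net of the finite metric given by the matrix dist.

    Net points are pairwise more than r apart, and every point of the
    space is within distance r of some net point.
    """
    net = []
    for x in range(len(dist)):
        # Keep x only if it is far from everything chosen so far.
        if all(dist[x][u] > r for u in net):
            net.append(x)
    return net

# Illustrative example: 10 equally spaced points on a line, r = 2.
n = 10
dist = [[abs(a - b) for b in range(n)] for a in range(n)]
net = greedy_net(dist, 2)
print(net)
```

A point that is skipped was within r of an earlier net point, which gives the covering property. The separation property holds by construction.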

Lower Bound:

► r_i-net pair (u,v). Can assume that 8^i ≈ d(u,v)/4.
► It must be that P_i(u)≠P_i(v).
► With probability ½: d(u,X\P_i(u)) ≥ η_i(u)·8^i.
► With probability ¼: σ_i(P_i(u))=1 and σ_i(P_i(v))=0.
► Then f_i(u) − f_i(v) ≥ η_i^{-1}(u)·η_i(u)·8^i − 0 = 8^i.

[Figure: net points u and v, with the scale 8^i marked at u.]

Lower Bound – Net Pairs

► d(u,v)≈8^i. Consider R = |Σ_{j>i} (f_j(u) − f_j(v))|.
► If R < 8^i/2: with prob. 1/8, f_i(u) − f_i(v) ≥ 8^i.
► If R ≥ 8^i/2: with prob. 1/4, f_i(u) = f_i(v) = 0.
► In any case: |Σ_{j≥i} (f_j(u) − f_j(v))| ≥ 8^i/2.
► Lower scales do not matter: Σ_{j<i} |f_j(u) − f_j(v)| ≤ Σ_{j<i} 8^j < 8^i/4.

[Figure: net points u and v, with the padding radius η_i(u)·8^i marked around u.]

The good event for a pair in scale i depends on higher scales, but has constant probability given any outcome for them. It is oblivious to lower scales.
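The two cases and the lower-scale bound combine into the contribution claimed for a net pair. A short derivation, assuming only that f = Σ_j f_j and that 0 ≤ f_j ≤ 8^j for every scale j (which follows from the min with 8^j in the definition of f_j); the first inequality on the second line holds on the constant-probability event described above:

```latex
\begin{align*}
|f(u)-f(v)|
  &\ge \Bigl|\sum_{j\ge i}\bigl(f_j(u)-f_j(v)\bigr)\Bigr|
     - \sum_{j<i}\bigl|f_j(u)-f_j(v)\bigr| \\
  &\ge \frac{8^i}{2} - \sum_{j<i} 8^j
   = \frac{8^i}{2} - \frac{8^i}{7}
   = \frac{5}{14}\,8^i
   > \frac{8^i}{4}.
\end{align*}
```

The geometric sum Σ_{j<i} 8^j equals 8^i/7, which is why scales below i cannot cancel the contribution from scale i and above.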

Local Lemma

► Lemma (Lovász): Let A_1,…,A_n be “bad” events. Let G=(V,E) be a directed graph whose vertices correspond to the events, with out-degree at most d. Let c:V→N be a “rating” function of the events such that (A_i,A_j)∈E implies c(A_i) ≥ c(A_j). If
  Pr[A_i | ∧_{j∈Q} ¬A_j] ≤ p for all Q ⊆ {j : (A_i,A_j)∉E, c(A_i) ≥ c(A_j)}
and
  e·p·(d+1) ≤ 1
then
  Pr[∧_{j∈[n]} ¬A_j] > 0.

Rating = radius of scale.

Lower Bound – Net Pairs

► A success event E(u,v) for a net pair u,v: there is contribution from at least 1/16 of the coordinates.
► Locality of partition – the net pair depends only on “nearby” points, with distance < 8^i.
► Doubling constant λ, and r_i≈8^i/log n - there are at most λ^{loglog n} such points, so d=λ^{loglog n}.
► Taking D=O(log λ·loglog n) coordinates will give roughly e^{−D} = λ^{−loglog n} failure probability.
► By the local lemma, there exists an embedding such that E(u,v) holds for all net pairs.

Obtaining Lower Dimension

► To use the LLL, the probability of failing in more than 15/16 of the coordinates must be < λ^{−loglog n}.
► Instead of taking more coordinates, increase the success probability in each coordinate.
► If the probability of obtaining a contribution in each coordinate is > 1−1/log n, it is enough to take O(log λ) coordinates.

Similarly, if the failure probability in each coordinate is < log^{−θ} n, it is enough to take O((log λ)/θ) coordinates.
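The coordinate count can be checked with a union bound. The derivation below is a sketch under stated assumptions: the D coordinates fail independently, each with probability at most q ≤ log^{−θ} n; logarithms are base 2; θ·loglog n ≥ 2; and the constants of the local lemma are ignored.

```latex
\begin{align*}
\Pr\bigl[\text{more than } \tfrac{15}{16}D \text{ coordinates fail}\bigr]
  &\le \binom{D}{\lceil 15D/16\rceil}\, q^{15D/16}
   \le 2^{D}\, q^{15D/16} \\
  &\le 2^{D}\,(\log n)^{-15\theta D/16}
   \le (\log n)^{-7\theta D/16}, \\
\lambda^{-\log\log n} &= (\log n)^{-\log\lambda},
   \qquad\text{so } D \ge \tfrac{16}{7}\cdot\tfrac{\log\lambda}{\theta}
   \text{ suffices.}
\end{align*}
```

The step from 2^D·(log n)^{−15θD/16} to (log n)^{−7θD/16} uses 2^D ≤ (log n)^{θD/2}, which is exactly the assumption θ·loglog n ≥ 2. Setting θ=1 recovers the O(log λ) count for per-coordinate success probability 1−1/log n.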

Using Several Scales

► Create nets only every θ·loglog n scales.
► A pair (x,y) in scale i′ (i.e. d(x,y)≈8^{i′}) will find a close net pair in the nearest smaller scale i.
► 8^{i′} < log^θ n·8^i, so lose a factor of log^θ n in the distortion.
► Consider scales i−θ·loglog n,…,i.

[Figure: axis of scales marked i−θ·loglog n, i and i+θ·loglog n, with scale i′ less than θ·loglog n above i.]

Using Several Scales

► Take u,v in the net with d(u,v)≈8^i.
► A success in one of these scales will give contribution > 8^{i−θ·loglog n} = 8^i/log^θ n.
► The success for u,v in each scale is:
  ► Unaffected by higher scales events.
  ► Independent of events “far away” in the same scale.
  ► Oblivious to events in lower scales.
► Probability that all scales failed < (7/8)^{θ·loglog n}.
► Take only D=O((log λ)/θ) coordinates.

Lose a factor of log^θ n in the distortion.

[Figure: axis of scales marked i−θ·loglog n, i and i+θ·loglog n.]
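The bound on the probability that all scales fail has the form needed for the low-dimension argument. A one-line check, assuming logarithms base 2 and using the identity a^{log b} = b^{log a}:

```latex
\[
\Bigl(\tfrac{7}{8}\Bigr)^{\theta\log\log n}
  = 2^{-\theta\,\log(8/7)\,\log\log n}
  = (\log n)^{-\theta\log(8/7)}
  = \log^{-\Omega(\theta)} n .
\]
```

So the per-coordinate failure probability is polylogarithmically small with exponent proportional to θ, which is the condition under which O((log λ)/θ) coordinates are enough.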

Constant Average Distortion

► Scaling distortion – for every 0<ε<1 at most ε·n^2 pairs with distortion > polylog(1/ε).
► Upper bound of log(1/ε), by standard techniques.
► Lower bound:
  ► Define a net for any scale i>0 and ε=exp{−8^j}.
  ► Every pair (x,y) needs contribution that depends on:
    ► d(x,y).
    ► The ε-value of x,y.
  ► Sieve the nets to avoid dependencies between different scales and different values of ε.
  ► Show that if a net pair succeeded, the points near it will also succeed.

Constant Average Distortion

► Lower bound cont…
  ► The local lemma graph depends on ε, use the general case of the local lemma.
  ► For a net pair (u,v) in scale 8^i – consider scales: 8^{i−loglog(1/ε)},…,8^{i−loglog(1/ε)/2}.
  ► Requires dimension O(log λ·loglog λ).

The net depends on λ.

Distortion-Dimension Tradeoff

► Distortion: O(log^{1/p} n·((log n)/D)^{1−1/p}).
► Dimension: Õ(D·log λ·log((log n)/D)), for D ≤ (log n)/log λ.
► Instead of assigning all scales to a single coordinate:
  ► For each point x: divide the scales into D bunches of coordinates, in each
    Σ_{i∈Bunch} η_i^{-1}(x) ≤ (log n)/D.
  ► Create a hierarchical partition.

The upper bound needs the scales of x and y to be in the same coordinates.

Conclusion

► Main result: Embedding metrics into their intrinsic dimension.
► Open problem:
  ► Best distortion in dimension O(log λ).
  ► Dimension reduction in L_2: For a doubling subset of L_2, is there an embedding into L_2 with O(1) distortion and dimension O(dim(X))?

For p>2 there is a doubling metric space requiring dimension at least Ω(log n) for embedding into L_p with distortion O(log^{1/p} n).
