Konstantinos Tsakalidis 1 Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency...

Post on 17-Dec-2015


Konstantinos Tsakalidis

1

Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency

Konstantinos Tsakalidis

PhD Defense, 23 September 2011

Κωνσταντίνος Τσακαλίδης

2000-2006 B.Eng., Computer Engineering and Informatics Dept., University of Patras, Greece

Summer 2007 Intern, Google Inc., Mountain View, California, USA

2007-2009 Ph.D. Student (Part A), MADALGO, Aarhus University, Denmark

Summer 2010 Visiting Prof. Ian Munro, David R. Cheriton School of Computer Science, University of Waterloo, Canada

2009-2011 Ph.D. Student (Part B)

Overview

Dynamic Planar Orthogonal 3-Sided Range Reporting Queries
[ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
[ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”

Dynamic Planar Orthogonal Range Maxima Reporting Queries
[ICALP ’11] “Dynamic Planar Range Maxima Queries”

Multi-Versioned Indexed Databases
[SODA ’12] “Fully Persistent B-Trees”

Databases and Geometry

Name     Age  Salary  Date     Phone     …
Andreas  30   5.500   2/2010   555-4321  …
Maria    38   6.500   4/1998   555-3214  …
John     25   3.000   5/2011   555-2143  …
Helen    34   4.000   1/2000   555-1432  …
Jacob    28   7.000   11/1989  555-1234  …

N points, D dimensions (Name, Age, Salary, Date, Phone)

Planar (D=2) Euclidean Space

Query Operation
• Question about stored data

Update Operation/Transaction
• Insert/Delete Tuple
• Change Value

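Models of Computation

Model         Pointer Machine           word-RAM                 I/O Model [Aggarwal, Vitter ’88]
Storage unit  Record with O(1) fields   Cell of w bits           Block of B words
Space         #Occupied Records         #Occupied Cells          #Occupied Blocks
Time          #Arithmetic Operations    #Arithmetic Operations   #I/O Operations
              + #Pointer Traversals     + #Cell READ/WRITEs

word-RAM: each cell holds w bits and is accessed in O(1) time.

I/O Model: the N input elements occupy N/B blocks on disk; memory holds M < N elements in M/B blocks; one I/O operation transfers a block of B words between memory and disk.

[Figure labels: Memory, Disk, specialized database]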
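To make the cost measure of the I/O model concrete, the following minimal sketch counts block transfers for a sequential scan. The function name `scan_ios` is hypothetical and only illustrates the N/B term; it is not taken from the talk.

```python
def scan_ios(num_elements: int, block_size: int) -> int:
    """I/O operations needed to read num_elements stored contiguously on
    disk, when one I/O operation transfers a block of block_size words."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Ceiling of num_elements / block_size using integer arithmetic.
    return -(-num_elements // block_size)


# Reading 1000 elements costs 1000 pointer traversals on a pointer machine,
# but only ceil(1000 / 64) block transfers in the I/O model.
ios_for_scan = scan_ios(1000, 64)
```

The same ceiling appears as the t/B term in the query bounds later in the talk: reporting t consecutive elements costs about t/B I/Os rather than t.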

Orthogonal Range Reporting Queries

Contour Query: report all points with Salary > 1000

Dominance Query: report all points with Salary > 1000 and Age > 35

3-Sided Query: report all points with 2000 > Salary > 1000 and Age > 35

[Figure: Employees plotted by Salary and Age, with query bounds 1000, 2000 and 35]
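The three query types can be stated precisely with a brute-force sketch. It scans every point, so it only defines what each query returns and is not one of the data structures discussed in the talk; the function names and the sample `employees` list are made up for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (salary, age)


def contour_query(points: List[Point], salary_min: float) -> List[Point]:
    """Points with Salary > salary_min."""
    return [p for p in points if p[0] > salary_min]


def dominance_query(points: List[Point], salary_min: float,
                    age_min: float) -> List[Point]:
    """Points with Salary > salary_min and Age > age_min."""
    return [p for p in points if p[0] > salary_min and p[1] > age_min]


def three_sided_query(points: List[Point], salary_min: float,
                      salary_max: float, age_min: float) -> List[Point]:
    """Points with salary_max > Salary > salary_min and Age > age_min."""
    return [p for p in points
            if salary_min < p[0] < salary_max and p[1] > age_min]


# Hypothetical sample data: (salary, age) pairs.
employees = [(500, 40), (1500, 30), (1500, 40), (2500, 50)]
```

Each query restricts one more side of the range than the previous one, which is why a structure for 3-sided queries also answers contour and dominance queries.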

Worst-Case Efficient Dynamic 3-Sided Range Reporting

Pointer Machine (Space, Query Time, Update Time)
Priority Search Tree [McCreight ’85]

word-RAM (Space, Query Time, Update Time)
Fusion Tree [Willard ’00]
[Mortensen ’06]

I/O Model (Space, Query I/Os, Update I/Os)
External Priority Search Tree [Arge ’99]: amortized

Average-Case Efficient Dynamic 3-Sided Range Reporting

word-RAM (Space, Query Time, Update Time)
[ICDT ’10]: expected w.h.p.
[ICDT ’10]: expected w.h.p.; expected w.h.p.
[ISAAC ’09]: query expected w.h.p.; update expected amortized
Input distributions: X, Y: μ-random; X: smooth, Y: restricted; X: smooth

I/O Model (Space, Query I/Os, Update I/Os)
[ICDT ’10]: amortized expected w.h.p.
[ICDT ’10]: expected w.h.p.; amortized expected w.h.p.
[ISAAC ’09]: query expected w.h.p.; update amortized expected
Input distributions: X, Y: μ-random; X: smooth, Y: restricted; X: smooth

Probabilistic Distributions

μ-Random: unknown, non-changing probability distribution

(f,g)-Smooth distribution
The probability mass of a subinterval does not exceed a specific bound, no matter how small the subinterval
Includes regular and uniform distributions
Any distribution is (f,Θ(n))-smooth

Restricted class of distributions
Few elements occur very often
Many elements occur rarely
Includes Zipfian and power-law distributions

Priority Search Tree [McCreight ’85]

Move up maximum Y

Space: O(n)

Update: O(log n)

Pointer Machine
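The "move up maximum Y" rule can be sketched as a small static priority search tree. This is a simplified illustration under stated assumptions: it is built once from a list of points with a simple quadratic-time construction, it does not implement the O(log n) updates from the slide, and the names `Node`, `build`, `query` and `three_sided` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


@dataclass
class Node:
    point: Point                    # maximum-y point of this subtree, moved up
    split: float                    # x-value separating left and right subtrees
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: List[Point]) -> Optional[Node]:
    """Build from points sorted by x. The maximum-y point moves up to the
    root; the remaining points are split in half by x."""
    if not points:
        return None
    top = max(range(len(points)), key=lambda i: points[i][1])
    rest = points[:top] + points[top + 1:]
    mid = len(rest) // 2
    split = rest[mid][0] if rest else points[top][0]
    return Node(points[top], split, build(rest[:mid]), build(rest[mid:]))


def query(node: Optional[Node], x1: float, x2: float, y0: float,
          out: List[Point]) -> None:
    """Collect points with x1 <= x <= x2 and y >= y0."""
    if node is None or node.point[1] < y0:
        return  # heap order on y: nothing below this node can qualify
    if x1 <= node.point[0] <= x2:
        out.append(node.point)
    if x1 <= node.split:            # left subtree holds x <= split
        query(node.left, x1, x2, y0, out)
    if x2 >= node.split:            # right subtree holds x >= split
        query(node.right, x1, x2, y0, out)


def three_sided(points: List[Point], x1: float, x2: float,
                y0: float) -> List[Point]:
    out: List[Point] = []
    query(build(sorted(points)), x1, x2, y0, out)
    return sorted(out)
```

Because every node stores the highest point of its subtree, the search can stop descending as soon as a node falls below the query's y-bound, which is what keeps the reporting cost proportional to the output size.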

Query by X-Coordinate: log n + t

Pointer Machine

[Figure labels: Path, Subtrees, InX(s), O(log n)]

Query by Y-Coordinate: log n + t

Pointer Machine / word-RAM

1D Range Maximum Queries (Children) [Alstrup, Brodal, Rauhe ’00]: O(1) time

Find next point to be reported in O(1) time

[Figure: node u with children ul and ur]

[ISAAC ’09]

Update: O(log log n) expected amortized
Query: O(log log n + t) expected w.h.p.
Space: O(n)

Weight_i = Θ(2^(2^i))

O(log log n) expected w.h.p. [Mehlhorn, Tsakalidis ’93, Kaporis et al. ’06]

[Andersson, Thorup ’07]

RMQ

O(1) expected amortized

word-RAM
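A short worked derivation shows why doubly exponential weights lead to doubly logarithmic height. It assumes the weight on the slide reads Weight_i = Θ(2^(2^i)), that is, a subtree at level i holds roughly 2^(2^i) elements, and it ignores the constants hidden in the Θ.

```latex
w_i = \Theta\!\left(2^{2^{i}}\right), \qquad
2^{2^{h}} \ge n
\;\iff\; 2^{h} \ge \log_2 n
\;\iff\; h \ge \log_2 \log_2 n
\quad\Longrightarrow\quad
h = \lceil \log_2 \log_2 n \rceil = O(\log\log n).
```

For example, with n = 2^16 = 65536 elements the smallest level whose weight reaches n is h = log_2 log_2 65536 = log_2 16 = 4.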

Average-Case Efficient Dynamic 3-Sided Range Reporting

X: smooth

word-RAM (Space, Query Time, Update Time)
[ISAAC ’09]: query expected w.h.p.; update expected amortized

I/O Model (Space, Query I/Os, Update I/Os)
[ISAAC ’09]: query expected w.h.p.; update amortized expected

Orthogonal Range MAXIMA Reporting Queries
or “Generalized Planar SKYLINE Operator”

Dominates: is “above”
Maximal point: is NOT dominated
Interesting points: oldest and best paid

Dominance Maxima Queries: report all maximal points among points with x in [xl,+∞) and y in [yb,+∞)

Contour Maxima Queries: report all maximal points among points with x in (-∞, xl]

3-Sided Maxima Queries: report all maximal points among points with x in [xl, xr] and y in [yb,+∞)

4-Sided Maxima Queries: report all maximal points among points with x in [xl, xr] and y in [yb, yt]

[Figure: Employees plotted by Salary and Age, with query bounds xl, xr, yb, yt]
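The maxima queries above can be pinned down with a brute-force sketch: filter the points by the query range, then keep those not dominated by another point in the range. It assumes distinct points and treats p as dominating q when p is at least as large in both coordinates. The names `maxima` and `range_maxima` are hypothetical, and the sketch re-sorts the points on every query, so it is a specification aid rather than the dynamic structure from the talk.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y)
INF = float("inf")


def maxima(points: List[Point]) -> List[Point]:
    """Points not dominated by any other point, in decreasing x order."""
    result: List[Point] = []
    best_y = -INF
    # Sweep from right to left; a point is maximal exactly when its y
    # exceeds every y seen so far (all of which have x at least as large).
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result


def range_maxima(points: List[Point], xl: float = -INF, xr: float = INF,
                 yb: float = -INF, yt: float = INF) -> List[Point]:
    """Maximal points among those with xl <= x <= xr and yb <= y <= yt."""
    inside = [p for p in points if xl <= p[0] <= xr and yb <= p[1] <= yt]
    return maxima(inside)


# Hypothetical sample data.
sample = [(1, 6), (2, 3), (3, 5), (4, 2), (5, 4), (6, 1)]
```

Leaving bounds at their defaults gives the less restricted queries: only `xl` and `yb` set is a dominance maxima query, `xl`, `xr` and `yb` set is a 3-sided one, and all four set is a 4-sided one.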

Worst-Case Efficient Dynamic Range MAXIMA Reporting

Pointer Machine             Query            Query                     Insert  Delete
Overmars, van Leeuwen ’81   log n + t        -                         log²n   log²n
Frederickson, Rodger ’90    log n + t        log²n + t, log n (1 + t)  log n   log²n
Janardan ’91                log n + t        log n + t                 log n   log²n
Kapoor ’00                  log n + t amo.   -                         log n   log n
[ICALP ’11]                 log n + t        log n + t                 log n   log n

word-RAM                                                               Insert  Delete
[ICALP ’11]

Tournament Tree

Copy up maximum Y

Y-Winning Paths

Pointer Machine
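The difference from the priority search tree is that the winner is copied up rather than moved, so it stays in its leaf and reappears along a path of internal nodes. The following array-based sketch illustrates this under the assumption of distinct points; the helper names `build_tournament` and `winning_path` are hypothetical, and no updates are implemented.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


def _winner(a: Optional[Point], b: Optional[Point]) -> Optional[Point]:
    """The point with the larger y; None stands for an empty subtree."""
    if a is None:
        return b
    if b is None:
        return a
    return a if a[1] >= b[1] else b


def build_tournament(points: List[Point]) -> Tuple[List[Optional[Point]], int]:
    """Leaves hold the points sorted by x; every internal node holds a copy
    of the maximum-y point among its two children."""
    leaves = sorted(points)
    size = 1
    while size < len(leaves):
        size *= 2
    tree: List[Optional[Point]] = [None] * (2 * size)
    for i, p in enumerate(leaves):
        tree[size + i] = p
    for u in range(size - 1, 0, -1):   # children of u are 2u and 2u + 1
        tree[u] = _winner(tree[2 * u], tree[2 * u + 1])
    return tree, size


def winning_path(tree: List[Optional[Point]], size: int, u: int) -> List[int]:
    """Nodes from u down to the leaf whose point was copied up to u."""
    path = [u]
    while u < size:
        u = 2 * u if tree[2 * u] == tree[u] else 2 * u + 1
        path.append(u)
    return path
```

In this layout the root is node 1, and the Y-winning path of a node is the chain of nodes that all store the same copied point.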

Tournament Tree

Pointer Machine

Find next point to be reported in O(1) time

[Figure labels: u, Right(u), MAX( )]

3-Sided Range Maxima Queries

Query Time: log n + t

Pointer Machine

[Figure labels: MAX( ), Subtrees (Paths), O(log n)]

Update Operation

Pointer Machine

Previous Update: O(log²n)

Update Operation

Pointer Machine

Priority Queue with Attrition [Sundar ’89]: O(1) time

MAX(Right(u)), MAX(Right(uL)), MAX(Right(uR))

[Figure: node u with children uL and uR]
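The behaviour of a priority queue with attrition can be illustrated with a deque. This is a simplified sketch: it achieves O(1) amortized time per operation, not the O(1) worst-case time of the structure cited on the slide, and it is written for minima, so a maxima variant would flip the comparison. The class and method names are hypothetical.

```python
from collections import deque
from typing import Deque


class AttritionQueue:
    """Queue whose contents are always increasing from front to back."""

    def __init__(self) -> None:
        self._items: Deque[float] = deque()

    def insert_and_attrite(self, x: float) -> None:
        """Insert x and discard every stored element that is >= x."""
        while self._items and self._items[-1] >= x:
            self._items.pop()
        self._items.append(x)

    def find_min(self) -> float:
        """The smallest element, which is always at the front."""
        return self._items[0]

    def delete_min(self) -> float:
        """Remove and return the smallest element."""
        return self._items.popleft()

    def __len__(self) -> int:
        return len(self._items)
```

Attrition mirrors domination: a newly inserted element wipes out the stored elements it makes irrelevant, much as a point removes the points it dominates from a list of maxima.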

Update Operation

Pointer Machine

Reconstruct, Rollback

Partially Persistent Priority Queue with Attrition
O(1) time, space overhead per update step
[Brodal ’96]: worst case
[Driscoll et al. ’89]: amortized

Space: O(n)
Update: O(log n)

[ICALP ’11]

                  Space     Query       Insert   Delete
Pointer Machine   n         log n + t   log n    log n
word-RAM          n

                  Space     Query       Insert   Delete
Pointer Machine   n log n   log²n + t   log²n    log²n

Rectangular Visibility Queries

Proximity Queries/Similarity Search

4 × 4-Sided Range Maxima Queries, one per quadrant: (+∞,+∞), (+∞,-∞), (-∞,+∞), (-∞,-∞)

Worst-Case Efficient 4-Sided Range MAXIMA Reporting and Rectangular Visibility Queries

Pointer Machine      Space     Query             Insert   Delete
Overmars, Wood ’88   n log n   log²n + t         log²n    log³n
Overmars, Wood ’88   n log n   log²n + t log n   log²n    log²n
[ICALP ’11]          n log n   log²n + t         log²n    log²n

B-Trees [Bayer, McCreight ’72]

Indexed Database

Name     Age  Salary  …
Andreas  30   5.500   …
Maria    38   6.500   …
John     25   3.000   …
Helen    34   4.000   …
Jacob    28   7.000   …

Space: O(N/B) blocks
Update: O(log_B N) I/Os
Access: O(log_B N) I/Os

Multi-Versioned Databases
Btrfs
Data Platform
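The O(log_B N) access bound is easiest to appreciate with numbers. The sketch below computes the number of levels of an idealized B-tree whose nodes all have fan-out exactly B, which is an assumption made for illustration, since real B-tree nodes have fan-out between roughly B/2 and B. The function name `btree_levels` is hypothetical.

```python
def btree_levels(num_keys: int, fanout: int) -> int:
    """Smallest h >= 1 with fanout**h >= num_keys, i.e. the number of
    blocks read on a root-to-leaf search in an idealized B-tree."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


# A billion keys: a fan-out of 1024 needs 3 block reads per search,
# while a binary tree (fan-out 2) needs 30 node visits.
levels_btree = btree_levels(10**9, 1024)
levels_binary = btree_levels(10**9, 2)
```

The integer loop avoids floating-point logarithms, which can round the wrong way when the number of keys is an exact power of the fan-out.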

Fully Persistent B-Trees

I/O Model         Space   Query I/Os                 Update I/Os (amortized)
Lanka, Mays ’91   n/B     (log_B n + t/B) log_B m    log_B n · log_B m
[SODA ’12]        n/B     log_B n + t/B              log_B n + log_2 B

n: elements in one version
m: update operations = #versions
B: block size

[SODA ’12]

Incremental B-Trees (Lazy Updates)
O(log_B N) READs
O(1) WRITEs that make O(1) changes to a block

I/O-Efficient Full Persistence
Interface of primitive operations: READ, WRITE, ACCESS, NEW_NODE, NEW_VERSION
Input is a pointer-based structure
Node occupies O(1) blocks
Node has indegree O(1)
O(1) I/O-overhead per access to a block
O(log_2 B) I/O-overhead per change to a block
Node-Splitting Method [Driscoll et al. ’89]

Result
Space: O(N/B)
Query: O(log_B N + t/B) I/Os
Update: O(log_B N + log_2 B) I/Os
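What full persistence means for an indexed set can be shown with a deliberately naive baseline: every update copies the whole version it starts from, so any old version can still be queried and updated, and the versions form a tree. This is not the node-splitting method named on the slide and it uses space linear in the set size per update; the class `NaiveFullyPersistentSet` and its methods are hypothetical names used only to illustrate the semantics.

```python
import bisect
from typing import List, Optional, Tuple


class NaiveFullyPersistentSet:
    """Sorted set in which every version stays readable and updatable."""

    def __init__(self) -> None:
        self._versions: List[Tuple[int, ...]] = [()]   # version 0 is empty
        self._parent: List[Optional[int]] = [None]     # version tree

    def _commit(self, parent: int, keys: List[int]) -> int:
        self._versions.append(tuple(keys))
        self._parent.append(parent)
        return len(self._versions) - 1

    def insert(self, version: int, key: int) -> int:
        """Create a new version equal to `version` plus `key`."""
        keys = list(self._versions[version])
        i = bisect.bisect_left(keys, key)
        if i == len(keys) or keys[i] != key:
            keys.insert(i, key)
        return self._commit(version, keys)

    def delete(self, version: int, key: int) -> int:
        """Create a new version equal to `version` without `key`."""
        keys = [k for k in self._versions[version] if k != key]
        return self._commit(version, keys)

    def range_query(self, version: int, lo: int, hi: int) -> List[int]:
        """Keys k of the given version with lo <= k <= hi."""
        keys = self._versions[version]
        start = bisect.bisect_left(keys, lo)
        end = bisect.bisect_right(keys, hi)
        return list(keys[start:end])

    def parent(self, version: int) -> Optional[int]:
        return self._parent[version]
```

Updating an old version, for instance inserting into version 1 after version 2 already exists, creates a branch in the version tree. That branching is what distinguishes full persistence from partial persistence, where only the newest version may be updated.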

Thank You Very Much (Mange Tak)

Konstantinos Tsakalidis
Ph.D. Student
tsakalid@madalgo.au.dk

Tsakalidis K., et al.
[ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
[ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”
[ICALP ’11] “Dynamic Planar Range Maxima Queries”
[SODA ’12] “Fully Persistent B-Trees”