Κωνσταντίνος Τσακαλίδης

Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency. Konstantinos Tsakalidis, PhD Defense, 23 September 2011

Page 1: Κωνσταντίνος  Τσακαλίδης

Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency

Konstantinos Tsakalidis

PhD Defense, 23 September 2011

Page 2: Κωνσταντίνος  Τσακαλίδης

Κωνσταντίνος Τσακαλίδης

2000-2006: B. Eng., Computer Engineering and Informatics Department, University of Patras, Greece

Summer 2007: Intern, Google Inc., Mountain View, California, USA

2007-2009: Ph.D. Student (Part A), MADALGO, Aarhus University, Denmark

Summer 2010: Visiting Prof. Ian Munro, D. Cheriton School of Computer Science, University of Waterloo, Canada

2009-2011: Ph.D. Student (Part B)

Page 3: Κωνσταντίνος  Τσακαλίδης

Overview

Dynamic Planar Orthogonal 3-Sided Range Reporting Queries
• [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
• [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”

Dynamic Planar Orthogonal Range Maxima Reporting Queries
• [ICALP ’11] “Dynamic Planar Range Maxima Queries”

Multi-Versioned Indexed Databases
• [SODA ’12] “Fully Persistent B-Trees”

Page 4: Κωνσταντίνος  Τσακαλίδης

Databases and Geometry

Name     Age  Salary  Date     Phone     …
Andreas  30   5.500   2/2010   555-4321  …
Maria    38   6.500   4/1998   555-3214  …
John     25   3.000   5/2011   555-2143  …
Helen    34   4.000   1/2000   555-1432  …
Jacob    28   7.000   11/1989  555-1234  …

N points, D dimensions

Planar (D=2) Euclidean Space

Query Operation
• Question about stored data

Update Operation/Transaction
• Insert/Delete Tuple
• Change Value

[Figure: the tuples drawn as points; axes Salary, Age, Date, Name, Phone; label 29]

Page 5: Κωνσταντίνος  Τσακαλίδης

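Models of Computation

Pointer Machine
• Record: O(1) fields
• Space: #Occupied Records
• Time: #Arithmetic Operations + #Pointer Traversals

word-RAM
• Cell: w bits, accessed in O(1) time
• Space: #Occupied Cells
• Time: #Arithmetic Operations + #cell READ/WRITEs

I/O Model [Aggarwal, Vitter ’88]
• Input of size N on Disk, Memory of size M < N
• Block: B words, giving N/B blocks of input and M/B blocks of Memory
• I/O Operation: transfers one block between Memory and Disk
• Space: #Occupied Blocks
• Time: #I/O Operations

[Figure labels: Memory, Disk, specialized database]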

Page 7: Κωνσταντίνος  Τσακαλίδης

Orthogonal Range Reporting Queries

Contour Query: Report all points with Salary > 1000

Dominance Query: Report all points with Salary > 1000 and Age > 35

3-Sided Query: Report all points with 2000 > Salary > 1000 and Age > 35

[Figure: Employees drawn as points; axes Salary and Age; query boundaries at Salary = 1000, Salary = 2000 and Age = 35]
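The three query types above can be stated as plain filters over a point set. The following is only a brute-force sketch for illustration, not one of the data structures discussed in the talk: it scans all points in linear time, and the `(salary, age)` tuple layout, the function names, and the sample `employees` list are assumptions made up for this example.

```python
from typing import List, Tuple

# Hypothetical layout: each employee is a (salary, age) point.
Point = Tuple[float, float]


def contour_query(points: List[Point], salary_low: float) -> List[Point]:
    """Report all points with Salary > salary_low."""
    return [p for p in points if p[0] > salary_low]


def dominance_query(points: List[Point], salary_low: float,
                    age_low: float) -> List[Point]:
    """Report all points with Salary > salary_low and Age > age_low."""
    return [p for p in points if p[0] > salary_low and p[1] > age_low]


def three_sided_query(points: List[Point], salary_low: float,
                      salary_high: float, age_low: float) -> List[Point]:
    """Report all points with salary_high > Salary > salary_low and Age > age_low."""
    return [p for p in points
            if salary_low < p[0] < salary_high and p[1] > age_low]


# Made-up sample data, not taken from the slides.
employees = [(900, 40), (1500, 30), (1500, 50), (2500, 60)]
```

Each filter inspects every point, so it costs O(n) per query regardless of the output size t; the structures in the following slides aim for bounds of the form log n + t instead.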

Page 8: Κωνσταντίνος  Τσακαλίδης

Worst-Case Efficient Dynamic 3-Sided Range Reporting

Pointer Machine (Space, Query Time, Update Time)
• Priority Search Tree [McCreight ’85]

word-RAM (Space, Query Time, Update Time)
• Fusion Tree [Willard ’00]
• [Mortensen ’06]

I/O Model (Space, Query I/Os, Update I/Os)
• External Priority Search Tree [Arge ’99]: update bound amortized

Average-Case Efficient Dynamic 3-Sided Range Reporting

word-RAM (Space, Query Time, Update Time)
• [ICDT ’10], X, Y: μ-random. Update: expected w.h.p.
• [ICDT ’10], X: smooth, Y: restricted. Query: expected w.h.p. Update: expected w.h.p.
• [ISAAC ’09], X: smooth. Query: expected w.h.p. Update: expected amortized

I/O Model (Space, Query I/Os, Update I/Os)
• [ICDT ’10], X, Y: μ-random. Update: amortized expected w.h.p.
• [ICDT ’10], X: smooth, Y: restricted. Query: expected w.h.p. Update: amortized expected w.h.p.
• [ISAAC ’09], X: smooth. Query: expected w.h.p. Update: amortized expected

Page 9: Κωνσταντίνος  Τσακαλίδης

Probabilistic Distributions

μ-Random
• Unknown, non-changing probabilistic distribution

(f,g)-Smooth distribution
• The probability of a subinterval does not exceed a specific bound, no matter how small the subinterval
• Includes regular and uniform distributions
• Any distribution is (f,Θ(n))-smooth

Restricted class of distributions
• Few elements occur very often
• Many elements occur rarely
• Zipfian, Power Law distributions

Page 10: Κωνσταντίνος  Τσακαλίδης

Priority Search Tree [McCreight ’85]

Move Up Maximum Y

Space: O(n)

Update: O(log n)

Pointer Machine

Page 11: Κωνσταντίνος  Τσακαλίδης

Query by X-Coordinate: log n + t

[Figure labels: Path, Subtrees, InX( s), O(log n)]

Pointer Machine

Page 12: Κωνσταντίνος  Τσακαλίδης

Query by Y-Coordinate: log n + t

1D Range Maximum Queries (Children) [Alstrup, Brodal, Rauhe ’00]

Find next point to be reported in O(1) time

Pointer Machine, word-RAM

[Figure labels: u, ul, ur]
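The priority search tree slides (move up the maximum y, then query by x and by y) can be illustrated with a small static sketch. This is a simplified textbook-style variant under stated assumptions, not the dynamic structure from the talk: it has no updates and no constant-time "find next point" step, it uses closed ranges x in [x1, x2] and y >= yb, and the names `Node`, `build` and `three_sided` are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


class Node:
    """Holds the max-y point of its subtree plus an x split key."""

    def __init__(self, point: Point, split: float,
                 left: "Optional[Node]", right: "Optional[Node]") -> None:
        self.point = point
        self.split = split  # left subtree has x <= split, right has x >= split
        self.left = left
        self.right = right


def build(points_by_x: List[Point]) -> Optional[Node]:
    """Build from points sorted by x: move the max-y point up, halve the rest."""
    if not points_by_x:
        return None
    top = max(range(len(points_by_x)), key=lambda k: points_by_x[k][1])
    rest = points_by_x[:top] + points_by_x[top + 1:]
    mid = len(rest) // 2
    split = rest[mid][0] if rest else points_by_x[top][0]
    return Node(points_by_x[top], split, build(rest[:mid]), build(rest[mid:]))


def three_sided(node: Optional[Node], x1: float, x2: float, yb: float,
                out: Optional[List[Point]] = None) -> List[Point]:
    """Report points with x1 <= x <= x2 and y >= yb."""
    if out is None:
        out = []
    # Heap order on y: if this point is too low, so is the whole subtree.
    if node is None or node.point[1] < yb:
        return out
    if x1 <= node.point[0] <= x2:
        out.append(node.point)
    if x1 <= node.split:
        three_sided(node.left, x1, x2, yb, out)
    if x2 >= node.split:
        three_sided(node.right, x1, x2, yb, out)
    return out


# Made-up sample data.
pts = sorted([(1, 5), (2, 9), (3, 1), (4, 7), (5, 3), (6, 8)])
root = build(pts)
hits = three_sided(root, 2, 5, 3)
```

The y-test prunes whole subtrees, while the x-test confines the search to the two boundary paths and the subtrees hanging between them, which is where the log n + t shape of the query bound comes from.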

Page 13: Κωνσταντίνος  Τσακαλίδης

[ISAAC ’09]
• Update: O(log log n) expected amortized
• Query: O(log log n + t) expected w.h.p.
• Space: O(n)

Weight_i = Θ(2^(2^i))

O(log log n) expected w.h.p. [Mehlhorn, Tsakalidis ’93, Kaporis et al. ’06]

[Andersson, Thorup ’07]

RMQ

O(1) expected amortized

word-RAM

Page 14: Κωνσταντίνος  Τσακαλίδης

Average-Case Efficient Dynamic 3-Sided Range Reporting

[ISAAC ’09], X: smooth

word-RAM (Space, Query Time, Update Time)
• Query: expected w.h.p. Update: expected amortized

I/O Model (Space, Query I/Os, Update I/Os)
• Query: expected w.h.p. Update: amortized expected

Page 16: Κωνσταντίνος  Τσακαλίδης

Orthogonal Range MAXIMA Reporting Queries, or “Generalized Planar SKYLINE Operator”

Interesting Points: Oldest and Best Paid

Dominates: Is “Above”

Maximal Point: Is NOT Dominated

Dominance Maxima Queries: Report all maximal points among points with x in [xl,+∞) and y in [yb,+∞)

Contour Maxima Queries: Report all maximal points among points with x in (-∞, xl]

3-Sided Maxima Queries: Report all maximal points among points with x in [xl, xr] and y in [yb,+∞)

4-Sided Maxima Queries: Report all maximal points among points with x in [xl, xr] and y in [yb,yt]

[Figure: Employees drawn as points; axes Salary and Age; query ranges marked with xl, xr, yb, yt]
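To make the notions of "dominated" and "maximal" concrete, here is a brute-force sketch of range maxima reporting. It assumes that a point dominates another when both of its coordinates are greater than or equal to the other's, which is one common convention and may differ from the exact definition used in the talk. It recomputes the skyline from scratch for every query, so it only illustrates the query semantics, not the dynamic structure of [ICALP ’11]; the function names and sample data are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y)


def maxima(points: List[Point]) -> List[Point]:
    """Return the maximal points, ordered by decreasing x.

    Assumed convention: p dominates q when p.x >= q.x and p.y >= q.y.
    """
    result: List[Point] = []
    best_y = float("-inf")
    # Sweep from right to left; a point is maximal iff it is strictly
    # higher than everything already seen to its right.
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result


def three_sided_maxima(points: List[Point], xl: float, xr: float,
                       yb: float) -> List[Point]:
    """Maximal points among points with x in [xl, xr] and y in [yb, +inf)."""
    return maxima([p for p in points if xl <= p[0] <= xr and p[1] >= yb])


def four_sided_maxima(points: List[Point], xl: float, xr: float,
                      yb: float, yt: float) -> List[Point]:
    """Maximal points among points with x in [xl, xr] and y in [yb, yt]."""
    return maxima([p for p in points
                   if xl <= p[0] <= xr and yb <= p[1] <= yt])


# Made-up sample data.
pts = [(1, 5), (2, 9), (3, 1), (4, 7), (5, 3), (6, 8)]
```

Note that restricting the range first and taking maxima second matters: a point can be maximal inside the query range even though a point outside the range dominates it.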

Page 17: Κωνσταντίνος  Τσακαλίδης

Worst-Case Efficient Dynamic Range MAXIMA Reporting

Pointer Machine (query bounds, then Insert, Delete)
Overmars, van Leeuwen ’81 | log n + t | - | log² n | log² n
Frederickson, Rodger ’90 | log n + t | log² n + t | log n (1+t) log n | log² n
Janardan ’91 | log n + t | log n + t | log n | log² n
Kapoor ’00 | log n + t amo. | - | log n | log n
[ICALP ’11] | log n + t | log n + t | log n | log n

word-RAM (Insert, Delete)
[ICALP ’11]

Page 18: Κωνσταντίνος  Τσακαλίδης

Tournament Tree

Copy Up Maximum Y

Y-Winning Paths

Pointer Machine
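The "copy up maximum y" rule can be sketched with an array-based tournament tree: the leaves hold the points in x-order and every internal node holds a copy of the higher of its two children. This is a minimal illustration only; the class name, the heap-style array layout and the `remove` operation are assumptions for this example, and the MAX(Right(u)) machinery of the later slides is not modelled.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


class TournamentTree:
    """Leaves hold points sorted by x; internal nodes copy up the max-y child."""

    def __init__(self, points: List[Point]) -> None:
        pts = sorted(points)
        self.size = 1
        while self.size < max(1, len(pts)):
            self.size *= 2
        # Heap layout: node i has children 2i and 2i+1, leaves start at size.
        self.tree: List[Optional[Point]] = [None] * (2 * self.size)
        for i, p in enumerate(pts):
            self.tree[self.size + i] = p
        for i in range(self.size - 1, 0, -1):
            self.tree[i] = self._winner(self.tree[2 * i], self.tree[2 * i + 1])

    @staticmethod
    def _winner(a: Optional[Point], b: Optional[Point]) -> Optional[Point]:
        if a is None:
            return b
        if b is None:
            return a
        return a if a[1] >= b[1] else b

    def top(self) -> Optional[Point]:
        """Point with maximum y over the whole set."""
        return self.tree[1]

    def winning_path(self) -> List[int]:
        """Node indices from the root down to the leaf of the overall winner."""
        path: List[int] = []
        if self.tree[1] is None:
            return path
        i = 1
        while i < self.size:
            path.append(i)
            i = 2 * i if self.tree[2 * i] is self.tree[i] else 2 * i + 1
        path.append(i)
        return path

    def remove(self, leaf_index: int) -> None:
        """Empty one leaf and replay the matches on its root path: O(log n)."""
        i = self.size + leaf_index
        self.tree[i] = None
        i //= 2
        while i >= 1:
            self.tree[i] = self._winner(self.tree[2 * i], self.tree[2 * i + 1])
            i //= 2


# Made-up sample data.
tt = TournamentTree([(1, 5), (2, 9), (3, 1), (4, 7)])
```

Because a winner is copied rather than moved, the same point appears on every node of its winning path, which is what distinguishes this layout from the "move up" rule of the priority search tree.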

Page 19: Κωνσταντίνος  Τσακαλίδης

Tournament Tree

[Figure labels: u, Right(u), MAX( )]

Find next point to be reported in O(1) time

Pointer Machine

Page 20: Κωνσταντίνος  Τσακαλίδης

3-Sided Range Maxima Queries

Query Time: log n + t

[Figure labels: MAX( ), Subtrees (Paths), O(log n)]

Pointer Machine

Page 21: Κωνσταντίνος  Τσακαλίδης

Update Operation

Previous Update: O(log² n)

Pointer Machine

Page 22: Κωνσταντίνος  Τσακαλίδης

Update Operation

Priority Queue with Attrition [Sundar ’89]: O(1) time

[Figure labels: u, uL, uR, MAX(Right(u)), MAX(Right(uL)), MAX(Right(uR))]

Pointer Machine
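The behaviour of a priority queue with attrition can be sketched with a simple deque. This is not Sundar's structure: it gives O(1) amortized rather than worst-case time per operation, it is not persistent, and the class and method names are hypothetical. It is written as a min-queue in which inserting an element discards every stored element greater than or equal to it; the treatment of ties is an assumption.

```python
from collections import deque
from typing import Deque


class SimplePQA:
    """Min-priority queue with attrition, O(1) amortized per operation."""

    def __init__(self) -> None:
        # Invariant: strictly increasing from front to back.
        self._items: Deque[float] = deque()

    def insert_and_attrite(self, x: float) -> None:
        """Insert x and discard every stored element >= x."""
        # The queue is sorted, so the elements >= x form a suffix.
        while self._items and self._items[-1] >= x:
            self._items.pop()
        self._items.append(x)

    def find_min(self) -> float:
        """Return the smallest surviving element (raises IndexError if empty)."""
        return self._items[0]

    def delete_min(self) -> float:
        """Remove and return the smallest surviving element."""
        return self._items.popleft()

    def __len__(self) -> int:
        return len(self._items)


# Made-up sample run.
q = SimplePQA()
for value in (5, 7, 9, 6):
    q.insert_and_attrite(value)
```

Each element is appended once and popped at most once, which is why the loop in `insert_and_attrite` still averages out to constant time.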

Page 23: Κωνσταντίνος  Τσακαλίδης

Update Operation

Reconstruct, Rollback

Partially Persistent Priority Queue with Attrition

O(1) time, space overhead per update step
• [Brodal ’96] worst case
• [Driscoll et al. ’89] amortized

Space: O(n)

Update: O(log n)

Pointer Machine

Page 24: Κωνσταντίνος  Τσακαλίδης

[ICALP ’11]       Space     Query        Insert    Delete
Pointer Machine   n         log n + t    log n     log n
word-RAM          n

[ICALP ’11]       Space     Query        Insert    Delete
Pointer Machine   n log n   log² n + t   log² n    log² n

Page 25: Κωνσταντίνος  Τσακαλίδης

Rectangular Visibility Queries

Proximity Queries/Similarity Search

4x 4-Sided Range Maxima Queries, one toward each of (+∞,+∞), (+∞,-∞), (-∞,+∞), (-∞,-∞)

Page 26: Κωνσταντίνος  Τσακαλίδης

Worst-Case Efficient 4-Sided Range MAXIMA Reporting and Rectangular Visibility Queries

Pointer Machine      Space     Query              Insert    Delete
Overmars, Wood ’88   n log n   log² n + t         log² n    log³ n
Overmars, Wood ’88   n log n   log² n + t log n   log² n    log² n
[ICALP ’11]          n log n   log² n + t         log² n    log² n

Page 28: Κωνσταντίνος  Τσακαλίδης

B-Trees [Bayer, McCreight ’72]

Indexed Database

Name     Age  Salary  …
Andreas  30   5.500   …
Maria    38   6.500   …
John     25   3.000   …
Helen    34   4.000   …
Jacob    28   7.000   …

Space: O(N/B) blocks

Update: O(log_B N) I/Os

Access: O(log_B N) I/Os

Multi-Versioned Databases: Btrfs, Data Platform
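The B-tree bounds above become more tangible with concrete numbers. The sketch below evaluates N/B and log_B N for made-up values of N and B; it treats the fan-out as exactly B and ignores constant factors, so the results are rough illustrations of the asymptotic bounds rather than block or I/O counts of any real system. The function names are hypothetical.

```python
def blocks_needed(n: int, b: int) -> int:
    """ceil(N/B): blocks needed to store n words with b words per block."""
    return -(-n // b)


def btree_height(n: int, b: int) -> int:
    """Smallest h with b**h >= n, i.e. ceil(log_B N), using integers only.

    Assumes an idealized fan-out of exactly b per node.
    """
    height, capacity = 0, 1
    while capacity < n:
        capacity *= b
        height += 1
    return height


# Made-up example sizes: one billion keys, 1024 keys per block.
N, B = 10**9, 1024
space_blocks = blocks_needed(N, B)
levels = btree_height(N, B)
```

With these sample values a root-to-leaf access touches only a handful of blocks, which is the practical content of the O(log_B N) bound.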

Page 29: Κωνσταντίνος  Τσακαλίδης

Fully Persistent B-Trees

I/O Model         Space   Query I/Os                 Update I/Os (Amortized)
Lanka, Mays ’91   n/B     (log_B n + t/B) log_B m    log_B n log_B m
[SODA ’12]        n/B     log_B n + t/B              log_B n + log_2 B

n: elements in one version
m: update operations = #versions
B: block size

Page 30: Κωνσταντίνος  Τσακαλίδης

[SODA ’12]

Incremental B-Trees
• Lazy Updates
• O(log_B N) READs
• O(1) WRITEs that make O(1) changes to a block

I/O-Efficient Full Persistence
• Interface of Primitive Operations: READ, WRITE, ACCESS, NEW_NODE, NEW_VERSION
• Input is a pointer-based structure
• Node occupies O(1) blocks
• Node has indegree O(1)
• O(1) I/O-overhead per access to a block
• O(log_2 B) I/O-overhead per change to a block
• Node-Splitting Method [Driscoll et al. ’89]

Result
• Space: O(N/B)
• Query: O(log_B N + t/B) I/Os
• Update: O(log_B N + log_2 B) I/Os

Page 31: Κωνσταντίνος  Τσακαλίδης

Mange Tak (Thank You Very Much)

Konstantinos Tsakalidis

Ph.D. [email protected]

Tsakalidis K., et al.
• [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
• [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”
• [ICALP ’11] “Dynamic Planar Range Maxima Queries”
• [SODA ’12] “Fully Persistent B-Trees”