Κωνσταντίνος Τσακαλίδης
Transcript of Κωνσταντίνος Τσακαλίδης
Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency
Konstantinos Tsakalidis
PhD Defense, 23 September 2011
Κωνσταντίνος Τσακαλίδης
2000-2006: B.Eng., Computer Engineering and Informatics Dept., University of Patras, Greece
Summer 2007: Intern, Google Inc., Mountain View, California, USA
2007-2009: Ph.D. Student (Part A), MADALGO, Aarhus University, Denmark
Summer 2010: Visiting Prof. Ian Munro, D. Cheriton School of Computer Science, University of Waterloo, Canada
2009-2011: Ph.D. Student (Part B)
Overview
Dynamic Planar Orthogonal 3-Sided Range Reporting Queries
  [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
  [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”
Dynamic Planar Orthogonal Range Maxima Reporting Queries
  [ICALP ’11] “Dynamic Planar Range Maxima Queries”
Multi-Versioned Indexed Databases
  [SODA ’12] “Fully Persistent B-Trees”
Databases and Geometry
Name     Age  Salary  Date     Phone     …
Andreas  30   5.500   2/2010   555-4321  …
Maria    38   6.500   4/1998   555-3214  …
John     25   3.000   5/2011   555-2143  …
Helen    34   4.000   1/2000   555-1432  …
Jacob    28   7.000   11/1989  555-1234  …
N points in D dimensions, one dimension per attribute: Name, Age, Salary, Date, Phone, …
Planar (D=2) Euclidean Space: axes Age and Salary
Query Operation
• Question about stored data
Update Operation/Transaction
• Insert/Delete Tuple
• Change Value
Models of Computation
Pointer Machine
  Memory: records with O(1) fields
  Space: #Occupied Records
  Time: #Arithmetic Operations + #Pointer Traversals
word-RAM
  Memory: cells of w bits, each accessed in O(1) time
  Space: #Occupied Cells
  Time: #Arithmetic Operations + #cell READ/WRITEs
I/O Model [Aggarwal, Vitter ’88]
  Memory of M < N words (M/B blocks), Disk holding the N input words (N/B blocks)
  I/O Operation: transfers one block of B words between Memory and Disk
  Space: #Occupied Blocks
  Time: #I/O Operations
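To make the I/O-model accounting concrete, the following minimal sketch counts block transfers for a sequential scan. The function name `scan_ios` is hypothetical and introduced only for illustration; it assumes the N words are stored contiguously on disk.

```python
def scan_ios(n_words: int, block_words: int) -> int:
    """I/O operations needed to read n_words stored contiguously on disk,
    when a single I/O operation transfers one block of block_words words.
    This is also the number of occupied blocks, i.e. the space measure."""
    if block_words <= 0:
        raise ValueError("block size must be positive")
    # Integer ceiling of n_words / block_words
    return (n_words + block_words - 1) // block_words


if __name__ == "__main__":
    # Scanning one million words with blocks of 1024 words
    print(scan_ios(1_000_000, 1024))  # 977
```

The point of the model is that the count depends on N/B rather than on N: the arithmetic performed inside a block is free.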
Orthogonal Range Reporting Queries
Example: Employees as points, with axes Age and Salary
Contour Query: Report all points with Salary > 1000
Dominance Query: Report all points with Salary > 1000 and Age > 35
3-Sided Query: Report all points with 2000 > Salary > 1000 and Age > 35
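The three query types can be stated as plain filters over a list of points. This is only a brute-force sketch that scans every point, which is the baseline the data structures in this talk improve upon. The function names and the sample employees are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (age, salary)


def contour_query(points: List[Point], salary_min: float) -> List[Point]:
    """Report all points with salary > salary_min."""
    return [p for p in points if p[1] > salary_min]


def dominance_query(points: List[Point], salary_min: float, age_min: float) -> List[Point]:
    """Report all points with salary > salary_min and age > age_min."""
    return [p for p in points if p[1] > salary_min and p[0] > age_min]


def three_sided_query(points: List[Point], salary_min: float,
                      salary_max: float, age_min: float) -> List[Point]:
    """Report all points with salary_min < salary < salary_max and age > age_min."""
    return [p for p in points if salary_min < p[1] < salary_max and p[0] > age_min]


if __name__ == "__main__":
    employees = [(40, 1500), (30, 1800), (50, 2500), (36, 900)]  # made-up data
    print(contour_query(employees, 1000))
    print(dominance_query(employees, 1000, 35))
    print(three_sided_query(employees, 1000, 2000, 35))
```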
Worst-Case Efficient Dynamic 3-Sided Range Reporting
Pointer Machine (Space, Query Time, Update Time):
  Priority Search Tree [McCreight ’85]
word-RAM (Space, Query Time, Update Time):
  Fusion Tree [Willard ’00]
  [Mortensen ’06]
I/O Model (Space, Query I/Os, Update I/Os):
  External Priority Search Tree [Arge ’99] (amortized)

Average-Case Efficient Dynamic 3-Sided Range Reporting
word-RAM (Space, Query Time, Update Time):
  [ICDT ’10], X, Y: μ-random: expected w.h.p.
  [ICDT ’10], X: smooth, Y: restricted: expected w.h.p.
  [ISAAC ’09], X: smooth: query expected w.h.p., update expected amortized
I/O Model (Space, Query I/Os, Update I/Os):
  External Priority Search Tree [Arge ’99] (amortized)
  [ICDT ’10], X, Y: μ-random: amortized expected w.h.p.
  [ICDT ’10], X: smooth, Y: restricted: amortized expected w.h.p.
  [ISAAC ’09], X: smooth: query expected w.h.p., update amortized expected
Probabilistic Distributions
μ-Random: unknown, non-changing probabilistic distribution
(f,g)-Smooth distribution
  The probability of a subinterval does not exceed a specific bound, no matter how small the subinterval
  Includes regular and uniform distributions
  Any distribution is (f,Θ(n))-smooth
Restricted class of distributions
  Few elements occur very often
  Many elements occur rarely
  Zipfian, Power Law distributions
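As a small illustration of "few elements occur very often, many elements occur rarely", the sketch below computes Zipfian probabilities, where the element of rank k has probability proportional to 1/k^s. It only illustrates the shape of such a distribution; it is not the formal definition of the restricted class used in [ICDT ’10]. The function name and the parameter values are assumptions.

```python
from typing import List


def zipf_probabilities(num_elements: int, exponent: float = 1.0) -> List[float]:
    """Probability of each rank 1..num_elements under a Zipfian law
    with p(rank) proportional to 1 / rank**exponent."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_elements + 1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    probs = zipf_probabilities(1000)
    top_10_share = sum(probs[:10])        # the 10 most frequent elements
    bottom_half_share = sum(probs[500:])  # the 500 least frequent elements
    print(f"top 10: {top_10_share:.3f}, bottom 500: {bottom_half_share:.3f}")
```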
Priority Search Tree [McCreight ’85]
Pointer Machine
Move Up Maximum Y
Space: O(n)
Update: O(log n)
Query by X-Coordinate: log n + t
Pointer Machine
Path: O(log n)
Subtrees in the X-range
Query by Y-Coordinate: log n + t
Pointer Machine / word-RAM
1D Range Maximum Queries over the children u_l, u_r of node u [Alstrup, Brodal, Rauhe ’00]: O(1) time
Find next point to be reported in O(1) time
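The "move up maximum Y" idea and the 3-sided query can be sketched as follows. This is a minimal static version built once from a list of points; it does not implement the O(log n) update, and the names `Node`, `build` and `query` are hypothetical. Each node stores the maximum-y point of its subtree, so a subtree can be skipped as soon as its top point falls below the y-threshold.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


@dataclass
class Node:
    point: Point                    # maximum-y point of this subtree, moved up
    split_x: float                  # left subtree has x <= split_x, right has x >= split_x
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def _build(pts: List[Point]) -> Optional[Node]:
    # pts must be sorted by x
    if not pts:
        return None
    top = max(range(len(pts)), key=lambda i: pts[i][1])
    rest = pts[:top] + pts[top + 1:]      # still sorted by x
    if not rest:
        return Node(pts[top], pts[top][0])
    mid = len(rest) // 2
    return Node(pts[top], rest[mid][0], _build(rest[:mid]), _build(rest[mid:]))


def build(points: List[Point]) -> Optional[Node]:
    return _build(sorted(points))


def query(node: Optional[Node], x1: float, x2: float, y_min: float,
          out: List[Point]) -> None:
    """Append to out all points with x1 <= x <= x2 and y >= y_min."""
    if node is None or node.point[1] < y_min:
        return                            # everything below has smaller y
    if x1 <= node.point[0] <= x2:
        out.append(node.point)
    if x1 <= node.split_x:
        query(node.left, x1, x2, y_min, out)
    if x2 >= node.split_x:
        query(node.right, x1, x2, y_min, out)


if __name__ == "__main__":
    root = build([(1, 5), (2, 3), (3, 4), (4, 1), (5, 6)])
    found: List[Point] = []
    query(root, 2, 5, 4, found)
    print(sorted(found))
```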
[ISAAC ’09]
word-RAM
Update: O(log log n) expected amortized
Query: O(log log n + t) expected w.h.p.
Space: O(n)
Weight_i = Θ(2^(2^i))
O(log log n) expected w.h.p. [Mehlhorn, Tsakalidis ’93, Kaporis et al. ’06]
[Andersson, Thorup ’07]
RMQ
O(1) expected amortized
Average-Case Efficient Dynamic 3-Sided Range Reporting
X: smooth
I/O Model (Space, Query I/Os, Update I/Os):
  [ISAAC ’09]: query expected w.h.p., update amortized expected
word-RAM (Space, Query Time, Update Time):
  [ISAAC ’09]: query expected w.h.p., update expected amortized
Orthogonal Range MAXIMA Reporting Queries, or “Generalized Planar SKYLINE Operator”
Example: Employees as points, with axes Age and Salary
Interesting Points: Oldest and Best Paid
Dominates: Is “Above”
Maximal Point: Is NOT Dominated
Dominance Maxima Queries: Report all maximal points among points with x in [xl,+∞) and y in [yb,+∞)
Contour Maxima Queries: Report all maximal points among points with x in (-∞, xl]
3-Sided Maxima Queries: Report all maximal points among points with x in [xl, xr] and y in [yb,+∞)
4-Sided Maxima Queries: Report all maximal points among points with x in [xl, xr] and y in [yb,yt]
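The definitions above can be checked against a brute-force baseline: filter the points to the query range, then keep those that no other point in the range dominates. The sketch below does this with a sort and a sweep, which takes O(n log n) per query and is not the dynamic structure of [ICALP ’11]. The function names are hypothetical, and a point is taken to dominate another when it is at least as large in both coordinates and differs from it.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y)
INF = float("inf")


def maximal_points(points: List[Point]) -> List[Point]:
    """Points not dominated by any other point, in order of decreasing x."""
    result: List[Point] = []
    best_y = -INF
    # Sweep from the largest x to the smallest; on equal x, larger y first
    for x, y in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        if y > best_y:          # nothing seen so far is at least as high
            result.append((x, y))
            best_y = y
    return result


def range_maxima(points: List[Point], x_lo: float = -INF, x_hi: float = INF,
                 y_lo: float = -INF, y_hi: float = INF) -> List[Point]:
    """Maximal points among the points inside [x_lo, x_hi] x [y_lo, y_hi].
    Leaving bounds at infinity gives dominance, contour and 3-sided queries."""
    inside = [p for p in points
              if x_lo <= p[0] <= x_hi and y_lo <= p[1] <= y_hi]
    return maximal_points(inside)


if __name__ == "__main__":
    pts = [(1, 5), (2, 3), (3, 4), (4, 1), (2, 6)]
    print(maximal_points(pts))                           # all points
    print(range_maxima(pts, x_lo=1, x_hi=3, y_lo=4))     # 3-sided
    print(range_maxima(pts, 1, 3, 3, 5))                 # 4-sided
```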
Worst-Case Efficient Dynamic Range MAXIMA Reporting
Pointer Machine             Query            Query                       Insert   Delete
Overmars, van Leeuwen ’81   log n + t        -                           log² n   log² n
Frederickson, Rodger ’90    log n + t        log² n + t, log n (1 + t)   log n    log² n
Janardan ’91                log n + t        log n + t                   log n    log² n
Kapoor ’00                  log n + t amo.   -                           log n    log n
[ICALP ’11]                 log n + t        log n + t                   log n    log n
word-RAM                                                                 Insert   Delete
[ICALP ’11]
Tournament Tree
Pointer Machine
Copy Up Maximum Y
Y-Winning Paths

Tournament Tree
Pointer Machine
Node u, MAX(Right(u))
Find next point to be reported in O(1) time
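A tournament tree that copies up the maximum-y point can be sketched as below. The sketch then reports the maximal points of an index range by repeated range-maximum queries, which costs O(log n) per reported point; it is a simpler substitute for the O(1)-per-point reporting named on the slide, not that method. All names are hypothetical, and the points are assumed to be distinct and sorted by (x, y).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


@dataclass
class TNode:
    lo: int                         # leaves lo..hi (indices into the sorted points)
    hi: int
    winner: int                     # index of the maximum-y leaf, copied up
    left: Optional["TNode"] = None
    right: Optional["TNode"] = None


def build_tournament(pts: List[Point], lo: int, hi: int) -> TNode:
    if lo == hi:
        return TNode(lo, hi, lo)
    mid = (lo + hi) // 2
    l = build_tournament(pts, lo, mid)
    r = build_tournament(pts, mid + 1, hi)
    # Copy up the child winner with larger y; on ties prefer the right child
    w = r.winner if pts[r.winner][1] >= pts[l.winner][1] else l.winner
    return TNode(lo, hi, w, l, r)


def range_winner(node: TNode, pts: List[Point], i: int, j: int) -> Optional[int]:
    """Index of the rightmost maximum-y point among leaves i..j."""
    if j < node.lo or node.hi < i:
        return None
    if i <= node.lo and node.hi <= j:
        return node.winner
    a = range_winner(node.left, pts, i, j)
    b = range_winner(node.right, pts, i, j)
    if a is None:
        return b
    if b is None:
        return a
    return b if pts[b][1] >= pts[a][1] else a


def maxima_in_index_range(root: TNode, pts: List[Point], i: int, j: int) -> List[Point]:
    """Maximal points among pts[i..j], in order of increasing x."""
    out: List[Point] = []
    while i <= j:
        k = range_winner(root, pts, i, j)
        out.append(pts[k])          # the highest point is maximal
        i = k + 1                   # only points to its right can also be maximal
    return out


if __name__ == "__main__":
    pts = sorted([(1, 5), (2, 3), (3, 4), (4, 1), (2, 6)])
    root = build_tournament(pts, 0, len(pts) - 1)
    print(maxima_in_index_range(root, pts, 0, len(pts) - 1))
```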
3-Sided Range Maxima Queries
Pointer Machine
Query Time: log n + t
O(log n) Subtrees (Paths), MAX( )
Update Operation
Pointer Machine
Previous Update: O(log² n)
Update Operation
Pointer Machine
Node u with children u_L and u_R: MAX(Right(u)), MAX(Right(u_L)), MAX(Right(u_R))
Priority Queue with Attrition [Sundar ’89]: O(1) time
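The behaviour of a priority queue with attrition can be sketched with a double-ended queue. In this sketch, inserting a value first removes ("attrites") every stored value that is greater than or equal to it, so the stored values stay strictly increasing. Note the swap in technique: this version is O(1) amortized per operation, whereas the structure of [Sundar ’89] achieves O(1) worst case. The class and method names are hypothetical.

```python
from collections import deque
from typing import Deque, Optional


class AttritionQueue:
    """Simplified priority queue with attrition (amortized, not worst-case)."""

    def __init__(self) -> None:
        # Strictly increasing from the front (minimum) to the back
        self._items: Deque[float] = deque()

    def insert_and_attrite(self, value: float) -> None:
        # Remove every stored value >= value, then insert value at the back
        while self._items and self._items[-1] >= value:
            self._items.pop()
        self._items.append(value)

    def find_min(self) -> Optional[float]:
        return self._items[0] if self._items else None

    def delete_min(self) -> Optional[float]:
        return self._items.popleft() if self._items else None

    def __len__(self) -> int:
        return len(self._items)


if __name__ == "__main__":
    q = AttritionQueue()
    for v in (5, 7, 9, 6):
        q.insert_and_attrite(v)
    print(len(q), q.find_min())  # 7 and 9 were attrited by 6
```

Each value is appended once and removed at most once, which is where the amortized O(1) bound of this simplified version comes from.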
Update Operation
Pointer Machine
Reconstruct, Rollback
Partially Persistent Priority Queue with Attrition
O(1) time and space overhead per update step
  [Brodal ’96] worst case
  [Driscoll et al. ’89] amortized
Space: O(n)
Update: O(log n)
[ICALP ’11]        Space     Query         Insert    Delete
Pointer Machine    n         log n + t     log n     log n
word-RAM           n

[ICALP ’11]        Space     Query         Insert    Delete
Pointer Machine    n log n   log² n + t    log² n    log² n
Rectangular Visibility Queries
Proximity Queries/Similarity Search
4 × 4-Sided Range Maxima Queries, one toward each corner: (+∞,+∞), (+∞,-∞), (-∞,+∞), (-∞,-∞)
Worst-Case Efficient 4-Sided Range MAXIMA Reporting and Rectangular Visibility Queries
Pointer Machine      Space     Query               Insert    Delete
Overmars, Wood ’88   n log n   log² n + t          log² n    log³ n
Overmars, Wood ’88   n log n   log² n + t log n    log² n    log² n
[ICALP ’11]          n log n   log² n + t          log² n    log² n
B-Trees [Bayer, McCreight ’72]
Indexed Database
Name     Age  Salary  …
Andreas  30   5.500   …
Maria    38   6.500   …
John     25   3.000   …
Helen    34   4.000   …
Jacob    28   7.000   …
Space: O(N/B) blocks
Update: O(log_B N) I/Os
Access: O(log_B N) I/Os
Multi-Versioned Databases: Btrfs, Data Platform
Fully Persistent B-Trees
I/O Model         Space   Query I/Os                  Update I/Os (Amortized)
Lanka, Mays ’91   n/B     (log_B n + t/B) log_B m     log_B n log_B m
[SODA ’12]        n/B     log_B n + t/B               log_B n + log_2 B
n: elements in one version
m: update operations = #versions
B: block size
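To get a feel for the expressions in the table, the sketch below evaluates them for concrete values of n, m, t and B. The constants hidden by the O-notation are ignored, so the numbers are only indicative of how the terms scale, not of measured I/O counts. The function names and the sample parameters are hypothetical.

```python
import math


def lanka_mays_query(n: int, m: int, t: int, B: int) -> float:
    """(log_B n + t/B) * log_B m"""
    return (math.log(n, B) + t / B) * math.log(m, B)


def soda12_query(n: int, t: int, B: int) -> float:
    """log_B n + t/B"""
    return math.log(n, B) + t / B


def lanka_mays_update(n: int, m: int, B: int) -> float:
    """log_B n * log_B m"""
    return math.log(n, B) * math.log(m, B)


def soda12_update(n: int, B: int) -> float:
    """log_B n + log_2 B"""
    return math.log(n, B) + math.log2(B)


if __name__ == "__main__":
    n, m, t, B = 10**6, 10**9, 10**4, 1024   # made-up sample parameters
    print("query :", lanka_mays_query(n, m, t, B), soda12_query(n, t, B))
    print("update:", lanka_mays_update(n, m, B), soda12_update(n, B))
```

The query expressions differ by exactly the factor log_B m, so the gap grows with the number of versions. The update expressions trade a multiplicative log_B m for an additive log_2 B.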
[SODA ’12]
Incremental B-Trees, Lazy Updates
  O(log_B N) READs
  O(1) WRITEs that make O(1) changes to a block
I/O-Efficient Full Persistence
  Interface of Primitive Operations: READ, WRITE, ACCESS, NEW_NODE, NEW_VERSION
  Input is a pointer-based structure
    Node occupies O(1) blocks
    Node has indegree O(1)
  O(1) I/O-overhead per access to a block
  O(log_2 B) I/O-overhead per change to a block
  Node-Splitting Method [Driscoll et al. ’89]
Result
  Space: O(N/B)
  Query: O(log_B N + t/B) I/Os
  Update: O(log_B N + log_2 B) I/Os
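What full persistence means for such an interface can be illustrated with a naive in-memory sketch: versions form a tree, every field keeps the values written in each version, and a read walks up the version tree to the nearest ancestor that wrote the field. This is not the I/O-efficient node-splitting method of [SODA ’12]; a read here costs time proportional to the depth of the version tree. The method signatures are assumptions, ACCESS is omitted, and writes are assumed to happen only in versions that have not yet been branched.

```python
from typing import Any, Dict, List, Optional


class NaivePersistentStore:
    """Naive fully persistent store of nodes with named fields."""

    def __init__(self) -> None:
        self._parent: List[Optional[int]] = [None]   # version 0 is the root version
        self._branched: List[bool] = [False]
        # Per node: field name -> {version -> value written in that version}
        self._nodes: List[Dict[str, Dict[int, Any]]] = []

    def new_version(self, base: int) -> int:
        """Create a new version derived from any existing version."""
        self._parent.append(base)
        self._branched[base] = True
        self._branched.append(False)
        return len(self._parent) - 1

    def new_node(self) -> int:
        self._nodes.append({})
        return len(self._nodes) - 1

    def write(self, node: int, field: str, value: Any, version: int) -> None:
        if self._branched[version]:
            raise ValueError("version already has descendants; create a new version first")
        self._nodes[node].setdefault(field, {})[version] = value

    def read(self, node: int, field: str, version: int) -> Any:
        history = self._nodes[node].get(field, {})
        v: Optional[int] = version
        while v is not None:            # nearest ancestor that wrote the field
            if v in history:
                return history[v]
            v = self._parent[v]
        return None


if __name__ == "__main__":
    store = NaivePersistentStore()
    node = store.new_node()
    v1 = store.new_version(0)
    store.write(node, "key", 10, v1)
    v2 = store.new_version(v1)          # two branches from the same version
    v3 = store.new_version(v1)
    store.write(node, "key", 20, v2)
    store.write(node, "key", 30, v3)
    print([store.read(node, "key", v) for v in (v1, v2, v3)])
```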
Mange Tak (Many Thanks)
Konstantinos Tsakalidis
Ph.D. [email protected]
Tsakalidis K., et al.
[ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
[ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”
[ICALP ’11] “Dynamic Planar Range Maxima Queries”
[SODA ’12] “Fully Persistent B-Trees”