Soft Heaps

Soft Heaps (a data structure created by Bernard Chazelle)
"Be soft, even if you stand to get squashed." -- E. M. Forster

Transcript of Soft Heaps

Minimum Spanning Trees in O(α(m) m)


Outline
- Soft heaps: What are they? Advantages vs. regular heaps
- Soft heap operations: sift, combine, updatesufmin, insert, deletemin
- Analysis of running time
- Proof of correctness
- Applications


Why soft heaps? (Because they are used in Chazelle's MST algorithm)
But what's wrong with regular heaps?
- The problem: they can be used to sort
- Sorting takes Ω(n log n) time
- So either insert(x) or delete-min() takes Ω(log n)


How can we improve on this lower bound? Inaccuracy:
- Soft heaps have an error parameter ε
- After n inserts, at most εn elements may be corrupted
- But nice amortized run times: delete-min() in O(1), insert(x) in O(log 1/ε)
- Note: with ε < 1/n, a soft heap works just like an ordinary heap.

So how do they work?

Overview of soft heaps
- New construction (Kaplan & Zwick, 2009)
- Linked list of heap-ordered binary trees: each tree has a rank, and there is one tree per rank
- Each tree also has a suffix-min pointer
- Trees are not necessarily balanced
- Each node has a list of elements

Corruption
- Each node x has a list of elements, list[x]
- All elements in the list are indexed by the node's key; if an element's own key is less than that, the element is corrupted
- (Note that corruption can only increase a key)
- The soft heap has at most εn corrupted elements after n inserts
- These elements move together ("carpooling"), which allows faster operations

Some more notation. For a node x:
- left[x] is the left child of x, null if none exists
- right[x] is the right child of x, null if none exists
- key[x] is the key value of x, or ∞ if x = null
- size[x] is the target size of list[x] (defined later)
For a root node x:
- suf-min[x] is the suffix-min pointer of x
- next[x] is the next root in the linked list
- prev[x] is the previous root in the linked list
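To make the notation concrete, here is a minimal Python sketch of a node with these fields. The class layout, the key_of helper, and the defaults are my own illustration of the slide's notation, not code taken from Kaplan & Zwick.

```python
import math

class Node:
    """One soft-heap node, with the fields listed above (an illustrative sketch)."""
    def __init__(self, key, rank=0):
        self.key = key        # key[x]
        self.rank = rank      # rank[x]
        self.list = [key]     # list[x]: the elements carried by this node
        self.left = None      # left[x], null if no left child
        self.right = None     # right[x], null if no right child
        self.size = 1         # size[x]: target size of list[x] (defined later)
        # Fields that matter only when x is a root in the linked list of trees:
        self.next = None      # next[x]
        self.prev = None      # prev[x]
        self.suf_min = self   # suf-min[x]

def key_of(x):
    """key[x], or infinity if x is null, matching the convention above."""
    return math.inf if x is None else x.key
```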


Soft heap operations: sift
For each node x, we want list[x] to have enough elements so that operations are fast. The sift(x) operation increases the number of elements in list[x]:
- If size[x] > |list[x]|:
- Let y be the child of x with the smaller key
- append(list[x], list[y])
- Set key[x] = key[y]
- If y is a leaf, delete it; else call sift(y)
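Below is a Python sketch of sift that follows the steps on this slide. It reuses the Node class and key_of helper sketched earlier; the leaf-deletion bookkeeping is spelled out the way I read the slide.

```python
def sift(x):
    """Refill list[x] from the smaller-key child, as described on this slide (sketch)."""
    if x.size > len(x.list):
        # y = child of x with the smaller key (a missing child counts as key infinity)
        y = x.left if key_of(x.left) <= key_of(x.right) else x.right
        if y is None:
            return                       # x is a leaf: nothing to pull up
        x.list.extend(y.list)            # append(list[x], list[y])
        y.list = []
        x.key = y.key                    # set key[x] = key[y]
        if y.left is None and y.right is None:
            # y is now an empty leaf: delete it
            if x.left is y:
                x.left = None
            else:
                x.right = None
        else:
            sift(y)                      # otherwise refill y recursively
```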


Operations: combine
If we have two trees x, y of the same rank, we can combine(x, y) them into one tree:
- Create a new node z
- Set left[z] = x, right[z] = y
- Set rank[z] = rank[x] + 1
- Set size[z] appropriately (defined later)
- Call sift(z)
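A matching Python sketch of combine, built on the Node and sift sketches above. Here target_size stands in for the size function the slide defers to later (a concrete version appears after the size slide); everything else follows the bullet points.

```python
def combine(x, y):
    """Combine two trees of equal rank into one tree of the next rank (sketch)."""
    assert x.rank == y.rank
    z = Node(key=math.inf, rank=x.rank + 1)   # key[z] will be set by sift below
    z.list = []                               # z starts with an empty element list
    z.left, z.right = x, y
    z.size = target_size(z.rank)              # "set size[z] appropriately (defined later)"
    sift(z)                                   # pull elements (and the key) up from the smaller child
    return z
```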


Operations: updatesufmin
Recall: each root node x has a suffix-min pointer, pointing to the smallest-key root from x to the end of the list. These may become invalid: updatesufmin(x) updates suf-min[x] and suf-min[y] for every root y that precedes x. We do this when we change the key of x (during sift), add a new root, or delete next[x]:
- If key[x] ≤ key[suf-min[next[x]]]: set suf-min[x] = x
- Else set suf-min[x] = suf-min[next[x]]
- updatesufmin(prev[x])
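The same update written as a short Python sketch. The slide's version is recursive through updatesufmin(prev[x]); here that recursion is unrolled into a loop over prev pointers, which performs the same walk.

```python
def updatesufmin(x):
    """Fix suf-min pointers from root x back to the front of the root list (sketch)."""
    while x is not None:
        nxt = x.next
        if nxt is None or x.key <= nxt.suf_min.key:
            x.suf_min = x                    # x itself is the smallest from here to the end
        else:
            x.suf_min = nxt.suf_min          # otherwise inherit the suffix minimum of next[x]
        x = x.prev                           # continue with prev[x], as in updatesufmin(prev[x])
```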


Operations: insert, deletemin
To insert:
- We create a new rank-0 tree and add it to the list of roots
- Then we call combine until there is only one root of each rank
- Finally, if x was the last root we modified, we call updatesufmin(x)
To deletemin:
- Let z be the first root of the linked list
- Then set y = suf-min[z], and return an element of list[y]
- If |list[y]| < size[y]/2, call sift(y)
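Here is a simplified sketch of insert and deletemin on top of the pieces above. To keep it short, the roots are stored in a Python list kept sorted by rank and the suffix-min pointers are recomputed from scratch, instead of maintaining the doubly linked list and calling updatesufmin incrementally as the slide describes; the overall flow (add a rank-0 tree, combine equal ranks, take suf-min of the first root, sift when a list gets too small) is the same. Running it also needs the target_size function sketched after the size slide.

```python
class SoftHeapSketch:
    """Toy soft heap built from the Node, sift and combine sketches above."""
    def __init__(self):
        self.roots = []                        # roots in increasing rank order

    def _refresh_sufmin(self):
        best = None
        for x in reversed(self.roots):         # suffix minima, computed back to front
            best = x if best is None or x.key <= best.key else best
            x.suf_min = best

    def insert(self, key):
        self.roots.append(Node(key))           # new rank-0 tree
        self.roots.sort(key=lambda t: t.rank)
        i = 0
        while i + 1 < len(self.roots):         # combine until one root per rank
            if self.roots[i].rank == self.roots[i + 1].rank:
                self.roots[i:i + 2] = [combine(self.roots[i], self.roots[i + 1])]
            else:
                i += 1
        self._refresh_sufmin()

    def deletemin(self):
        y = self.roots[0].suf_min              # y = suf-min[z] for the first root z
        e = y.list.pop()                       # return an element of list[y] (possibly corrupted)
        if not y.list and y.left is None and y.right is None:
            self.roots.remove(y)               # an emptied leaf root: drop the whole tree
        elif len(y.list) < y.size / 2:
            sift(y)                            # refill, as on this slide
        self._refresh_sufmin()
        return e
```

If size[x] is pinned to 1 for every rank (the ε < 1/n regime mentioned earlier), this sketch behaves like an ordinary heap.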


Example: insert (a sequence of figures; the tree diagrams are not reproduced in this transcript)

Example: deletemin (a sequence of figures; the tree diagrams are not reproduced in this transcript)
Let's say 2 < size[8]/2
Note: 8 is now corrupted

Amortized Analysis
Potential analysis:
- Let M be the max rank of the heap
- Each internal node has a potential of 1
- Each root node x has a potential of rank[x] + 5
- The heap itself has potential M
- For each tree T, let del(T) be the number of deletions from T since the last sift; we also give T a potential of (r + 2)·del(T) (r will be defined later)
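Collecting the bullets above into one formula, the total potential being tracked is (this is just a restatement of the slide in symbols, not an addition to the analysis):

```latex
\Phi \;=\; \#\{\text{internal nodes}\}
      \;+\; \sum_{\text{roots } x} \bigl(\mathrm{rank}[x] + 5\bigr)
      \;+\; M
      \;+\; \sum_{\text{trees } T} (r+2)\,\mathrm{del}(T)
```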

Analysis of combine(x, y)
- x, y are root nodes of rank k
- The potentials of x and y decrease from 2k + 10 to 2 (they become internal nodes)
- Creating the new node z increases the potential by k + 6
- The potential may increase by 1 if M increases
- We charge 1 cost for the operation
- After doing combines, we may have to do an updatesufmin, which costs k
- This gives an amortized cost of (2 − (2k + 10)) + (k + 6) + 1 + 1 + k = 0

Analysis of sift(x)
Let x be a node of rank k and y the child whose elements we move from list[y] to list[x].
- If |list[y]| < size[y]/2, then y is a leaf and it is deleted, so the potential decreases by 1, which pays for this operation
- Otherwise we have |list[y]| ≥ size[y]/2, so we charge each element 2/size[y] for the operation
- Each element can be charged at most once per rank (over its whole history of existence)
- So the maximum total sift cost charged to a single element is Σ_{k=0..M} 2/size_k

What is this size[y] function?
- Consider: if size[y] = 1, then the cost of sift is just O(M), which will be O(log n) (also, no elements will be corrupted)
- So having size[y] > 1 is what makes soft heaps "soft"
- We want the sum from the previous slide to be O(log 1/ε)
- So we will make size[y] = 1 for ranks up to O(log 1/ε), and exponentially increasing after that
- So let r = C + log 1/ε; then, for a constant 0 < δ < 1, we can define: size_k = 1 for k ≤ r, and size_k = ⌈(1 + δ)^(k−r)⌉ for k > r
- With this definition, the previous sum indeed gives O(log 1/ε)
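A small Python version of this size function, which also completes the combine sketch from earlier. The growth constant delta and the rounding via ceil follow the illustrative definition above, and the default values of r and delta are placeholders only; in the analysis r would be set to C + log(1/ε) for the chosen error parameter.

```python
import math

def target_size(k, r=2, delta=0.5):
    """size_k: 1 up to rank r, then growing by a factor (1 + delta) per rank (sketch).
    The defaults r=2 and delta=0.5 are placeholder values for illustration only."""
    return 1 if k <= r else math.ceil((1 + delta) ** (k - r))
```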

Analysis of insert
- We create a new root node (potential increases by 5)
- Every combine pays for itself, and for the final updatesufmin
- Finally, the sift cost of the new element and its eventual delete cost are O(log 1/ε) (see next slide)
- So insert takes O(log 1/ε) amortized

Analysis of deletemin
- Finding the key takes 1 operation
- Deleting raises the potential by r + 2 (so a cost of r + 3); we have charged this to nodes when they were inserted
- If we have a leaf with more elements in its list, or |list[x]| > size[x]/2, we do nothing
- (If it is a leaf and we delete it, then the potential is reduced by k + 5)
- Otherwise: if we do a sift, del(x) ≥ size[x]/2, so the potential of the tree was at least (r + 2)·s_k/2
- Can show: (r + 2)·s_k/2 > k + 1
- Either way we have k + 1 potential, which pays for the updatesufmin operation, so deletemin is O(1) amortized

Proof of correctness
- Lemma 1: The number of nodes of rank k is at most n/2^k.
- Lemma 2: If x has rank k, then size[x] ≤ 2(1 + δ)^(k−r).
- Proof of both: easy induction.

- Lemma 3: |list[x]| ≤ (1 + 1/δ)·size[x].
- Proof: By induction on the rank of x. If we move elements from a list[y] of lower rank, we have |list[y]| ≤ ((1 + δ)/δ)·size[y] ≤ size[x]/δ. So the new size is at most size[x] + size[x]/δ = (1 + 1/δ)·size[x].


Proof of correctness (cont'd)
Theorem: After n insertions, at most εn elements are corrupted.
Proof:
- Elements can only be corrupted in nodes of rank > r
- For a node x of rank k, |list[x]| ≤ (1 + 1/δ)·size[x] ≤ 2(1 + 1/δ)(1 + δ)^(k−r)
- Since r = C + log 1/ε, the number of corrupted elements is at most Σ_{k>r} (n/2^k) · 2(1 + 1/δ)(1 + δ)^(k−r)

So we choose C such that this is < εn. (We needed δ < 1 for the sum to converge.)
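For completeness, here is one way to carry out that last step, plugging Lemma 1 and the per-node bound above into the sum; the constant-factor bookkeeping is mine.

```latex
\sum_{k > r} \frac{n}{2^{k}}\cdot 2\Bigl(1+\tfrac{1}{\delta}\Bigr)(1+\delta)^{k-r}
 \;=\; 2\Bigl(1+\tfrac{1}{\delta}\Bigr)\,\frac{n}{2^{r}}\sum_{j\ge 1}\Bigl(\frac{1+\delta}{2}\Bigr)^{j}
 \;=\; \frac{2(1+\delta)\bigl(1+\tfrac{1}{\delta}\bigr)}{1-\delta}\cdot\frac{n}{2^{r}}
```

Since r = C + log_2(1/ε), we have n/2^r = εn/2^C, so the whole bound is a constant (depending only on δ) times εn/2^C; taking C large enough makes it less than εn, and the geometric series converges exactly because δ < 1.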

Applications of Soft Heaps
Selection in O(n) time:
- Make a soft heap with ε = 1/3
- Insert all the elements, then remove n/3 + 1 elements
- Let x be the largest element removed
- x is greater than at least n/3 of the elements, and it is less than at least n/3 of the elements (because only εn = n/3 elements could be corrupted)
- So partition the elements around x; in the worst case you are left with (2/3)n elements, then continue recursively
- Running time: O(n + (2/3)n + (4/9)n + ...) = O(n)
More interesting application: finding the MST in O(α(m, n)·m) time
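A runnable sketch of the selection recursion described above. Since the toy SoftHeapSketch from earlier does not expose an ε parameter, Python's heapq stands in for the soft heap here (effectively ε = 0): the answer is still correct, but only a real soft heap with ε = 1/3 gives the O(n) bound.

```python
import heapq

def soft_select(items, k):
    """Return the k-th smallest element (0-indexed), following the recursion above (sketch)."""
    if len(items) <= 3:
        return sorted(items)[k]
    h = list(items)
    heapq.heapify(h)                                    # "insert all the elements"
    removed = [heapq.heappop(h) for _ in range(len(items) // 3 + 1)]
    x = max(removed)                                    # largest element removed
    smaller = [e for e in items if e < x]               # partition the elements around x
    larger = [e for e in items if e > x]
    if k < len(smaller):
        return soft_select(smaller, k)
    if k < len(items) - len(larger):
        return x                                        # x itself is the k-th smallest
    return soft_select(larger, k - (len(items) - len(larger)))

print(soft_select([5, 1, 9, 3, 7, 2, 8, 6, 4], 4))      # median of 1..9 -> 5
```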

The End (I hope you have enjoyed soft heaps.)