Sortingtcs/ds/lecture4.pdf · 2009. 10. 13. · Heap-sort is O(nlog 2 n) Using a heap-based...


Sorting

- Given is a sequence of pairs

  $(k_1, e_1), (k_2, e_2), \ldots, (k_n, e_n)$

  of elements $e_i$ with keys $k_i$ and an order $\le$ on the keys.

  Example input (key, element): (4, e1), (2, e2), (3, e3), (1, e4), (5, e5)

- We search for a permutation $\pi$ of the pairs such that the keys $k_{\pi(1)} \le k_{\pi(2)} \le \ldots \le k_{\pi(n)}$ are in ascending order.

  Example output (key, element): (1, e4), (2, e2), (3, e3), (4, e1), (5, e5)
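The problem statement can be illustrated with a minimal Python sketch. It uses the example pairs from this slide; the names `pairs`, `pi` and `sorted_pairs` are chosen here for illustration, and indices are 0-based rather than 1-based as on the slide.

```python
# Example input from the slide: (key, element) pairs.
pairs = [(4, "e1"), (2, "e2"), (3, "e3"), (1, "e4"), (5, "e5")]

# pi[j] is the (0-based) index of the pair that ends up at position j.
pi = sorted(range(len(pairs)), key=lambda i: pairs[i][0])

# Applying the permutation yields the pairs in ascending key order.
sorted_pairs = [pairs[i] for i in pi]

print(pi)            # [3, 1, 2, 0, 4]
print(sorted_pairs)  # [(1, 'e4'), (2, 'e2'), (3, 'e3'), (4, 'e1'), (5, 'e5')]
```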


Selection-Sort

Selection-sort takes an unsorted list A and sorts it as follows:

- search for the smallest element of A and swap it with the first element

- afterwards continue sorting the remainder of A

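Example (the bar | separates the sorted part on the left from the unsorted part on the right; in each step the minimal element of the unsorted part is selected):

          | 5 4 3 7 1
1         | 4 3 7 5
1 3       | 4 7 5
1 3 4     | 7 5
1 3 4 5   | 7
1 3 4 5 7 |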
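The two steps above can be sketched in Python as follows. This is a minimal array-based version; the function name `selection_sort` is chosen here for illustration.

```python
def selection_sort(a):
    """Sort the list a in place by repeatedly selecting the minimum."""
    n = len(a)
    for i in range(n):
        # Find the index of the smallest element in the unsorted part a[i:].
        m = i
        for j in range(i + 1, n):
            if a[j] < a[m]:
                m = j
        # Swap it to the front of the unsorted part; a[:i+1] is now sorted.
        a[i], a[m] = a[m], a[i]
    return a

print(selection_sort([5, 4, 3, 7, 1]))  # [1, 3, 4, 5, 7]
```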


Selection-Sort: Properties

- Time complexity (best, average and worst-case) $O(n^2)$:

  $n + (n - 1) + \ldots + 1 = \frac{n^2 + n}{2} \in O(n^2)$

  (caused by searching the minimal element)

- Selection-sort is an in-place sorting algorithm.

In-Place Algorithm: An algorithm is in-place if, apart from the space for the input data, only a constant amount of space is used: space complexity $O(1)$.


Stable Sorting Algorithms

Stable Sorting Algorithm: A sorting algorithm is called stable if the order of items with equal key is preserved.

Example: not stable

  (3,A) (2,B) (2,C)  ->  (2,C) (2,B) (3,A)  ->  (2,C) (2,B) (3,A)

  B, C exchanged order, although they have equal keys

Example: stable

  (3,A) (2,B) (2,C)  ->  (2,B) (3,A) (2,C)  ->  (2,B) (2,C) (3,A)


Stable Sorting Algorithms

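Applications of stable sorting:

- preserving the original order of elements with equal key

For example:

- we have an alphabetically sorted list of names

- we want to sort by date of birth while keeping the alphabetical order for persons with the same birthday

Selection-sort is stable if

- we always select the first (leftmost) minimal element, and

- we move it to the front of the unsorted part by shifting the elements in between one position to the right instead of swapping (a swap can break stability: with keys 2a 2b 1, swapping 1 with 2a yields 1 2b 2a)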
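The birthday example can be tried out with a small sketch. The records below are hypothetical, and the sketch relies on Python's built-in `sorted`, which is documented to be stable. It is not the selection-sort from these slides.

```python
# Hypothetical records: (name, year of birth), already sorted by name.
people = [("Anna", 1990), ("Bob", 1985), ("Carl", 1990), ("Dora", 1985)]

# sorted() is stable: records with the same year keep their
# alphabetical order from the input list.
by_year = sorted(people, key=lambda p: p[1])

print(by_year)
# [('Bob', 1985), ('Dora', 1985), ('Anna', 1990), ('Carl', 1990)]
```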


Insertion-Sort

Insertion-sort takes an unsorted list A and sorts it as follows:

- it distinguishes a sorted and an unsorted part of A

- in each step we remove an element from the unsorted part, and insert it at the correct position in the sorted part

Example (the bar | separates the sorted part on the left from the unsorted part on the right):

          | 5 3 4 7 1
5         | 3 4 7 1
3 5       | 4 7 1
3 4 5     | 7 1
3 4 5 7   | 1
1 3 4 5 7 |
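A minimal Python sketch of these steps, assuming an array-based list; the function name `insertion_sort` is chosen here for illustration. The insertion position is searched from the end of the sorted part.

```python
def insertion_sort(a):
    """Sort the list a in place; a[:i] is the sorted part."""
    for i in range(1, len(a)):
        x = a[i]              # first element of the unsorted part
        j = i
        # Search the insertion position from the end of the sorted part,
        # shifting larger elements one cell to the right.
        while j > 0 and a[j - 1] > x:
            a[j] = a[j - 1]
            j -= 1
        a[j] = x              # x lands behind all elements with equal keys
    return a

print(insertion_sort([5, 3, 4, 7, 1]))  # [1, 3, 4, 5, 7]
```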


Insertion-Sort: Complexity

- Time complexity worst-case $O(n^2)$:

  $1 + 2 + \ldots + n = \frac{n^2 + n}{2} \in O(n^2)$

  (searching the insertion position together with inserting)

- Time complexity best-case $O(n)$:

  - if the list is already sorted, and

  - we start searching the insertion position from the end

- More general: time complexity $O(n \cdot (n - d + 1))$

  - if the first d elements are already sorted

- Insertion-sort is an in-place sorting algorithm:

  - space complexity $O(1)$
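The gap between best and worst case can be checked empirically. The sketch below counts key comparisons only (not the shifts), so its worst-case count is $n(n-1)/2$ rather than the sum given above; both are in $O(n^2)$. The function name is hypothetical.

```python
def insertion_sort_comparisons(a):
    """Return the number of key comparisons insertion-sort performs on a copy of a."""
    a = list(a)
    comparisons = 0
    for i in range(1, len(a)):
        x = a[i]
        j = i
        while j > 0:
            comparisons += 1
            if a[j - 1] <= x:     # insertion position found
                break
            a[j] = a[j - 1]       # shift the larger element to the right
            j -= 1
        a[j] = x
    return comparisons

n = 100
print(insertion_sort_comparisons(range(n)))         # 99   = n - 1 (already sorted)
print(insertion_sort_comparisons(range(n, 0, -1)))  # 4950 = n(n - 1)/2 (reversed)
```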


Insertion-Sort: Properties

- Simple implementation.

- Efficient for:

  - small lists

  - big lists of which a large prefix is already sorted

- Insertion-sort is stable if:

  - we always pick the first element from the unsorted part

  - we always insert behind all elements with equal keys

- Insertion-sort can be used online:

  - does not need all data at once,

  - can sort a list while receiving it
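The online use can be sketched as a small class that performs one insertion step per received item. The class name `OnlineSorter` and its method `receive` are hypothetical names introduced for this illustration.

```python
class OnlineSorter:
    """Keeps all items received so far in sorted order."""

    def __init__(self):
        self.items = []   # (key, element) pairs, sorted by key

    def receive(self, key, element):
        # Search the insertion position from the end and stop behind the
        # last item whose key is <= key; this keeps the order stable.
        pos = len(self.items)
        while pos > 0 and self.items[pos - 1][0] > key:
            pos -= 1
        self.items.insert(pos, (key, element))

s = OnlineSorter()
for k, e in [(3, "A"), (2, "B"), (2, "C"), (1, "D")]:
    s.receive(k, e)
print(s.items)  # [(1, 'D'), (2, 'B'), (2, 'C'), (3, 'A')]
```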


Heaps

A heap is a binary tree storing keys at its inner nodes such that:

- if A is a parent of B, then key(A) $\le$ key(B)

- the heap is a complete binary tree: let h be the heap height

  - for $i = 0, \ldots, h - 1$ there are $2^i$ nodes at depth i

  - at depth $h - 1$ the inner nodes are left of the external nodes

We call the rightmost inner node at depth $h - 1$ the ‘last node’.

        2
      /   \
     5     6
    / \
   9   7   <- last node


Height of Heaps

Theorem. A heap storing n keys has height $O(\log_2 n)$.

Proof. A heap of height h contains $2^i$ nodes at every depth $i = 0, \ldots, h - 2$ and at least one node at depth $h - 1$. Thus $n \ge 1 + 2 + 4 + \ldots + 2^{h-2} + 1 = 2^{h-1}$. Hence $h \le 1 + \log_2 n$.

depth    keys
0        1
1        2
i        2^i
h - 1    at least 1


Heaps and Priority Queues

We can use a heap to implement a priority queue:

- inner nodes store (key, element) pairs

- variable last points to the last node

               (2,Sue)
              /       \
        (5,Pat)       (6,Mark)
        /     \
  (9,Jeff)   (7,Anna)   <- last

- For convenience, in the sequel, we only show the keys.


Heaps: Insertion

The insertion of key k into the heap consists of 3 steps:

- Find the insertion node z (the new last node).

- Expand z to an internal node and store k at z.

- Restore the heap-order property (see following slides).

        2
      /   \
     5     6
    / \   /
   9   7 z   <- insertion node z

Example: insertion of key 1 (without restoring heap-order)

        2
      /   \
     5     6
    / \   /
   9   7 1   <- z


Heaps: Insertion, Upheap

After insertion of k the heap-order may be violated.

We restore the heap-order using the upheap algorithm:

- we swap k upwards along the path to the root as long as the parent of k has a larger key

Time complexity is $O(\log_2 n)$ since the heap height is $O(\log_2 n)$.

Before upheap (key 1 has been inserted):

        2
      /   \
     5     6
    / \   /
   9   7 1

After upheap (1 has been swapped with 6 and then with 2):

        1
      /   \
     5     2
    / \   /
   9   7 6

Now the heap-order property is restored.
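Upheap can be sketched on a list-based heap. The sketch assumes the vector representation described later in these slides (root at rank 1, parent of rank i at rank i // 2, cell 0 unused); the function names `upheap` and `insert` are chosen here for illustration.

```python
def upheap(heap, i):
    """Swap heap[i] upwards while its parent has a larger key (min-heap)."""
    while i > 1 and heap[i // 2] > heap[i]:
        heap[i], heap[i // 2] = heap[i // 2], heap[i]
        i //= 2

def insert(heap, key):
    heap.append(key)              # the new last node
    upheap(heap, len(heap) - 1)   # restore the heap-order

heap = [None, 2, 5, 6, 9, 7]      # the example heap; cell 0 is unused
insert(heap, 1)
print(heap)                       # [None, 1, 5, 2, 9, 7, 6]
```

The result corresponds to the example: 1 is swapped with 6 and then with 2.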


Heaps: Insert, Finding the Insertion Position

An algorithm for finding the insertion position (new last node):

- start from the current last node

- while the current node is a right child, go to the parent node

- if the current node is a left child, go to the right child of its parent (its sibling)

- while the current node has a left child, go to the left child

Time complexity is $O(\log_2 n)$ since the heap height is $O(\log_2 n)$ (we walk at most once completely up and down again).


Heaps: Removal of the Root

The removal of the root consists of 3 steps:

- Replace the root key with the key of the last node w.

- Compress w and its children into a leaf.

- Restore the heap-order property (see following slides).

        2
      /   \
     5     6
    / \
   9   7   <- last node w

Example: removal of the root (without restoring heap-order)

        7
      /   \
     5     6
    /
   9         (w has been compressed into a leaf)


Heaps: Removal, Downheap

Replacing the root key by k may violate the heap-order.

We restore the heap-order using the downheap algorithm:

- we swap k with its smallest child as long as a child of k has a smaller key

Time complexity is $O(\log_2 n)$ since the heap height is $O(\log_2 n)$.

Before downheap (the root key has been replaced by 7):

        7
      /   \
     5     6
    /
   9

After downheap (7 has been swapped with 5):

        5
      /   \
     7     6
    /
   9

Now the heap-order property is restored. The new last node can be found similarly to finding the insertion position (but now walking counter-clockwise).
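Downheap and the removal of the root can be sketched in the same way. Again this assumes the list-based representation described later in these slides (root at rank 1, children of rank i at ranks 2i and 2i + 1, cell 0 unused); `downheap` and `remove_min` are illustrative names.

```python
def downheap(heap, i):
    """Swap heap[i] downwards while one of its children has a smaller key."""
    n = len(heap) - 1                  # number of keys; cell 0 is unused
    while 2 * i <= n:
        c = 2 * i                      # left child
        if c + 1 <= n and heap[c + 1] < heap[c]:
            c += 1                     # the right child is the smaller one
        if heap[i] <= heap[c]:
            break                      # heap-order holds
        heap[i], heap[c] = heap[c], heap[i]
        i = c

def remove_min(heap):
    """Remove and return the root key; the heap must hold at least one key."""
    smallest = heap[1]
    last = heap.pop()                  # remove the last node
    if len(heap) > 1:
        heap[1] = last                 # its key replaces the root key
        downheap(heap, 1)
    return smallest

heap = [None, 2, 5, 6, 9, 7]
print(remove_min(heap), heap)          # 2 [None, 5, 7, 6, 9]
```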


Heaps: Removal, Finding the New Last Node

After the removal we have to find the new last node:

- start from the old last node (which has been removed)

- while the current node is a left child, go to the parent node

- if the current node is a right child, go to the left child of its parent (its sibling)

- while the current node has a right child which is not a leaf, go to the right child

Time complexity is $O(\log_2 n)$ since the heap height is $O(\log_2 n)$ (we walk at most once completely up and down again).


Heap-Sort

We implement a priority queue by means of a heap:

- insertItem(k, e) corresponds to adding (k, e) to the heap

- removeMin() corresponds to removing the root of the heap

Performance:

- insertItem(k, e) and removeMin() run in $O(\log_2 n)$ time

- size(), isEmpty(), minKey(), and minElement() are $O(1)$

Heap-sort is $O(n \log_2 n)$

Using a heap-based priority queue we can sort a list of n elements in $O(n \cdot \log_2 n)$ time (n insertions + n removals).

Thus heap-sort is asymptotically much faster than quadratic sorting algorithms (e.g. selection sort).
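As a sketch of sorting with a heap-based priority queue, the following uses Python's standard `heapq` module in place of the insertItem and removeMin operations named above. The function name `pq_sort` is hypothetical, and pairs with equal keys are compared by their elements, so the elements are assumed to be comparable.

```python
import heapq

def pq_sort(pairs):
    """Sort (key, element) pairs with a heap-based priority queue."""
    heap = []
    for pair in pairs:                 # n insertions, O(log n) each
        heapq.heappush(heap, pair)
    # n removals of the minimum, O(log n) each
    return [heapq.heappop(heap) for _ in range(len(heap))]

pairs = [(4, "e1"), (2, "e2"), (3, "e3"), (1, "e4"), (5, "e5")]
print(pq_sort(pairs))
# [(1, 'e4'), (2, 'e2'), (3, 'e3'), (4, 'e1'), (5, 'e5')]
```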


Vector-based Heap Implementation

We can represent a heap with n keys by a vector of size n + 1:

        2
      /   \
     5     6
    / \
   9   7

rank:  0  1  2  3  4  5
key:   -  2  5  6  9  7

- The root node has rank 1 (the cell at rank 0 is not used).

- For a node at rank i:

  - the left child is at rank $2i$

  - the right child is at rank $2i + 1$

  - the parent (if $i > 1$) is located at rank $\lfloor i/2 \rfloor$

- Leaves and links between the nodes are not stored explicitly.
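The rank arithmetic can be written down directly. In this sketch the helper names `left`, `right`, `parent` and `is_heap` are chosen for illustration; the list mirrors the example vector, with rank 0 unused.

```python
def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

def parent(i):
    return i // 2          # floor(i / 2)

def is_heap(v):
    """Check the heap-order for the keys stored at ranks 1..n of v."""
    return all(v[parent(i)] <= v[i] for i in range(2, len(v)))

heap = [None, 2, 5, 6, 9, 7]   # rank 0 is unused
print(heap[left(2)], heap[right(2)], heap[parent(5)])  # 9 7 5
print(is_heap(heap))                                   # True
```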


Vector-based Heap Implementation, continued


- The last element in the heap has rank n, thus:

  - insertItem corresponds to inserting at rank $n + 1$

  - removeMin corresponds to removing at rank n

- Yields in-place heap-sort (space complexity $O(1)$):

  - uses a max-heap (largest element on top)
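A minimal sketch of in-place heap-sort with a max-heap follows. It deviates from the slides in two labeled ways: it uses 0-based indices (children of index i at 2i + 1 and 2i + 2) so that the input list itself serves as the heap, and it builds the max-heap with downheap steps rather than with n insertions. The function name `heap_sort` is illustrative.

```python
def heap_sort(a):
    """Sort the list a in place using a max-heap stored in a itself."""
    def downheap(i, n):
        # Move a[i] down within a[:n] while a child has a larger key.
        while 2 * i + 1 < n:
            c = 2 * i + 1                      # left child (0-based)
            if c + 1 < n and a[c + 1] > a[c]:
                c += 1                         # right child is larger
            if a[i] >= a[c]:
                break
            a[i], a[c] = a[c], a[i]
            i = c

    n = len(a)
    for i in range(n // 2 - 1, -1, -1):        # build the max-heap
        downheap(i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]            # move the maximum behind the heap
        downheap(0, end)                       # restore heap-order on a[:end]
    return a

print(heap_sort([5, 4, 3, 7, 1]))  # [1, 3, 4, 5, 7]
```

Only a constant number of extra variables is used, which is what makes the algorithm in-place.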


Merging two Heaps

We are given two heaps h1, h2 and a key k:

- create a new heap with root k and h1, h2 as children

- we perform downheap to restore the heap-order

Example with k = 7:

     h1          h2
      2           3
     / \         / \
    6   5       4   5

New heap with root k, before downheap:

            7
         /     \
        2       3
       / \     / \
      6   5   4   5

After downheap (7 has been swapped with 2 and then with 5):

            2
         /     \
        5       3
       / \     / \
      6   7   4   5
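The merge can be sketched with explicit tree nodes. The class `Node` and the functions `downheap`, `merge` and `preorder` are hypothetical names introduced for this illustration; external (empty) nodes are represented by `None`.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def downheap(node):
    """Swap the key at node with its smallest child key while a child is smaller."""
    while True:
        children = [c for c in (node.left, node.right) if c is not None]
        if not children:
            return
        smallest = min(children, key=lambda c: c.key)
        if node.key <= smallest.key:
            return
        node.key, smallest.key = smallest.key, node.key
        node = smallest

def merge(h1, h2, k):
    root = Node(k, h1, h2)   # new heap with root k and h1, h2 as children
    downheap(root)           # restore the heap-order
    return root

def preorder(node):
    if node is None:
        return []
    return [node.key] + preorder(node.left) + preorder(node.right)

h1 = Node(2, Node(6), Node(5))
h2 = Node(3, Node(4), Node(5))
print(preorder(merge(h1, h2, 7)))  # [2, 5, 6, 7, 3, 4, 5]
```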


Bottom-up Heap Construction

We have n keys and want to construct a heap from them.

Possibility one:

- start from the empty heap and insert n times

- needs $O(n \log_2 n)$ time

Possibility two: bottom-up heap construction

- for simplicity we assume $n = 2^h - 1$ (for some h)

- take $2^{h-1}$ elements and turn them into heaps of size 1

- for phase $i = 1, \ldots, h - 1$ (that is, $\lfloor \log_2 n \rfloor$ phases):

  - merge the heaps of size $2^i - 1$ into heaps of size $2^{i+1} - 1$

[Figure: two heaps of size $2^i - 1$ are merged under a new root into one heap of size $2^{i+1} - 1$]


Bottom-up Heap Construction, Example

We construct a heap from the following $2^4 - 1 = 15$ elements:

16, 15, 4, 12, 6, 9, 23, 20, 25, 5, 11, 27, 7, 8, 10

Start: the first 8 elements form heaps of size 1:

  16   15    4   12    6    9   23   20

Phase 1: the next 4 elements (25, 5, 11, 27) become the roots of the merged heaps:

     25         5        11        27
    /  \      /  \      /  \      /  \
  16    15   4    12   6    9   23    20

After downheap:

     15         4         6        20
    /  \      /  \      /  \      /  \
  16    25   5    12  11    9   23    27

Phase 2: the next 2 elements (7, 8) become the new roots:

          7                      8
       /     \                /     \
     15       4              6       20
    /  \     / \            / \     /  \
  16   25   5   12        11   9  23    27

After downheap:

          4                      6
       /     \                /     \
     15       5              8       20
    /  \     / \            / \     /  \
  16   25   7   12        11   9  23    27

Phase 3: the last element (10) becomes the new root:

                     10
              /              \
          4                      6
       /     \                /     \
     15       5              8       20
    /  \     / \            / \     /  \
  16   25   7   12        11   9  23    27

After downheap:

                      4
              /              \
          5                      6
       /     \                /     \
     15       7              8       20
    /  \     / \            / \     /  \
  16   25  10   12        11   9  23    27

We are done: this is the final heap.
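The construction can be replayed with a short sketch. It assumes $n = 2^h - 1$ keys, takes the first $2^{h-1}$ keys as heaps of size 1 and uses the following keys as new roots, left to right, as in the example. The names `Node`, `downheap`, `merge`, `bottom_up_heap` and `preorder` are hypothetical.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def downheap(node):
    """Swap the key at node with its smallest child key while a child is smaller."""
    while True:
        children = [c for c in (node.left, node.right) if c is not None]
        if not children:
            return
        smallest = min(children, key=lambda c: c.key)
        if node.key <= smallest.key:
            return
        node.key, smallest.key = smallest.key, node.key
        node = smallest

def merge(h1, h2, k):
    root = Node(k, h1, h2)
    downheap(root)
    return root

def bottom_up_heap(keys):
    """Build a heap from n = 2^h - 1 keys by repeated pairwise merging."""
    keys = list(keys)
    n = len(keys)
    if n == 0 or ((n + 1) & n) != 0:
        raise ValueError("expected n = 2^h - 1 keys")
    m = (n + 1) // 2
    heaps = [Node(k) for k in keys[:m]]      # 2^(h-1) heaps of size 1
    rest = keys[m:]
    while len(heaps) > 1:                    # one iteration per phase
        half = len(heaps) // 2
        roots, rest = rest[:half], rest[half:]
        heaps = [merge(heaps[2 * j], heaps[2 * j + 1], roots[j])
                 for j in range(half)]
    return heaps[0]

def preorder(node):
    if node is None:
        return []
    return [node.key] + preorder(node.left) + preorder(node.right)

keys = [16, 15, 4, 12, 6, 9, 23, 20, 25, 5, 11, 27, 7, 8, 10]
print(preorder(bottom_up_heap(keys)))
# [4, 5, 15, 16, 25, 7, 10, 12, 6, 8, 11, 9, 20, 23, 27]
```

The printed pre-order traversal corresponds to the final heap of the example.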


Bottom-up Heap Construction, Performance

Visualization of the worst case of the construction:

- it displays the longest possible downheap paths (these may not be the actual paths, but they have maximal length)

- each edge is traversed at most once

- we have 2n edges, hence the time complexity is $O(n)$

- faster than n successive insertions, which need $O(n \log_2 n)$ time
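The linear bound can also be checked by a direct count, as a complement to the edge-counting argument. This is a sketch under the assumption $n = 2^h - 1$ used in the construction: phase i performs $2^{h-1-i}$ merges, and in each merge the new root moves down at most i levels. Using $\sum_{i \ge 1} i / 2^i = 2$, the total number of swaps is bounded by:

```latex
\sum_{i=1}^{h-1} i \cdot 2^{\,h-1-i}
  \;=\; 2^{h-1} \sum_{i=1}^{h-1} \frac{i}{2^{i}}
  \;<\; 2^{h-1} \cdot 2
  \;=\; 2^{h} \;=\; n + 1 \;\in\; O(n)
```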