Introduction to Algorithms - Duke Universityreif/courses/alglectures/indyk.lectures/... · ©...

Introduction to Algorithms 6.046J/18.401J Lecture 6 Prof. Piotr Indyk


Page 1: Introduction to Algorithms - Duke Universityreif/courses/alglectures/indyk.lectures/... · © Charles E. Leiserson and Piotr Indyk Introduction to Algorithms September 27, 2004 L6.2

Introduction to Algorithms
6.046J/18.401J

Lecture 6
Prof. Piotr Indyk


Introduction to Algorithms September 27, 2004 L6.2© Charles E. Leiserson and Piotr Indyk

Today: sorting

• Show that Θ(n lg n) is the best possible running time for a sorting algorithm.
• Design an algorithm that sorts in O(n) time.
• Hint: different models?


Comparison sort

All the sorting algorithms we have seen so far are comparison sorts: they use only comparisons to determine the relative order of elements.
• E.g., insertion sort, merge sort, quicksort, heapsort.


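Partitioning subroutine

PARTITION(A, p, r)            ⊳ A[p . . r]
    x ← A[p]                  ⊳ pivot = A[p]
    i ← p
    for j ← p + 1 to r
        do if A[j] ≤ x
              then i ← i + 1
                   exchange A[i] ↔ A[j]
    exchange A[p] ↔ A[i]
    return i

Invariant:

    | x | ≤ x | ≥ x |  ?  |
      p      i       j     r

A[p] = x is the pivot, A[p+1 . . i] ≤ x, A[i+1 . . j–1] ≥ x, and A[j . . r] has not been examined yet.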
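The PARTITION pseudocode above can be sketched in Python as follows. This is a minimal illustration, assuming 0-indexed lists and inclusive bounds p and r; the function name `partition` and the sample data are chosen here for illustration and are not part of the slides.

```python
def partition(a, p, r):
    """Partition a[p..r] (inclusive) around the pivot a[p]; return the pivot's final index."""
    x = a[p]                      # pivot
    i = p                         # invariant: a[p+1..i] <= x and a[i+1..j-1] > x
    for j in range(p + 1, r + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]       # place the pivot between the two regions
    return i


data = [6, 10, 13, 5, 8, 3, 2, 11]
q = partition(data, 0, len(data) - 1)
print(q, data)
```

After the call, every element left of index q is at most the pivot and every element right of it is at least the pivot.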


Comparison sort

• All of our algorithms used comparisons.
• All of our algorithms have running time Ω(n lg n).
• Is it the best that we can do using just comparisons?
• Answer: YES, via decision trees.


Decision-tree example

Sort ⟨a1, a2, …, an⟩ (n = 3):

                  1:2
                /     \
             2:3       1:3
            /   \     /   \
         123    1:3  213   2:3
               /   \      /   \
            132    312  231    321

Each internal node is labeled i:j for i, j ∈ {1, 2, …, n}.
• The left subtree shows subsequent comparisons if ai ≤ aj.
• The right subtree shows subsequent comparisons if ai ≥ aj.

Sort ⟨a1, a2, a3⟩ = ⟨9, 4, 6⟩:
• Node 1:2: 9 ≥ 4, so go right.
• Node 1:3: 9 ≥ 6, so go right.
• Node 2:3: 4 ≤ 6, so go left.
• Leaf 231: 4 ≤ 6 ≤ 9.

Each leaf contains a permutation ⟨π(1), π(2), …, π(n)⟩ to indicate that the ordering aπ(1) ≤ aπ(2) ≤ ⋯ ≤ aπ(n) has been established.
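The walk through the n = 3 decision tree can be written out as nested comparisons. The sketch below is only an illustration: the names `sort3_decision_tree` and `le` are hypothetical, and ties go to the left branch.

```python
def sort3_decision_tree(a1, a2, a3):
    """Follow the n = 3 decision tree; return (leaf permutation, list of nodes visited)."""
    a = {1: a1, 2: a2, 3: a3}
    path = []

    def le(i, j):
        # One internal node i:j. True means "take the left subtree" (a_i <= a_j).
        path.append(f"{i}:{j}")
        return a[i] <= a[j]

    if le(1, 2):
        if le(2, 3):
            leaf = (1, 2, 3)
        elif le(1, 3):
            leaf = (1, 3, 2)
        else:
            leaf = (3, 1, 2)
    else:
        if le(1, 3):
            leaf = (2, 1, 3)
        elif le(2, 3):
            leaf = (2, 3, 1)
        else:
            leaf = (3, 2, 1)
    return leaf, path


leaf, path = sort3_decision_tree(9, 4, 6)
print(leaf, path)
```

For the input ⟨9, 4, 6⟩ the path visits 1:2, 1:3, 2:3 and ends at the leaf 231, matching the trace on the slides.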


Decision-tree model

A decision tree can model the execution of any comparison sort:
• One tree for each input size n.
• View the algorithm as splitting whenever it compares two elements.
• The tree contains the comparisons along all possible instruction traces.
• The number of comparisons done by the algorithm on a given input = the length of the path taken.
• Worst-case number of comparisons = max path length = height of tree.
• Worst-case time ≥ worst-case number of comparisons.
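One way to see the model in action is to count the comparisons a real comparison sort makes on every input of size n; the maximum count is the height of that algorithm's decision tree. The sketch below uses Python's built-in `sorted` as the comparison sort (a choice made here, not in the lecture), and the helper names are hypothetical.

```python
from functools import cmp_to_key
from itertools import permutations
from math import ceil, factorial, log2


def comparisons_used(seq):
    """Count the element comparisons sorted() performs on seq."""
    count = 0

    def cmp(x, y):
        nonlocal count
        count += 1
        return (x > y) - (x < y)

    sorted(seq, key=cmp_to_key(cmp))
    return count


def worst_case_comparisons(n):
    """Height of the decision tree of sorted() for inputs of n distinct keys."""
    return max(comparisons_used(p) for p in permutations(range(n)))


for n in range(1, 7):
    print(n, worst_case_comparisons(n), ceil(log2(factorial(n))))
```

The second column can never fall below the third, since a binary tree needs height at least lg(n!) to have n! leaves.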


Lower bound for decision-tree sorting

Theorem. Any decision tree that can sort n elements must have height Ω(n lg n) .

Corollary. Any comparison sorting algorithm has worst-case running time Ω(n lg n).

Corollary 2. Merge sort and heapsort are asymptotically optimal comparison sorting algorithms.


Lower bound for decision-tree sorting

Theorem. Any decision tree that can sort n elements must have height Ω(n lg n).

Proof.
• The tree must contain ≥ n! leaves, since there are n! possible permutations.
• A height-h binary tree has ≤ 2^h leaves.
• Thus, 2^h ≥ #leaves ≥ n!, or h ≥ lg(n!).


Proof, ctd.

2^h ≥ n! ≥ n · (n–1) · … · (n/2) ≥ (n/2)^(n/2)

⇒ h ≥ lg((n/2)^(n/2)) ≥ (n/2)(lg n – lg 2) = Ω(n lg n).


Example: sorting 3 elements

Recall h ≥ lg(n!)
• n = 3
• n! = 6
• lg 6 ≈ 2.58
• Sorting 3 elements requires ≥ 3 comparisons in the worst case.
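The bound h ≥ lg(n!) is easy to evaluate numerically. The following sketch (function names are illustrative, not from the lecture) computes the smallest possible tree height ⌈lg(n!)⌉ and compares lg(n!) with the weaker (n/2) lg(n/2) bound used in the proof.

```python
from math import ceil, factorial, log2


def min_worst_case_comparisons(n):
    """Smallest integer h with 2**h >= n!, i.e. ceil(lg(n!))."""
    return ceil(log2(factorial(n)))


def half_power_bound(n):
    """The bound from the proof: lg((n/2)**(n/2)) = (n/2) * lg(n/2)."""
    return (n / 2) * log2(n / 2)


for n in (3, 4, 8, 16, 64):
    print(n,
          round(log2(factorial(n)), 2),
          min_worst_case_comparisons(n),
          round(half_power_bound(n), 2))
```

For n = 3 this gives ⌈lg 6⌉ = 3, the height of the decision tree shown earlier.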


Decision-tree for n=3

Sort ⟨a1, a2, a3⟩:

                  1:2
                /     \
             2:3       1:3
            /   \     /   \
         123    1:3  213   2:3
               /   \      /   \
            132    312  231    321


Sorting in linear time

Counting sort: No comparisons between elements.
• Input: A[1 . . n], where A[j] ∈ {1, 2, …, k}.
• Output: B[1 . . n], sorted.*
• Auxiliary storage: C[1 . . k].

*Actually, we require the algorithm to construct a permutation of the input array A that produces the sorted array B. This permutation can be obtained by making small changes to the last loop of the algorithm.


Counting sort

for i ← 1 to k
    do C[i] ← 0

for j ← 1 to n
    do C[A[j]] ← C[A[j]] + 1        ⊳ C[i] = |{key = i}|

for i ← 2 to k
    do C[i] ← C[i] + C[i–1]          ⊳ C[i] = |{key ≤ i}|

for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] – 1
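The four loops translate almost line for line into Python. This is a sketch under two assumptions that differ from the pseudocode: the lists are 0-indexed, so the counter is decremented before it is used as a position, and the function returns B instead of filling a caller-supplied array.

```python
def counting_sort(a, k):
    """Stable sort of a list whose keys are integers in 1..k; returns a new list."""
    c = [0] * (k + 1)                 # loop 1: c[0] is unused so indices match keys 1..k
    for key in a:                     # loop 2: c[i] = number of keys equal to i
        c[key] += 1
    for i in range(2, k + 1):         # loop 3: c[i] = number of keys <= i
        c[i] += c[i - 1]
    b = [None] * len(a)
    for j in range(len(a) - 1, -1, -1):   # loop 4: scan right to left
        c[a[j]] -= 1                      # decrement first because b is 0-indexed
        b[c[a[j]]] = a[j]
    return b


print(counting_sort([4, 1, 3, 4, 3], 4))
```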

Counting-sort example

A: 4 1 3 4 3      (positions 1–5)
B: _ _ _ _ _      (positions 1–5)
C: _ _ _ _        (positions 1–4)

Loop 1

for i ← 1 to k
    do C[i] ← 0

C: 0 0 0 0

Loop 2

for j ← 1 to n
    do C[A[j]] ← C[A[j]] + 1        ⊳ C[i] = |{key = i}|

A: 4 1 3 4 3

 j   A[j]   C[1 . . 4] after the step
 1    4     0 0 0 1
 2    1     1 0 0 1
 3    3     1 0 1 1
 4    4     1 0 1 2
 5    3     1 0 2 2

Loop 3

for i ← 2 to k
    do C[i] ← C[i] + C[i–1]          ⊳ C[i] = |{key ≤ i}|

C: 1 0 2 2        (counts from loop 2)

 i   C'[1 . . 4] after the step
 2   1 1 2 2
 3   1 1 3 2
 4   1 1 3 5

Loop 4

for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] – 1

A:  4 1 3 4 3
C': 1 1 3 5       (prefix sums from loop 3)

 j   A[j]   B[1 . . 5] after the step   C'[1 . . 4] after the step
 5    3     _ _ 3 _ _                   1 1 2 5
 4    4     _ _ 3 _ 4                   1 1 2 4
 3    3     _ 3 3 _ 4                   1 1 1 4
 2    1     1 3 3 _ 4                   0 1 1 4
 1    4     1 3 3 4 4                   0 1 1 3


B vs C

B:  1 3 3 4 4     (positions 1–5)
C': 1 1 3 5       (positions 1–4; prefix sums after loop 3)

In the end, the elements with key i occupy the range B[C'[i–1]+1 . . C'[i]], taking C'[0] = 0.
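The relationship between the prefix sums and the output positions can be checked directly. The sketch below (the name `key_ranges` is hypothetical) returns, for each key i, the 1-indexed inclusive range it occupies in B; a range whose first position exceeds its last is empty.

```python
from itertools import accumulate


def key_ranges(a, k):
    """Map each key i in 1..k to the 1-indexed range (first, last) it occupies in sorted order."""
    counts = [0] * (k + 1)
    for key in a:
        counts[key] += 1
    prefix = list(accumulate(counts))   # prefix[i] = number of keys <= i, and prefix[0] = 0
    return {i: (prefix[i - 1] + 1, prefix[i]) for i in range(1, k + 1)}


print(key_ranges([4, 1, 3, 4, 3], 4))
```

For the running example, key 3 occupies B[2 . . 3] and key 4 occupies B[4 . . 5], while key 2 gets the empty range (2, 1).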


Analysis

Θ(k)    for i ← 1 to k
            do C[i] ← 0

Θ(n)    for j ← 1 to n
            do C[A[j]] ← C[A[j]] + 1

Θ(k)    for i ← 2 to k
            do C[i] ← C[i] + C[i–1]

Θ(n)    for j ← n downto 1
            do B[C[A[j]]] ← A[j]
               C[A[j]] ← C[A[j]] – 1

Total: Θ(n + k)


Running time

If k = O(n), then counting sort takes Θ(n) time.
• But, sorting takes Ω(n lg n) time!
• Why?

Answer:
• Comparison sorting takes Ω(n lg n) time.
• Counting sort is not a comparison sort.
• In fact, not a single comparison between elements occurs!


Stable sorting

Counting sort is a stable sort: it preserves the input order among equal elements.

A: 4 1 3 4 3

B: 1 3 3 4 4
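Stability only becomes visible when equal keys carry distinguishable data. The sketch below attaches a letter to each key of the example; the record layout, the `key` parameter, and the function name are assumptions made for illustration.

```python
def counting_sort_by_key(records, k, key):
    """Stable counting sort of records whose key(record) is an integer in 1..k."""
    c = [0] * (k + 1)
    for rec in records:
        c[key(rec)] += 1
    for i in range(2, k + 1):
        c[i] += c[i - 1]
    out = [None] * len(records)
    for rec in reversed(records):     # right-to-left scan: later equal keys land later
        c[key(rec)] -= 1
        out[c[key(rec)]] = rec
    return out


records = [(4, "a"), (1, "b"), (3, "c"), (4, "d"), (3, "e")]
result = counting_sort_by_key(records, 4, key=lambda rec: rec[0])
print(result)
```

The two records with key 3 keep their input order (c before e), and so do the two with key 4 (a before d).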


Sorting integers

• We can sort n integers from {1, 2, …, k} in O(n + k) time.
• This is nice if k = O(n).
• What if, say, k = n^2?


Radix sort

• Origin: Herman Hollerith’s card-sorting machine for the 1890 U.S. Census. (See Appendix.)
• Digit-by-digit sort.
• Hollerith’s original (bad) idea: sort on most-significant digit first.
• Good idea: Sort on least-significant digit first with auxiliary stable sort.


Operation of radix sort

Input    Sorted on        Sorted on        Sorted on
         ones digit       tens digit       hundreds digit
3 2 9    7 2 0            7 2 0            3 2 9
4 5 7    3 5 5            3 2 9            3 5 5
6 5 7    4 3 6            4 3 6            4 3 6
8 3 9    4 5 7            8 3 9            4 5 7
4 3 6    6 5 7            3 5 5            6 5 7
7 2 0    3 2 9            4 5 7            7 2 0
3 5 5    8 3 9            6 5 7            8 3 9
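The three passes on the example can be reproduced with a short least-significant-digit-first sketch. Note the assumption: per-digit bucket lists stand in for the stable auxiliary sort here (appending preserves input order), rather than the counting sort used in the lecture, and the function name is hypothetical.

```python
def radix_sort_decimal(nums, digits):
    """LSD radix sort on decimal digits; returns (sorted list, list after each pass)."""
    passes = []
    for d in range(digits):                        # d = 0 is the ones digit
        buckets = [[] for _ in range(10)]
        for x in nums:
            buckets[(x // 10**d) % 10].append(x)   # append keeps equal digits in input order
        nums = [x for bucket in buckets for x in bucket]
        passes.append(nums)
    return nums, passes


result, passes = radix_sort_decimal([329, 457, 657, 839, 436, 720, 355], 3)
for p in passes:
    print(p)
```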


Correctness of radix sort

Induction on digit position
• Assume that the numbers are sorted by their low-order t – 1 digits.
• Sort on digit t.
  • Two numbers that differ in digit t are correctly sorted.
  • Two numbers equal in digit t are put in the same order as the input ⇒ correct order.

Before sorting on digit t    After sorting on digit t
7 2 0                        3 2 9
3 2 9                        3 5 5
4 3 6                        4 3 6
8 3 9                        4 5 7
3 5 5                        6 5 7
4 5 7                        7 2 0
6 5 7                        8 3 9
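The second case of the induction depends on the auxiliary sort being stable. The sketch below makes that dependence concrete with a hypothetical `stable` flag: when it is off, ties within a digit are deliberately emitted in reverse input order, and the result can come out wrong.

```python
def lsd_radix_sort(nums, digits, stable=True):
    """LSD radix sort on decimal digits; stable=False breaks ties in reverse input order."""
    for d in range(digits):
        buckets = [[] for _ in range(10)]
        for x in nums:
            buckets[(x // 10**d) % 10].append(x)
        if not stable:
            buckets = [b[::-1] for b in buckets]   # destroys the order set by earlier passes
        nums = [x for b in buckets for x in b]
    return nums


print(lsd_radix_sort([329, 355], 3))                 # stable auxiliary sort
print(lsd_radix_sort([329, 355], 3, stable=False))   # unstable auxiliary sort
```

With the unstable variant, 329 and 355 tie on the hundreds digit and the order established by the tens digit is lost.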


Analysis of radix sort

• Assume counting sort is the auxiliary stable sort.
• Sort n computer words of b bits each.
  E.g., if we sort elements in {1, …, n^2}, b = 2 lg n.
• Each word can be viewed as having b/r base-2^r digits.

Example: 32-bit word split into four 8-bit digits: | 8 | 8 | 8 | 8 |

r = 8 ⇒ b/r = 4 passes of counting sort on base-2^8 digits; or r = 16 ⇒ b/r = 2 passes of counting sort on base-2^16 digits.
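Viewing a word as base-2^r digits amounts to shifting and masking. The following sketch assumes non-negative integers of at most b bits; the function names are illustrative, and the per-pass sort is a counting-sort variant that uses starting offsets and a left-to-right scan, which is also stable.

```python
def digit(word, r, i):
    """Return the i-th base-2**r digit of word (i = 0 is least significant)."""
    return (word >> (r * i)) & ((1 << r) - 1)


def radix_sort_words(words, b=32, r=8):
    """LSD radix sort of b-bit non-negative integers using ceil(b / r) counting passes."""
    passes = -(-b // r)                       # ceil(b / r)
    for i in range(passes):
        start = [0] * ((1 << r) + 1)
        for w in words:
            start[digit(w, r, i) + 1] += 1
        for v in range(1 << r):
            start[v + 1] += start[v]          # start[v] = number of digits smaller than v
        out = [0] * len(words)
        for w in words:                       # left-to-right scan keeps the pass stable
            d = digit(w, r, i)
            out[start[d]] = w
            start[d] += 1
        words = out
    return words


sample = [0x12345678, 0xFFFFFFFF, 0, 0x0000FFFF, 0x00010000, 42]
print(radix_sort_words(sample, b=32, r=8))
```

With r = 8 the loop runs 4 passes; with r = 16 it runs 2, at the cost of a 2^16-entry table per pass.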


Analysis (continued)

Recall: Counting sort takes Θ(n + k) time to sort n numbers in the range from 0 to k – 1.
If each b-bit word is broken into r-bit pieces, each pass of counting sort takes Θ(n + 2^r) time. Since there are b/r passes, we have

    T(n, b) = Θ((b/r) (n + 2^r)).

Choose r to minimize T(n, b):
• Increasing r means fewer passes, but as r ≫ lg n, the time grows exponentially.


Choosing r

    T(n, b) = Θ((b/r) (n + 2^r))

Minimize T(n, b) by differentiating and setting to 0.
Or, just observe that we don’t want 2^r ≫ n, and there’s no harm asymptotically in choosing r as large as possible subject to this constraint.

Choosing r = lg n implies T(n, b) = Θ(bn/lg n).

• For numbers in the range from 0 to n^d – 1, we have b = d lg n ⇒ radix sort runs in Θ(dn) time.
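The trade-off in r can be explored numerically. The sketch below evaluates a simple operation-count model, ⌈b/r⌉ · (n + 2^r), with all hidden constants set to 1; that model and the function names are assumptions for illustration, not measured running times.

```python
from math import ceil, log2


def radix_cost(n, b, r):
    """Model cost of radix sort: (number of passes) * (n + 2**r), constants ignored."""
    return ceil(b / r) * (n + 2**r)


def best_r(n, b):
    """Digit width r in 1..b that minimizes the model cost."""
    return min(range(1, b + 1), key=lambda r: radix_cost(n, b, r))


n, b = 2000, 32
r = best_r(n, b)
print(r, radix_cost(n, b, r), ceil(log2(n)))
```

In this model the minimizing r stays at or below lg n, and the cost climbs steeply once 2^r dominates n. With the exact constants ignored, any r within a constant factor of lg n gives the same Θ(bn/lg n) bound.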


Conclusions

Example (32-bit numbers):
• At most 3 passes when sorting ≥ 2000 numbers.
• Merge sort and quicksort do at least ⌈lg 2000⌉ = 11 passes.

In practice, radix sort is fast for large inputs, as well as simple to code and maintain.

Downside: Unlike quicksort, radix sort displays little locality of reference.


Appendix: Punched-card technology

• Herman Hollerith (1860-1929)
• Punched cards
• Hollerith’s tabulating system
• Operation of the sorter
• Origin of radix sort
• “Modern” IBM card
• Web resources on punched-card technology


Herman Hollerith (1860-1929)

• The 1880 U.S. Census took almost 10 years to process.

• While a lecturer at MIT, Hollerith prototyped punched-card technology.

• His machines, including a “card sorter,” allowed the 1890 census total to be reported in 6 weeks.

• He founded the Tabulating Machine Company in 1896. It merged with other companies in 1911 to form the Computing-Tabulating-Recording Company, which was renamed International Business Machines in 1924.


Punched cards
• Punched card = data record.
• Hole = value.
• Algorithm = machine + human operator.


Hollerith’s tabulating system
• Pantograph card punch
• Hand-press reader
• Dial counters
• Sorting box


Operation of the sorter
• An operator inserts a card into the press.
• Pins on the press reach through the punched holes to make electrical contact with mercury-filled cups beneath the card.

• Whenever a particular digit value is punched, the lid of the corresponding sorting bin lifts.

• The operator deposits the card into the bin and closes the lid.

• When all cards have been processed, the front panel is opened, and the cards are collected in order, yielding one pass of a stable sort.


Origin of radix sort

Hollerith’s original 1889 patent alludes to a most-significant-digit-first radix sort:

“The most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations, then reassorting each group according to the second item entering into the combination, and so on, and finally counting on a few counters the last item of the combination for each group of cards.”

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators.


“Modern” IBM card

So, that’s why text windows have 80 columns!

• One character per column.


Web resources on punched-card technology

• Doug Jones’s punched card index
• Biography of Herman Hollerith
• The 1890 U.S. Census
• Early history of IBM
• Pictures of Hollerith’s inventions
• Hollerith’s patent application (borrowed from Gordon Bell’s CyberMuseum)
• Impact of punched cards on U.S. history