Exercise Session 4 – Associative Data Structures

Post on 07-May-2022

3 views 0 download

Transcript of Exercise Session 4 – Associative Data Structures

Exercise Session 4 – Associative DataStructuresComputer Science II, D-ITET, ETH Zurich

Program Today

Feedback of last exercise

Repetition theoryAVL ConditionAVL InsertHashing

Programming Task

1

1. Feedback of last exercise

2

2. Repetition theory

3

Comparison of binary Trees

Search trees HeapsMin- / Max- Heap

Balanced trees AVL,red-black tree

in C++: std::make_heap std::map

3

4

5

7

9

16

1

2

235

7 9

16

1

4

2

3

4

5

7

9

16

1

Insertion Θ(h(T )) Θ(log n) Θ(log n)Search Θ(h(T )) Θ(n) (!!) Θ(log n)

Deletion Θ(h(T )) Search + Θ(log n) Θ(log n)

Recall: Θ(log n)≤ Θ(h(T ))≤ Θ(n)

4

Comparison of binary Trees

Search trees HeapsMin- / Max- Heap

Balanced trees AVL,red-black tree

in C++: std::make_heap std::map

3

4

5

7

9

16

1

2

235

7 9

16

1

4

2

3

4

5

7

9

16

1

Insertion Θ(h(T )) Θ(log n) Θ(log n)Search Θ(h(T )) Θ(n) (!!) Θ(log n)

Deletion Θ(h(T )) Search + Θ(log n) Θ(log n)Recall: Θ(log n)≤ Θ(h(T ))≤ Θ(n)

4

AVL Condition

AVL Condition: for eacn node v of a treebal(v) ∈ {−1, 0, 1}

v

Tl(v)

Tr(v)

h h + 1

h + 2

5

Balance at Insertion Point

=⇒

+1 0p p

n

case 1: bal(p) = +1

=⇒

−1 0p p

n

case 2: bal(p) = −1

Finished in both cases because the subtree height did not change

6

Balance at Insertion Point

=⇒

0 +1p p

n

case 3.1: bal(p) = 0 right

=⇒

0 −1p p

n

case 3.2: bal(p) = 0, left

Not finished in both case. Call of upin(p)

7

upin(p) - invariant

When upin(p) is called it holds thatthe subtree from p is grown andbal(p) ∈ {−1, +1}

8

upin(p)

Assumption: p is left son of pp1

=⇒

pp +1 pp 0

p p

case 1: bal(pp) = +1, done.

=⇒

pp 0 pp −1

p p

case 2: bal(pp) = 0, upin(pp)

In both cases the AVL-Condition holds for the subtree from pp

1If p is a right son: symmetric cases with exchange of +1 and −19

upin(p)

Assumption: p is left son of pp

pp −1

p

case 3: bal(pp) = −1,

This case is problematic: adding n to the subtree from pp has violated theAVL-condition. Re-balance!Two cases bal(p) = −1, bal(p) = +1

10

Rotationscase 1.1 bal(p) = −1. 2

y

x

t1

t2

t3

pp −2

p −1

h

h− 1

h− 1

h + 2 h

=⇒rotation

right

x

y

t1 t2 t3

pp 0

p 0

h h− 1 h− 1

h + 1 h + 1

2p right son: ⇒ bal(pp) = bal(p) = +1, left rotation11

Rotationscase 1.1 bal(p) = −1. 3

z

x

y

t1t2 t3

t4

pp −2

p +1

h −1/ + 1

h− 1

h− 1h− 2

h− 2h− 1

h− 1

h + 2 h

=⇒doublerotationleft-right

y

x z

t1

t2 t3t4

pp 0

0/− 1 +1/0

h− 1 h− 1h− 2

h− 2h− 1

h− 1

h + 1

3p right son⇒ bal(pp) = +1, bal(p) = −1, double rotation right left12

Quiz

In the following AVL tree, insert key 12 and rebalance (as shown in class).What does the AVL tree look like after the operation that has been shownin class?

30

10

3

1

17

14 19

50

40 60

13

Solution

17

10

3

1

14

12

30

19 50

40 60

14

Hashing well-done

Useful Hashing. . .distributes the keys as uniformly as possible in the hash table.avoids probing over long areas of used entries(e.g. primary clustering).

15

Hashing Examples

Insert the keys 25, 4, 17, 45 into the hash table, using the functionh(k) = k mod 7 and probing to the right, h(k) + s(j, k):

linear probing,s(j, k) = j.Double Hashing,s(j, k) = j · (1 + (k mod 5)).

0 1 2 3 4 5 6

25 417 45

254 17 45

16

Hashing Examples

Insert the keys 25, 4, 17, 45 into the hash table, using the functionh(k) = k mod 7 and probing to the right, h(k) + s(j, k):

linear probing,s(j, k) = j.Double Hashing,s(j, k) = j · (1 + (k mod 5)).

0 1 2 3 4 5 6

25

417 45

254 17 45

16

Hashing Examples

Insert the keys 25, 4, 17, 45 into the hash table, using the functionh(k) = k mod 7 and probing to the right, h(k) + s(j, k):

linear probing,s(j, k) = j.Double Hashing,s(j, k) = j · (1 + (k mod 5)).

0 1 2 3 4 5 6

25 4

17 45

254 17 45

16

Hashing Examples

Insert the keys 25, 4, 17, 45 into the hash table, using the functionh(k) = k mod 7 and probing to the right, h(k) + s(j, k):

linear probing,s(j, k) = j.Double Hashing,s(j, k) = j · (1 + (k mod 5)).

0 1 2 3 4 5 6

25 417

45

254 17 45

16

Hashing Examples

Insert the keys 25, 4, 17, 45 into the hash table, using the functionh(k) = k mod 7 and probing to the right, h(k) + s(j, k):

linear probing,s(j, k) = j.Double Hashing,s(j, k) = j · (1 + (k mod 5)).

0 1 2 3 4 5 6

25 417 45

254 17 45

16

Hashing Examples

Insert the keys 25, 4, 17, 45 into the hash table, using the functionh(k) = k mod 7 and probing to the right, h(k) + s(j, k):

linear probing,s(j, k) = j.Double Hashing,s(j, k) = j · (1 + (k mod 5)).

0 1 2 3 4 5 6

25 417 45

25

4 17 45

16

Hashing Examples

Insert the keys 25, 4, 17, 45 into the hash table, using the functionh(k) = k mod 7 and probing to the right, h(k) + s(j, k):

linear probing,s(j, k) = j.Double Hashing,s(j, k) = j · (1 + (k mod 5)).

0 1 2 3 4 5 6

25 417 45

254

17 45

16

Hashing Examples

Insert the keys 25, 4, 17, 45 into the hash table, using the functionh(k) = k mod 7 and probing to the right, h(k) + s(j, k):

linear probing,s(j, k) = j.Double Hashing,s(j, k) = j · (1 + (k mod 5)).

0 1 2 3 4 5 6

25 417 45

254 17

45

16

Hashing Examples

Insert the keys 25, 4, 17, 45 into the hash table, using the functionh(k) = k mod 7 and probing to the right, h(k) + s(j, k):

linear probing,s(j, k) = j.Double Hashing,s(j, k) = j · (1 + (k mod 5)).

0 1 2 3 4 5 6

25 417 45

254 17 45

16

3. Programming Task

17

Finding a Sub-Array

Given: two integer arrays A = (a0, . . . , an−1) and B = (b0, . . . , bk−1)Task: Find position of B in A.

Naive: Loop through A, check whether the following k entries match B.

O(nk) comparison operations

Solution using hashing: Calculate hash h(B) and compare it toh((ai, ai+1, . . . , ai+k−1)).Avoid re-computing h((ai, ai+1, . . . , ai + k − 1) for each i =⇒ O(n)expected

18

Finding a Sub-Array

Given: two integer arrays A = (a0, . . . , an−1) and B = (b0, . . . , bk−1)Task: Find position of B in A.Naive: Loop through A, check whether the following k entries match B.

O(nk) comparison operations

Solution using hashing: Calculate hash h(B) and compare it toh((ai, ai+1, . . . , ai+k−1)).Avoid re-computing h((ai, ai+1, . . . , ai + k − 1) for each i =⇒ O(n)expected

18

Finding a Sub-Array

Given: two integer arrays A = (a0, . . . , an−1) and B = (b0, . . . , bk−1)Task: Find position of B in A.Naive: Loop through A, check whether the following k entries match B.

O(nk) comparison operations

Solution using hashing: Calculate hash h(B) and compare it toh((ai, ai+1, . . . , ai+k−1)).Avoid re-computing h((ai, ai+1, . . . , ai + k − 1) for each i =⇒ O(n)expected

18

Finding a Sub-Array

Given: two integer arrays A = (a0, . . . , an−1) and B = (b0, . . . , bk−1)Task: Find position of B in A.Naive: Loop through A, check whether the following k entries match B.

O(nk) comparison operations

Solution using hashing: Calculate hash h(B) and compare it toh((ai, ai+1, . . . , ai+k−1)).Avoid re-computing h((ai, ai+1, . . . , ai + k − 1) for each i =⇒ O(n)expected

18

Sliding Window Hash

Possible hash function: sum of all elements:

Can be updated easily: subtract ai and add ai+k.However: bad hash function

Better:

Hc,m((ai, · · · , ai+k−1)) =k−1∑

j=0ai+j · ck−j−1

mod m

c = 1021 prime numberm = 215 int, no overflows at calculations

19

Sliding Window Hash

Possible hash function: sum of all elements:

Can be updated easily: subtract ai and add ai+k.However: bad hash function

Better:

Hc,m((ai, · · · , ai+k−1)) =k−1∑

j=0ai+j · ck−j−1

mod m

c = 1021 prime numberm = 215 int, no overflows at calculations

19

Computing with Modulo

(a + b) mod m = ((a mod m) + (b mod m)) mod m

(a− b) mod m = ((a mod m)− (b mod m) + m) mod m

(a · b) mod m = ((a mod m) · (b mod m)) mod m

Exercise: Compute

12746357 mod 11

20

Computing Modulo

Exercise: Compute

12746357 mod 11

= (7 + 5 · 10 + 3 · 102 + 6 · 103 + 4 · 104 + 7 · 105 + 2 · 106 + 1 · 107) mod 11= (7 + 50 + 3 + 60 + 4 + 70 + 2 + 10) mod 11= (7 + 6 + 3 + 5 + 4 + 4 + 2 + 10) mod 11= 8 mod 11.

For the second equality we used the fact that 102 mod 11 = 1.

21

Computing Modulo

Exercise: Compute

12746357 mod 11= (7 + 5 · 10 + 3 · 102 + 6 · 103 + 4 · 104 + 7 · 105 + 2 · 106 + 1 · 107) mod 11

= (7 + 50 + 3 + 60 + 4 + 70 + 2 + 10) mod 11= (7 + 6 + 3 + 5 + 4 + 4 + 2 + 10) mod 11= 8 mod 11.

For the second equality we used the fact that 102 mod 11 = 1.

21

Computing Modulo

Exercise: Compute

12746357 mod 11= (7 + 5 · 10 + 3 · 102 + 6 · 103 + 4 · 104 + 7 · 105 + 2 · 106 + 1 · 107) mod 11= (7 + 50 + 3 + 60 + 4 + 70 + 2 + 10) mod 11

= (7 + 6 + 3 + 5 + 4 + 4 + 2 + 10) mod 11= 8 mod 11.

For the second equality we used the fact that 102 mod 11 = 1.

21

Computing Modulo

Exercise: Compute

12746357 mod 11= (7 + 5 · 10 + 3 · 102 + 6 · 103 + 4 · 104 + 7 · 105 + 2 · 106 + 1 · 107) mod 11= (7 + 50 + 3 + 60 + 4 + 70 + 2 + 10) mod 11= (7 + 6 + 3 + 5 + 4 + 4 + 2 + 10) mod 11

= 8 mod 11.

For the second equality we used the fact that 102 mod 11 = 1.

21

Computing Modulo

Exercise: Compute

12746357 mod 11= (7 + 5 · 10 + 3 · 102 + 6 · 103 + 4 · 104 + 7 · 105 + 2 · 106 + 1 · 107) mod 11= (7 + 50 + 3 + 60 + 4 + 70 + 2 + 10) mod 11= (7 + 6 + 3 + 5 + 4 + 4 + 2 + 10) mod 11= 8 mod 11.

For the second equality we used the fact that 102 mod 11 = 1.

21

Sliding Window Hash

template<typename It1, typename It2>It1 findOccurrence(const It1 from, const It1 to,

const It2 begin, const It2 end){

const unsigned k = end - begin;const unsigned M = 32768;const unsigned C = 1021;

// your code here// ...

22

Sliding Window Hash

// elements can be compared using std::equal:if(std::equal(window_left, window_right, begin, end))

return current;

// if no occurrence is found return end of arrayreturn to;

}

23

Sliding Window Hash

Make sure thatthe algorithm computes ck only once,all computations are modulo m for all values in order not to get anoverflow (recall the rules of modular arithmetic), andthe values are always positive (e.g., by adding multiples of m).

24