• date post

28-Jan-2016

Lecture 10. Hash Tables. Material outside the examinable syllabus for the 2007-2008 academic year. Hashing, Lecture 10: hashing methods and their implementation, hash tables, scatter tables.

### Transcript of Lecture 10

• Lecture 10: Hash Tables. Material outside the examinable syllabus for the 2007-2008 academic year.

• Hashing (Lecture 10)
  • hashing methods and their implementation
  • hash tables
  • scatter tables
  • Data Structures and Algorithms with Object-Oriented Design Patterns in Java, Bruno R. Preiss, John Wiley and Sons

• Rationale for Hashing
  • many applications involve storing and retrieving information
  • consider containers storing {key, value} pairs
  • hashing-based containers suit applications that frequently execute the basic operations find(key) and insert
  • items are not required to be ordered
  • main advantage: find and insert are O(1) in the average case

• Hashing Methods (Sections 8.1-8.3)
  • basic idea
  • hash functions: characteristics, common methods
  • dealing with arbitrary keys

• Array-Based Implementation of the Container
  • for clarity of presentation, consider the case when the container holds keys only
  • an array will hold some number of items drawn from a given set of keys K
  • the position of a key in the array is given by a hash function h()
  • in general |K| is large or even unbounded
  • the actual number of items stored in the container is typically much less than |K|, so use an array of size M
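To make the array-based idea concrete, here is a minimal sketch of a keys-only container in which a key's position is `h(key)`. The class name `SlotArray`, its methods, and the choice of `h` are illustrative assumptions, and collisions are only detected, not resolved.

```java
// Sketch only: a keys-only container backed by an array of size M.
// SlotArray, insert and contains are hypothetical names for illustration.
final class SlotArray {
    private final Integer[] slots;   // null marks an empty position

    SlotArray(int m) {
        slots = new Integer[m];
    }

    // Assumed hash function: maps any int key into [0, M).
    int h(int key) {
        return Math.floorMod(key, slots.length);
    }

    // Stores the key at position h(key). Returns false when that position
    // is already taken by a different key (a collision).
    boolean insert(int key) {
        int i = h(key);
        if (slots[i] != null && slots[i] != key) {
            return false;
        }
        slots[i] = key;
        return true;
    }

    // A key can only ever live at position h(key), so one probe suffices.
    boolean contains(int key) {
        Integer stored = slots[h(key)];
        return stored != null && stored == key;
    }
}
```

With M = 7, the keys 10 and 17 both map to position 3, so the second insert is rejected; handling that case is what the later hash table slides address.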

• Hash Functions
  • we need a function $h : K \to \{0, 1, \ldots, M-1\}$ that maps each key to an array position

• Characteristics of a Good Hash Function
  • a good hash function avoids collisions
  • it spreads keys evenly in the array
  • its computation is fast

• Hashing Methods
  • division method: $h(x) = x \bmod M$
  • middle-square method, with $W = 2^w$, where w is the word length
  • multiplication method
  • Fibonacci hashing
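The transcript names the middle-square method but its formula did not survive. The following is a minimal sketch, assuming the common formulation $h(x) = \lfloor (M/W)(x^2 \bmod W) \rfloor$ with $M = 2^k$; the class name and the constants are illustrative, not taken from the slides.

```java
// Sketch of the middle-square method under the assumptions stated above.
final class MiddleSquareSketch {
    static final int K = 10;        // assumed table size M = 2^K = 1024
    static final int W_BITS = 32;   // word length w, so W = 2^32

    // int multiplication wraps modulo 2^32, which yields x^2 mod W.
    // The unsigned shift then keeps the top K bits of that word,
    // which equals floor((M / W) * (x^2 mod W)).
    static int h(int x) {
        return (x * x) >>> (W_BITS - K);
    }
}
```

The bits that are kept come from the middle of the full double-width square, which is where the method gets its name.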

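• Implementing the Division Method

    public class DivisionMethod {
        static final int M = 1031; // table size, a prime

        // floorMod keeps the result in [0, M) for every int, including
        // Integer.MIN_VALUE, for which Math.abs(x) % M would be negative
        public static int h(int x) {
            return Math.floorMod(x, M);
        }
    }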

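• Implementing the Multiplication & Fibonacci Methods

    public class MultiplicationMethod {
        static final int k = 10;                 // M == 2^k == 1024
        static final int w = 32;                 // word length in bits
        static final int a = (int) 2654435769L;  // about 2^32 / golden ratio (Fibonacci hashing)

        // x * a wraps modulo 2^32; the unsigned shift keeps the top k bits
        public static int h(int x) {
            return (x * a) >>> (w - k);
        }
    }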
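As a quick illustration of why this multiplier is popular, the sketch below applies the same formula to a few consecutive keys. The class name `FibonacciDemo` and the `main` driver are illustrative additions; the constants are the ones used in the slide.

```java
// Sketch: consecutive keys land in well-separated slots under Fibonacci hashing.
final class FibonacciDemo {
    static final int K = 10;                  // M = 2^K = 1024
    static final int W_BITS = 32;             // word length in bits
    static final int A = (int) 2654435769L;   // about 2^32 / golden ratio

    // Same computation as MultiplicationMethod.h in the slide.
    static int h(int x) {
        return (x * A) >>> (W_BITS - K);
    }

    public static void main(String[] args) {
        // Print the slot chosen for each of the keys 1..5.
        for (int x = 1; x <= 5; x++) {
            System.out.println(x + " -> " + h(x));
        }
    }
}
```

Because the result is the top k bits of a 32-bit product, it always lies in the range 0 to 1023 regardless of the sign of `x`.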

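• Dealing With Arbitrary Keys
  • what if keys are not integers?
  • function f maps keys into non-negative integers: $f : K \to \mathbb{Z}_{\ge 0}$
  • function g maps non-negative integers into array positions: $g : \mathbb{Z}_{\ge 0} \to \{0, 1, \ldots, M-1\}$
  • the hash function h is simply the composition of the two: $h(x) = g(f(x))$
  • [illustration]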
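The two-stage construction can be sketched in code as follows. `ComposedHash` is a hypothetical helper, not a class from the lecture; it takes `f` as a parameter and assumes `g` is the division method.

```java
import java.util.function.IntUnaryOperator;
import java.util.function.ToIntFunction;

// Sketch: h(x) = g(f(x)) for keys of an arbitrary type K.
final class ComposedHash<K> {
    private final ToIntFunction<K> f;   // key -> integer
    private final IntUnaryOperator g;   // integer -> position in [0, M)

    ComposedHash(ToIntFunction<K> f, int m) {
        this.f = f;
        // Assumption: g is the division method. floorMod also copes with
        // a negative f(x), which can arise from int overflow.
        this.g = v -> Math.floorMod(v, m);
    }

    int h(K key) {
        return g.applyAsInt(f.applyAsInt(key));
    }
}
```

For example, `new ComposedHash<String>(String::length, 5).h("hashing")` evaluates f to 7 and g to 2. String length is a deliberately weak `f`, used here only to keep the example small.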

• Hashing of Strings
  • view a character string s as a sequence of n characters, $\{s_0, s_1, \ldots, s_{n-1}\}$
  • a simple (but very poor) hash function is the sum of the character codes, $f(s) = \sum_{i=0}^{n-1} s_i$
  • e.g., for all possible ASCII strings of length five, $0 \le f(s) \le 640$
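A minimal sketch of why such a function is poor, assuming the simple function meant here is the sum of the character codes (which matches the stated range for five ASCII characters). The class name `CharSumHash` is illustrative.

```java
// Sketch: summing character codes ignores character order,
// so every permutation of a string hashes to the same value.
final class CharSumHash {
    // f(s) = s_0 + s_1 + ... + s_{n-1}
    static int f(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += s.charAt(i);
        }
        return sum;
    }
}
```

Both `f("abc")` and `f("cba")` evaluate to 97 + 98 + 99 = 294, and the values for short strings cluster in a narrow range, so most of a large table would go unused.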

• Better String Hash Function
  • suppose we know a priori that all strings have length four
  • let B = 256
  • we can compute a 32-bit quantity like this: $f(s) = s_0 B^3 + s_1 B^2 + s_2 B + s_3$
  • this function spreads out the values better
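The four-character formula amounts to packing one character per byte of a 32-bit word. A minimal sketch, assuming every character code is below 256; the class name `FourCharHash` is illustrative.

```java
// Sketch: pack a four-character string into one 32-bit value.
final class FourCharHash {
    static final int B = 256;

    // f(s) = s0*B^3 + s1*B^2 + s2*B + s3, accumulated left to right.
    // Assumes exactly four characters, each with a code below 256.
    static int f(String s) {
        if (s.length() != 4) {
            throw new IllegalArgumentException("expected exactly four characters");
        }
        int value = 0;
        for (int i = 0; i < 4; i++) {
            value = value * B + s.charAt(i);
        }
        return value;
    }
}
```

Unlike a plain sum, this is sensitive to order: `f("abcd")` is `0x61626364`, while `f("dcba")` is `0x64636261`.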

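• Dealing with Arbitrary Length Strings
  • generalize the preceding approach: $f(s) = \sum_{i=0}^{n-1} B^{n-i-1} s_i$
  • this computes a unique integer for every possible string
  • unfortunately the range of f(s) is unbounded
  • a simple modification bounds the range: $f(s) = \left(\sum_{i=0}^{n-1} B^{n-i-1} s_i\right) \bmod W$, where $W = 2^w$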
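One way to evaluate the generalized polynomial is Horner's rule. The sketch below assumes the bounded variant reduces the result modulo $W = 2^{32}$, which Java's `int` arithmetic does implicitly on overflow; the class name `PolyStringHash` is illustrative.

```java
// Sketch: polynomial string hash evaluated with Horner's rule.
final class PolyStringHash {
    static final int B = 256;

    // Computes (s0*B^(n-1) + ... + s_(n-1)) mod 2^32.
    // The reduction is implicit: int arithmetic wraps on overflow.
    static int f(String s) {
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            value = value * B + s.charAt(i);
        }
        return value;
    }
}
```

A consequence worth noticing: because $256^4 = 2^{32}$, with B = 256 only the last four characters influence the result, so `f("xxabcd")` equals `f("abcd")`. That weakness is one motivation for a more elaborate function.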

• More Elaborate Hash Function

• Hash Tables (Section 8.4)
  • hash tables
  • main difference: dealing with collisions
  • separate chaining hash tables [illustration]
  • running time analysis: worst case, average case
  • scatter tables [illustration]: chained, open addressing
  • running time analysis

• Hash Tables
  • a hash table is a searchable container that implements the HashTable interface
  • the AbstractHashTable class is an abstract class from which various implementations are derived
  • method getLength (returns M)
  • methods f, g, and h, where h(x) = g(f(x))
  • the implementation shown uses the division method of hashing

• Separate Chaining Hash Table
  • uses separate chaining to resolve collisions
  • the hash table is implemented as an array of linked lists
  • the linked list into which an item is to be inserted is determined by hashing that item
  • the item is inserted into the linked list by appending it
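The description above can be sketched as follows. This is not the lecture's ChainedHashTable class: `ChainedTableSketch` is a hypothetical, keys-only simplification for `int` keys that assumes the division method.

```java
import java.util.LinkedList;

// Sketch: separate chaining with an array of linked lists.
final class ChainedTableSketch {
    private final LinkedList<Integer>[] chains;

    @SuppressWarnings("unchecked")
    ChainedTableSketch(int m) {
        chains = new LinkedList[m];
        for (int i = 0; i < m; i++) {
            chains[i] = new LinkedList<>();
        }
    }

    // Assumed hash function: division method on int keys.
    private int h(int key) {
        return Math.floorMod(key, chains.length);
    }

    // Hash to pick the list, then append the key to it.
    void insert(int key) {
        chains[h(key)].addLast(key);
    }

    // Only the one list selected by h(key) has to be searched.
    boolean find(int key) {
        return chains[h(key)].contains(key);
    }

    // Removes one occurrence of the key; returns false if it was absent.
    boolean withdraw(int key) {
        return chains[h(key)].remove((Integer) key);
    }

    int chainLength(int slot) {
        return chains[slot].size();
    }
}
```

With M = 16, the keys 5, 21 and 37 all hash to slot 5 and end up in the same list, in insertion order.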

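• Example: Keys and their Hash Values (Prog. 8.3), in octal

| key k | f(k) (octal) | $h(k) = f(k) \bmod 20_8$ (octal) |
|-------|--------------|----------------------------------|
| ett   | 01446564     | 04 |
| tva   | 01656545     | 05 |
| tre   | 01656345     | 05 |
| fyra  | 0147706341   | 01 |
| fem   | 01474455     | 15 |
| sex   | 01624470     | 10 |
| sju   | 01625365     | 05 |
| atta  | 0344656541   | 01 |
| nio   | 01575057     | 17 |
| tio   | 01655057     | 17 |
| elva  | 044557741    | 01 |
| tolv  | 065565566    | 06 |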
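The last column of this example can be checked mechanically, since Java accepts octal integer literals (written with a leading 0). The sketch below copies the f(k) values from the example; the class name `OctalExample` and its `h` helper are illustrative.

```java
// Sketch: recompute h(k) = f(k) mod 20 (octal) for the twelve example keys.
final class OctalExample {
    static final String[] KEYS = {
        "ett", "tva", "tre", "fyra", "fem", "sex",
        "sju", "atta", "nio", "tio", "elva", "tolv"
    };

    // f(k) values as octal literals, copied from the example.
    static final int[] F = {
        01446564, 01656545, 01656345, 0147706341, 01474455, 01624470,
        01625365, 0344656541, 01575057, 01655057, 044557741, 065565566
    };

    // 020 is octal for 16, the table size M in this example.
    static int h(int f) {
        return f % 020;
    }

    public static void main(String[] args) {
        for (int i = 0; i < KEYS.length; i++) {
            System.out.println(KEYS[i] + " -> " + Integer.toOctalString(h(F[i])));
        }
    }
}
```

Since M = 16 is a power of two, h(k) is just the low four bits of f(k), which is why several keys (tva, tre, sju) share slot 05.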

• key k and h(k) in octal: ett 04, tva 05, tre 05, fyra 01, fem 15, sex 10, sju 05, atta 01, nio 17, tio 17, elva 01, tolv 06

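• ChainedHashTable Running Time (Worst Case)
  • here $T_{\text{hash}}$ is the time to hash a key and $T_{\text{eq}}$ the time to compare two keys
  • constructor: $O(M)$
  • purge: $O(M)$
  • insert: $T_{\text{hash}} + O(1)$
  • withdraw: $T_{\text{hash}} + O(n)$
  • find: $O(1) + T_{\text{hash}} + n\,T_{\text{eq}} + O(n)$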

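• Average Case Analysis
  • consider a hash table of size M
  • let there be exactly n items in the hash table
  • the quantity $\lambda = n/M$ is called the load factor
  • the load factor is the average length of a linked list!

• Average Running Times
  • unsuccessful search: $O(1) + T_{\text{hash}} + \lambda\,T_{\text{eq}} + O(\lambda)$
  • successful search: $O(1) + T_{\text{hash}} + \frac{\lambda + 1}{2}\,T_{\text{eq}} + O(\lambda)$
  • if one could guarantee $\lambda \le 1$, with constant-time hashing ($T_{\text{hash}} = O(1)$) and key comparison ($T_{\text{eq}} = O(1)$), then all operations would be O(1) on average
  • to guarantee this, the table must be resized as items are added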
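One way to keep the load factor bounded is to grow the table whenever n/M exceeds 1. The sketch below assumes a doubling policy and the division method, neither of which the slides specify; `ResizingChainedTable` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a chained table that doubles M whenever the load factor exceeds 1.
final class ResizingChainedTable {
    private List<List<Integer>> chains = newChains(4);   // assumed initial M
    private int count;                                    // n, the number of items

    private static List<List<Integer>> newChains(int m) {
        List<List<Integer>> c = new ArrayList<>(m);
        for (int i = 0; i < m; i++) {
            c.add(new ArrayList<>());
        }
        return c;
    }

    // Load factor n / M, i.e. the average chain length.
    double loadFactor() {
        return (double) count / chains.size();
    }

    int capacity() {
        return chains.size();
    }

    void insert(int key) {
        chains.get(Math.floorMod(key, chains.size())).add(key);
        count++;
        if (loadFactor() > 1.0) {
            resize(2 * chains.size());   // assumed policy: double the table
        }
    }

    // Every key is rehashed, because h depends on the table size.
    private void resize(int newM) {
        List<List<Integer>> old = chains;
        chains = newChains(newM);
        for (List<Integer> chain : old) {
            for (int key : chain) {
                chains.get(Math.floorMod(key, newM)).add(key);
            }
        }
    }
}
```

Doubling keeps M a power of two, which is a simplification here: the division method shown earlier in the lecture uses a prime table size.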