2. Outline
- Hash Table Overview
- Hashing Overview
- Add Items
- Remove Items
- Search For Items
- Enumerate Items
3. Hash Table Overview
- Associative Array
- Storage of Key / Value Pairs
- Like an array, but the index (the key) can be any comparable type
- Each Key is unique, though different Keys can point to the same value
- The Key is mapped to an Index
4. Hashed???
- Hashing derives a fixed-size result from an input
  - Every hash returns the same size and type
- Stable
  - The same input ALWAYS generates the same output
- Uniform
  - Hash values should be uniformly distributed through the available space (though perfect uniformity is impossible)
- Efficient
  - The cost of generating a hash must be balanced with application needs
- Secure
  - The cost of finding data that produces a given hash is prohibitive
5. Hashing A String
- Naïve implementation
  - Sum the ASCII value of each character
    f (102) + o (111) + o (111) = 324
- Pros
  - Stable
  - Efficient
- Cons
  - Not Uniform
    - AdditiveHash(“foo”) = AdditiveHash(“oof”)
  - Not Secure
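The additive hash above can be sketched in a few lines. This is a minimal Python sketch, not the slides' own implementation; `additive_hash` is a hypothetical name mirroring `AdditiveHash`, and the input is assumed to be ASCII.

```python
def additive_hash(text: str) -> int:
    """Sum the character codes of text (hypothetical stand-in for AdditiveHash)."""
    return sum(ord(ch) for ch in text)


print(additive_hash("foo"))  # 102 + 111 + 111 = 324
print(additive_hash("foo") == additive_hash("oof"))  # True: anagrams always collide
```

Because addition ignores character order, every rearrangement of the same letters lands on the same value, which is the uniformity problem listed under Cons.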
6. Hashing A String
- Somewhat better
  - “Folds” the bytes of every four characters into a 32-bit integer and adds it to a running total, which wraps on overflow
    Chunk    Value        Running total
    “lore”   1701998444   1701998444
    “m ip”   1885937773   -707031079
    “sum ”   544044403    -162986676
    “dolo”   1869377380   1706390704
    “r”      114          1706390818
- Pros
  - Stable, Efficient and better uniformity
- Cons
  - Not secure (can be treated essentially as additive)
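A folding hash along these lines might look like the sketch below. It is written in Python under two assumptions inferred from the slide's numbers rather than stated there: each four-character chunk is read as a little-endian integer, and the running total behaves like a signed 32-bit integer. `folding_hash` is a hypothetical name.

```python
def folding_hash(text: str) -> int:
    """Fold each 4-byte chunk into an integer and add with 32-bit wraparound."""
    data = text.encode("ascii")  # assumes ASCII input
    total = 0
    for start in range(0, len(data), 4):
        chunk = data[start:start + 4]  # the last chunk may be shorter than 4 bytes
        # Python integers never overflow, so mask to emulate a 32-bit register.
        total = (total + int.from_bytes(chunk, "little")) & 0xFFFFFFFF
    # Reinterpret the unsigned result as a signed 32-bit value.
    return total - 0x100000000 if total >= 0x80000000 else total


print(folding_hash("lorem ipsum dolor"))  # 1706390818
print(folding_hash("abcdefgh") == folding_hash("efghabcd"))  # True: swapped chunks collide
```

The second line shows why the slide calls this "essentially additive": reordering whole four-character chunks does not change the sum.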
7. Hashing Functions
- There are lots of good hashing algorithms; you don’t have to write your own.
- Pick the right hash for the job at hand (MD5 and SHA-2, for example, ship with the .NET Framework in System.Security.Cryptography)
Name      Stable  Uniform  Efficient  Secure
Additive  ✔                ✔
Folding   ✔       ✔        ✔
CRC32     ✔       ✔        ✔
MD5       ✔       ✔
SHA-2     ✔       ✔                   ✔
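To illustrate using existing algorithms instead of writing your own, here is a small sketch with Python's standard library (`zlib.crc32`, `hashlib.md5`, `hashlib.sha256`) standing in for the .NET classes the slide has in mind. SHA-256 is used as one member of the SHA-2 family; the input `b"foo"` is just an example.

```python
import hashlib
import zlib

data = b"foo"

crc = zlib.crc32(data)                         # unsigned 32-bit integer
md5_hex = hashlib.md5(data).hexdigest()        # 128-bit digest, 32 hex characters
sha256_hex = hashlib.sha256(data).hexdigest()  # 256-bit digest, 64 hex characters

print(f"CRC32   {crc:08x}")
print(f"MD5     {md5_hex}")
print(f"SHA-256 {sha256_hex}")
```

Whatever the input length, each function returns a result of the same size, which is the fixed-size property described earlier.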
8. Hash Table Overview
- Adding Jane
  - int index = GetIndex(Jane.Name);
  - _array[index] = Jane;
- What does GetIndex() do? It hashes the string and maps the hash to an index in the array
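One plausible shape for `GetIndex()` is sketched below in Python. The slides do not show how the hash becomes an index; taking the hash modulo the array length is an assumption here, as are the names `get_index` and `additive_hash` and the table size of 8.

```python
def additive_hash(text: str) -> int:
    """Simple deterministic hash used only for illustration."""
    return sum(ord(ch) for ch in text)


def get_index(key: str, slot_count: int) -> int:
    """Map a key to a slot: hash it, then wrap into the array bounds (assumed)."""
    return additive_hash(key) % slot_count


slots = [None] * 8            # backing array, like _array on the slide
jane = {"name": "Jane"}

index = get_index(jane["name"], len(slots))
slots[index] = jane
print(index)  # (74 + 97 + 110 + 101) % 8 = 382 % 8 = 6
```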
10. Handling Collisions
- Two distinct items have the same hash value
  - Items are assigned to the same index in the hash table
- Two common strategies
  - Open Addressing
    - Moving to the next index in the table
  - Chaining
    - Storing items in a linked list
- Frequency of Collisions depends on
  - # of slots in the array
  - # of filled slots in the array
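The two strategies can be compared side by side on a deliberate collision. This is a Python sketch under stated assumptions: linear probing (step of one, wrapping at the end) for open addressing, Python lists standing in for linked lists, and hypothetical helper names throughout.

```python
def get_index(key: str, slot_count: int) -> int:
    """Additive hash wrapped into the array bounds (illustrative only)."""
    return sum(ord(ch) for ch in key) % slot_count


def add_open_addressing(slots, key, value):
    """On a collision, move to the next index until a free slot is found."""
    start = get_index(key, len(slots))
    for step in range(len(slots)):
        i = (start + step) % len(slots)
        if slots[i] is None or slots[i][0] == key:
            slots[i] = (key, value)
            return i
    raise RuntimeError("table is full")


def add_chaining(buckets, key, value):
    """Each slot holds a list; colliding items share that list."""
    i = get_index(key, len(buckets))
    for n, (existing_key, _) in enumerate(buckets[i]):
        if existing_key == key:
            buckets[i][n] = (key, value)  # replace the value for an existing key
            return i
    buckets[i].append((key, value))
    return i


open_slots = [None] * 8
print(add_open_addressing(open_slots, "foo", 1))  # 324 % 8 = 4
print(add_open_addressing(open_slots, "oof", 2))  # collides at 4, lands at 5

buckets = [[] for _ in range(8)]
add_chaining(buckets, "foo", 1)
add_chaining(buckets, "oof", 2)
print(len(buckets[4]))  # 2: both items share slot 4

# Filled slots divided by total slots; collisions become likelier as this grows.
load_factor = sum(s is not None for s in open_slots) / len(open_slots)
print(load_factor)  # 0.25
```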
11. Finding Items
- Items are found by key
  - Person p = HashTable.Find(“Jane”)
- Open Addressing
  - Get the index of the key
  - If the value != null
    - If keys match, return the value
    - If keys don’t match, check the next index
  - If the value == null, the key is not in the table
- Chaining
  - Get the index of the key
  - Find the key in the linked list at that index
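The lookup steps above can be sketched as follows. This Python sketch assumes linear probing for open addressing and uses Python lists in place of linked lists; the function names and the hand-built tables are hypothetical.

```python
def get_index(key: str, slot_count: int) -> int:
    """Additive hash wrapped into the array bounds (illustrative only)."""
    return sum(ord(ch) for ch in key) % slot_count


def find_open_addressing(slots, key):
    """Probe from the key's index until the key or an empty slot is found."""
    start = get_index(key, len(slots))
    for step in range(len(slots)):
        entry = slots[(start + step) % len(slots)]
        if entry is None:
            return None          # empty slot: the key is not in the table
        if entry[0] == key:
            return entry[1]      # keys match
    return None                  # probed every slot without a match


def find_chaining(buckets, key):
    """Scan only the list stored at the key's index."""
    for existing_key, value in buckets[get_index(key, len(buckets))]:
        if existing_key == key:
            return value
    return None


# "foo" and "oof" both hash to index 4, so "oof" sits one slot further on.
slots = [None] * 8
slots[4], slots[5] = ("foo", 1), ("oof", 2)
print(find_open_addressing(slots, "oof"))  # 2
print(find_open_addressing(slots, "ofo"))  # None

buckets = [[] for _ in range(8)]
buckets[4] = [("foo", 1), ("oof", 2)]
print(find_chaining(buckets, "oof"))  # 2
```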
12. Removing Items
- Items are removed by key
  - HashTable.Remove(“Jane”)
- Open Addressing
  - Get the index of the key
  - If the value != null
    - If keys match, remove
    - If keys don’t match, check the next index
- Chaining
  - Get the index of the key
  - Remove the item from the linked list at that index
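A removal sketch in Python follows. One detail is an addition not taken from the slides: with open addressing, simply clearing a slot would cut off items that were probed past it, so this sketch leaves a `TOMBSTONE` marker instead. Linear probing, Python lists in place of linked lists, and all names are assumptions.

```python
TOMBSTONE = object()  # marks a removed slot so later probes keep going (assumption)


def get_index(key: str, slot_count: int) -> int:
    """Additive hash wrapped into the array bounds (illustrative only)."""
    return sum(ord(ch) for ch in key) % slot_count


def remove_open_addressing(slots, key) -> bool:
    """Probe for the key; replace its slot with a tombstone if found."""
    start = get_index(key, len(slots))
    for step in range(len(slots)):
        i = (start + step) % len(slots)
        entry = slots[i]
        if entry is None:
            return False                      # never-used slot: key is absent
        if entry is not TOMBSTONE and entry[0] == key:
            slots[i] = TOMBSTONE
            return True
    return False


def remove_chaining(buckets, key) -> bool:
    """Delete the key from the list stored at its index."""
    bucket = buckets[get_index(key, len(buckets))]
    for n, (existing_key, _) in enumerate(bucket):
        if existing_key == key:
            del bucket[n]
            return True
    return False


slots = [None] * 8
slots[4], slots[5] = ("foo", 1), ("oof", 2)   # both keys hash to index 4
print(remove_open_addressing(slots, "foo"))   # True
print(remove_open_addressing(slots, "oof"))   # True: still reachable past the tombstone

buckets = [[] for _ in range(8)]
buckets[4] = [("foo", 1), ("oof", 2)]
print(remove_chaining(buckets, "foo"))        # True
```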
13. Enumerating Keys & Values
- Open Addressing
    foreach (var item in array)
    {
        if (item != null) yield return item;
    }
- Chaining
    foreach (var list in array)
    {
        if (list != null)
        {
            foreach (var item in list) { yield return item; }
        }
    }
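Enumeration has to produce every item, not stop at the first one, which generators express naturally. The Python sketch below mirrors the two loops above; the function names and sample tables are hypothetical, and Python lists stand in for linked lists.

```python
def items_open_addressing(slots):
    """Yield every occupied slot in array order."""
    for entry in slots:
        if entry is not None:
            yield entry


def items_chaining(buckets):
    """Yield every item from every non-empty list in array order."""
    for bucket in buckets:
        if bucket:               # skips None as well as empty lists
            yield from bucket


slots = [None, ("a", 1), None, ("b", 2)]
print(list(items_open_addressing(slots)))  # [('a', 1), ('b', 2)]

buckets = [None, [("a", 1), ("c", 3)], [], [("b", 2)]]
print(list(items_chaining(buckets)))  # [('a', 1), ('c', 3), ('b', 2)]
```

Items come back in slot order, not insertion order, so callers should not rely on any particular ordering.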