SlideShare ist ein Scribd-Unternehmen logo
1 von 72
Downloaden Sie, um offline zu lesen
Sets, maps and hash tables
Long way to Ruzzle…




2               Tecniche di programmazione   A.A. 2012/2013
Sets

Collection that cannot contain duplicate elements.
Collection Family Tree




4                 Tecniche di programmazione   A.A. 2012/2013
ArrayList vs. LinkedList
                       ArrayList                        LinkedList
add(element)          IMMEDIATE                       IMMEDIATE
remove(object)        SLUGGISH                        IMMEDIATE
get(index)            IMMEDIATE                        SLUGGISH
set(index, element)   IMMEDIATE                        SLUGGISH
add(index, element)   SLUGGISH                         SLUGGISH
remove(index)         SLUGGISH                         SLUGGISH
contains(object)      SLUGGISH                         SLUGGISH
indexOf(object)       SLUGGISH                         SLUGGISH




 5                      Tecniche di programmazione   A.A. 2012/2013
Collection Family Tree




6                 Tecniche di programmazione   A.A. 2012/2013
Set interface
       Add/remove elements
           boolean add(element)
           boolean remove(object)
       Search
           boolean contains(object)
       No positional Access!




    7                                  Tecniche di programmazione   A.A. 2012/2013
Lists vs. Sets
                    ArrayList              LinkedList                     Set
add(element)          O(1)                    O(1)                       O(1)
remove(object)     O(n) + O(n)            O(n) + O(1)                    O(1)
get(index)            O(1)                    O(n)                        n.a.
set(index, elem)      O(1)                O(n) + O(1)                     n.a.
add(index, elem)   O(1) + O(n)            O(n) + O(1)                     n.a.
remove(index)         O(n)                O(n) + O(1)                     n.a.
contains(object)      O(n)                    O(n)                       O(1)
indexOf(object)       O(n)                    O(n)                        n.a.




 8                               Tecniche di programmazione   A.A. 2012/2013
Hash Tables

A data structure implementing an associative array
Notation
    A set stores keys
    U – Universe of all possible keys
    K – Set of keys actually stored

                      U
                                       K




    10                         Tecniche di programmazione   A.A. 2012/2013
Hash Table
    Devise a function to transform each key into an index
    Use an array



           K                                T[0..m]
                      k
                                 h(∙)
                                                      h(k)




    11                        Tecniche di programmazione   A.A. 2012/2013
Hash Function
    Mapping from U to the slots of a hash table T[0…m–1]
                       h : U  {0,1,…, m–1}
    h(k) is the “hash value” of key k
    “Any key should be equally likely to hash into any of the
     m slots, independent of where any other key
     hashes to” (Simple uniform hashing)
    Compression/expansion
        h N : U  N+
             h(k) = hN(k) mod m
        hR : U  [0, 1[  R
            h(k) =  hR(k) ∙ m 

    12                             Tecniche di programmazione   A.A. 2012/2013
Hash Function - Complexity
    Usually, h(k) = O(length(k))
        length(k) « N  h(k) = O(1)




    13                            Tecniche di programmazione   A.A. 2012/2013
A simple hash function
    h : A  N+  [0, m-1]
    Split the key into its “component”, then sum their integer
     representation                  𝑛

    .hN 𝑘 = hN (𝑥0 𝑥1 𝑥2 … 𝑥 𝑛 ) =          𝑥𝑖
                                      𝑖=0

    h(k) = hN(k) % m




    14                        Tecniche di programmazione   A.A. 2012/2013
A simple hash (problems)
    Problems
        hN(“NOTE”) = 78+79+84+69 = 310
        hN(“TONE”) = 310
        hN(“STOP”) = 83+84+79+80 = 326
        hN(“SPOT”) = 326


    Problems (m = 173)
        h(74,778) = 42
        h(16,823) = 42
        h(1,611,883) = 42


    15                        Tecniche di programmazione   A.A. 2012/2013
Collisions

                                                                        0
                  U
        (universe of keys)                                               h(k1)

                                                                         h(k4)
        K    k1        k4
     (actual k2                          collision
                                                                         h(k2)=h(k5)
      keys)                 k5
                  k3

                                                                         h(k3)


                                                                        m–1

16                               Tecniche di programmazione   A.A. 2012/2013
Collisions
    Collisions are possible!
    Multiple keys can hash to the same slot
        Design hash functions such that collisions are minimized
    But avoiding collisions is impossible.
        Design collision-resolution techniques
    Search will cost Ө(n) time in the worst case
    However, all operations can be made to have an expected
     complexity of Ө(1).




    17                             Tecniche di programmazione   A.A. 2012/2013
Hash functions
    Simple uniform hashing
    Hash value should be independent of any patterns that
     might exist in the data
    No funneling




    18                       Tecniche di programmazione   A.A. 2012/2013
Natural numbers
    An hash function may assume that the keys are natural
     numbers
    When they are not, have to “interpret” them as natural
     numbers




    19                        Tecniche di programmazione   A.A. 2012/2013
Natural numbers hashing
    Division Method (compression)
                        h(k) = k mod m
    Pros
        Fast, since requires just one division operation
    Cons
        Have to avoid certain values of m
    Good choice for m (recipe)
        Prime
        Not “too close” to powers of 2
        Not “too close” to powers of 10



    20                               Tecniche di programmazione   A.A. 2012/2013
Natural numbers hashing
    Multiplication Method I
                  hR(k) = < k ∙ A > = (k ∙ A – k ∙ A)
                          h(k) =  m ∙ hR(k) 
    Pros
        Value of m is not critical (typically m=2p)
    Cons
        Value of A is critical
    Good choice for A (Donald Knuth)
                    5−1
        A==
                     2



    21                               Tecniche di programmazione   A.A. 2012/2013
Natural numbers hashing
    Multiplication Method II
                          h(k) = k ∙ 2,654,435,761
    Pros
        Works well for addresses
    Caveat (Donald Knuth)
        2,654,435,761 = 232 ∙ 




    22                              Tecniche di programmazione   A.A. 2012/2013
Resolution of collisions
    Open Addressing
        When collisions occur, use a systematic
         (consistent) procedure to store elements in free
         slots of the table
        “Double hashing”, “linear probing”, …


    Chaining                                                                     0
                                                                                      k1   k4
        Store all elements that hash to the same slot in a
         linked list                                                                  k5   k2   k6

                                                                                      k7   k3
                                                                                      k8

                                                                                  m–1




    23                              Tecniche di programmazione   A.A. 2012/2013
Chaining

                                                           0
                         U
        (universe of keys)                                                h(k1)=h(k4)
                                                      X
                    k1
                               k4
        K
     (actual        k2                  k6            X
                              k5                                          h(k2)=h(k5)=h(k6)
      keys)
               k8                  k7
                         k3
                                                     X                    h(k3)=h(k7)
                                                                          h(k8)
                                                        m–1

24                                           Tecniche di programmazione   A.A. 2012/2013
Chaining

                                                0
                     U
     (universe of keys)                                          k1            k4

                k1
                           k4
    K
 (actual        k2                  k6
  keys)
                          k5                                     k5            k2      k6
           k8                  k7
                     k3
                                                                 k7            k3

                                                                 k8
                                             m–1

25                                       Tecniche di programmazione   A.A. 2012/2013
Chaining (analysis)
    Load factor α = n/m = average keys per slot
        n – number of elements stored in the hash table
        m – number of slots




    26                            Tecniche di programmazione   A.A. 2012/2013
Chaining (analysis)
    Worst-case complexity:
                (n) ( + time to compute h(k) )




    27                       Tecniche di programmazione   A.A. 2012/2013
Chaining (analysis)
    Average depends on how h(∙) distributes keys among m
     slots
    Let assume
        Any key is equally likely to hash into any of the m slots
        h(k) = O(1)
    Expected length of a linked list = load factor = α = n/m
    Search(x) = O(α) + O(1) ≈ O(1)




    28                              Tecniche di programmazione   A.A. 2012/2013
A note on iterators
    Collection extends Iterable
    An Iterator is an object that enables you to traverse
     through a collection (and to remove elements from the
     collection selectively)
    You get an Iterator for a collection by calling its iterator()
     method
public interface Iterator<E> {
    boolean hasNext();
    E next();
    void remove(); //optional
}




    29                          Tecniche di programmazione   A.A. 2012/2013
Collection Family Tree




30                Tecniche di programmazione   A.A. 2012/2013
HashSet
    Add/remove elements
        boolean add(element)
        boolean remove(object)
    Search
        boolean contains(object)
    No positional Access
    Unpredictable iteration order!




    31                              Tecniche di programmazione   A.A. 2012/2013
Collection Family Tree




32                Tecniche di programmazione   A.A. 2012/2013
LinkedHashSet
    Add/remove elements
        boolean add(element)
        boolean remove(object)
    Search
        boolean contains(object)
    No positional Access
    Predictable iteration order




    33                              Tecniche di programmazione   A.A. 2012/2013
Costructors
    public HashSet()
    public HashSet(Collection<? extends E> c)
    HashSet(int initialCapacity)
    HashSet(int initialCapacity, float loadFactor)




    34                          Tecniche di programmazione   A.A. 2012/2013
Costructors
    public HashSet()
    public HashSet(Collection<? extends E> c)
    HashSet(int initialCapacity)
    HashSet(int initialCapacity, float loadFactor)
                                               75%




    35                          Tecniche di programmazione   A.A. 2012/2013
JCF’s HashSet
    Built-in hash function
    Dynamic hash table resize
    Smoothly handles collisions (chaining)
    (1) operations (well, usually)
    Take it easy!




    36                        Tecniche di programmazione   A.A. 2012/2013
Default hash function in Java
    In Java every class must provide a hashCode() method
     which digests the data stored in an instance of the class
     into a single 32-bit value
    In Java 1.2, Joshua Bloch implemented the java.lang.String
     hashCode() using a product sum over the entire text of
     the string
                 𝑛−1
     ℎ 𝑠 = 𝑖=0 𝑠[𝑖] ∙ 31 𝑛−1−𝑖
    But the basic Object’s hashCode() is
     implemented by converting the
     internal address of the object
     into an integer!

    37                         Tecniche di programmazione   A.A. 2012/2013
Understanding hash in Java
public class MyData {
   public String name;
   public String surname;
   int age;
}




 38                         Tecniche di programmazione   A.A. 2012/2013
Understanding hash in Java
MyData foo = new MyData();
MyData bar = new MyData();

if(foo.hashCode() == bar.hashCode()) {
   System.out.println("FLICK");
} else {
   System.out.println("FLOCK");
}




 39                          Tecniche di programmazione   A.A. 2012/2013
Understanding hash in Java
MyData foo = new MyData();
MyData bar = new MyData();

foo.name = "Stephane";
foo.surname = "Hessel";
foo.age = 95;
bar.name = "Stephane";
bar.surname = "Hessel";
bar.age = 95;

if(foo.hashCode() == bar.hashCode()) {
   System.out.println("FLICK");
} else {
   System.out.println("FLOCK");
}




 40                          Tecniche di programmazione   A.A. 2012/2013
Default hash function in Java
public boolean equals(Object obj);
public int hashCode();

    If two objects are equal according to the equals()
     method, then hashCode() must produce the same result
    If two objects are not equal according to the equals()
     method, performances are better whether the
     hashCode() produces different results




    41                       Tecniche di programmazione   A.A. 2012/2013
Hash functions in Java
public boolean equals(Object obj);
public int hashCode();



        hashCode() and equals() should
        always be defined together




 42                       Tecniche di programmazione   A.A. 2012/2013
public class MyData
public String name;
public String surname;
int age;

public MyData() { }

public MyData(String n, String s, int a) {
   name = n;
   surname = s;
   age = a;
}

[…]




 43                       Tecniche di programmazione   A.A. 2012/2013
public class MyData
@Override
public boolean equals(Object obj) {
   if (obj == this) {
      return true; // quite obvious ;-)
   }
   if (obj == null || obj instanceof MyData == false) {
      return false; // not even comparable
   }
   // the real check!
   if(name.equals(((MyData)obj).name) == false ||
        surname.equals(((MyData)obj).surname) == false) {
      return false;
   }
   return true;
}

[…]

 44                       Tecniche di programmazione   A.A. 2012/2013
public class MyData
@Override
public boolean equals(Object obj) {
   if (obj == this) {
       return true; // quite obvious ;-)
   }
   if (obj == null || obj instanceof MyData == false) {
     The annotation @Override signals the compiler
       return false; // not even comparable
   } that overriding is expected,
   // the real has to fail if an override does not occur
     and that it check!
   if(name.equals(((MyData)obj).name) == false ||
         surname.equals(((MyData)obj).surname) == false) {
       return false;
   }
   return true;
}

[…]

 45                        Tecniche di programmazione   A.A. 2012/2013
public class MyData
@Override
public int hashCode() {
   String tmp = name+":"+surname;
   return tmp.hashCode();
}


                 tmp will be “null:null” if MyData has
                 not been initialized




 46                         Tecniche di programmazione   A.A. 2012/2013
Implementing your own hash functions
    Grab your hash function from a professional




    47                       Tecniche di programmazione   A.A. 2012/2013
Trivial Hash Function
This hash function helps creating predictable collisions (e.g., “ape” and “pea”)

public long TrivialHash(String str)
{
   long hash = 0;

         for(int i = 0; i < str.length(); i++)
         {
            hash = hash + str.charAt(i);
         }

         return hash;
}




    48                               Tecniche di programmazione   A.A. 2012/2013
BKDR Hash Function
This hash function comes from Brian Kernighan and Dennis Ritchie's book
"The C Programming Language". It is a simple hash function using a strange set
of possible seeds which all constitute a pattern of 31....31...31 etc, it seems to
be very similar to the DJB hash function.
public long BKDRHash(String str)
{
   long seed = 131; // 31 131 1313 13131 131313 etc..
   long hash = 0;

         for(int i = 0; i < str.length(); i++)
         {
            hash = (hash * seed) + str.charAt(i);
         }

         return hash;
}


    49                               Tecniche di programmazione   A.A. 2012/2013
RS Hash Function
A simple hash function from Robert Sedgwicks Algorithms in C book
public long RSHash(String str)
{
   int b     = 378551;
   int a     = 63689;
   long hash = 0;

         for(int i = 0; i < str.length(); i++)
         {
            hash = hash * a + str.charAt(i);
            a    = a * b;
         }

         return hash;
}



    50                           Tecniche di programmazione   A.A. 2012/2013
DJB Hash Function
An algorithm produced by Professor Daniel J. Bernstein and shown first to the
world on the usenet newsgroup comp.lang.c. It is one of the most efficient
hash functions ever published
public long DJBHash(String str)
{
   long hash = 5381;

         for(int i = 0; i < str.length(); i++)
         {
            hash = hash * 33 + str.charAt(i);
         }

         return hash;
}




    51                             Tecniche di programmazione   A.A. 2012/2013
JS Hash Function
A bitwise hash function written by Justin Sobel
public long JSHash(String str)
{
   long hash = 1315423911;

         for(int i = 0; i < str.length(); i++)
         {
            hash ^= ((hash << 5) + str.charAt(i) + (hash >> 2));
         }

         return hash;
}




    52                              Tecniche di programmazione   A.A. 2012/2013
SDBM Hash Function
This is the algorithm of choice which is used in the open source SDBM
project. The hash function seems to have a good over-all distribution for many
different data sets. It seems to work well in situations where there is a high
variance in the MSBs of the elements in a data set.
public long SDBMHash(String str)
{
   long hash = 0;

         for(int i = 0; i < str.length(); i++)
         {
            hash = str.charAt(i) + (hash << 6) +
               (hash << 16) - hash;
         }

         return hash;
}

    53                              Tecniche di programmazione   A.A. 2012/2013
DEK Hash Function
An algorithm proposed by Donald E. Knuth in The Art Of Computer
Programming (Volume 3), under the topic of sorting and search chapter 6.4.
public long DEKHash(String str)
{
   long hash = str.length();

         for(int i = 0; i < str.length(); i++)
         {
            hash = ((hash << 5) ^ (hash >> 27)) ^ str.charAt(i);
         }

         return hash;
}




    54                             Tecniche di programmazione   A.A. 2012/2013
DJB Hash Function
The algorithm by Professor Daniel J. Bernstein (alternative take)

public long DJBHash(String str)
{
   long hash = 5381;

         for(int i = 0; i < str.length(); i++)
         {
            hash = ((hash << 5) + hash) ^ str.charAt(i);
         }

         return hash;
}




    55                              Tecniche di programmazione   A.A. 2012/2013
PJW Hash Function
This hash algorithm is based on work by Peter J. Weinberger of AT&T Bell
Labs. The book Compilers (Principles, Techniques and Tools) by Aho, Sethi and
Ulman, recommends the use of hash functions that employ the hashing
methodology found in this particular algorithm
public long PJWHash(String str)
{
   long BitsInUnsigned = (long)(4 * 8);
   long ThreeQuarters = (long)((BitsInUnsigned *3) /4);
   long OneEighth      = (long)(BitsInUnsigned / 8);
   long HighBits       = (long)(0xFFFFFFFF) <<
                           (BitsInUnsigned - OneEighth);
   long hash           = 0;
   long test           = 0;

[…]



 56                                  Tecniche di programmazione   A.A. 2012/2013
PJW Hash Function
[…]

     for(int i = 0; i < str.length(); i++)
      {
         hash = (hash << OneEighth) + str.charAt(i);

             if((test = hash & HighBits) != 0)
             {
                hash = (( hash ^ (test >> ThreeQuarters)) &
                             (~HighBits));
             }
         }

         return hash;
}




    57                           Tecniche di programmazione   A.A. 2012/2013
ELF Hash Function
Similar to the PJW Hash function, but tweaked for 32-bit processors. Its the
hash function widely used on most UNIX systems
public long ELFHash(String str)
{
   long hash = 0, x = 0;

         for(int i = 0; i < str.length(); i++)
         {
            hash = (hash << 4) + str.charAt(i);
            if((x = hash & 0xF0000000L) != 0)
               hash ^= (x >> 24);
            hash &= ~x;
         }
         return hash;
}



    58                              Tecniche di programmazione   A.A. 2012/2013
59   Tecniche di programmazione   A.A. 2012/2013
Maps

a.k.a, associative array, map, or dictionary
Definition
    In computer science, an associative array, map, or
     dictionary is an abstract data type composed of (key,
     value) pairs, such that each possible key appears at most
     once
    Native support in most modern languages (perl, python,
     ruby, go, …). E.g.,
V1[42] = “h2g2”
V2[“h2g2”] = 42

    Implemented through hash tables



    61                        Tecniche di programmazione   A.A. 2012/2013
Java Collection Framework




62               Tecniche di programmazione   A.A. 2012/2013
Collection Family Tree




63                Tecniche di programmazione   A.A. 2012/2013
Map interface
    Map<K,V>
        K: the type of keys maintained by this map
        V: the type of mapped values
    Add/remove elements
        value put(key, value)
        value remove(key)
    Search
        boolean containsKey(key)
        boolean containsValue(value)




    64                             Tecniche di programmazione   A.A. 2012/2013
Map interface
              containsValue() will probably
    Map<K,V>
              require time linear in the map size
      K: the type of keys maintained by this map
              for most implementations of the
      V: the type of mapped values
              Map interface – ie. it is O(N)
    Add/remove elements
        value put(key, value)
        value remove(key)
    Search
        boolean containsKey(key)
        boolean containsValue(value)




    65                           Tecniche di programmazione   A.A. 2012/2013
Map interface (cont)
    Nested Class
        Map.Entry<K,V>
        A map entry (key-value pair).
    Set<Map.Entry<K,V>> entrySet()
        Returns a Set view of the mappings contained in this map
    Set<K> keySet()
        Returns a Set view of the keys contained in this map
    Collection<V> values()
        Returns a Collection view of the values contained in this map




    66                             Tecniche di programmazione   A.A. 2012/2013
Collection Family Tree




67                Tecniche di programmazione   A.A. 2012/2013
LinkedHashMap
    Maintains a doubly-linked list running through all of its
     entries
    This linked list defines the iteration ordering (normally
     insertion-order)
    Insertion order is not affected if a key is re-inserted




    68                        Tecniche di programmazione   A.A. 2012/2013
Easter Wrap-up
Java Collection Framework




70               Tecniche di programmazione   A.A. 2012/2013
71   Tecniche di programmazione   A.A. 2012/2013
Licenza d’uso
    Queste diapositive sono distribuite con licenza Creative Commons
     “Attribuzione - Non commerciale - Condividi allo stesso modo (CC
     BY-NC-SA)”
    Sei libero:
        di riprodurre, distribuire, comunicare al pubblico, esporre in pubblico,
         rappresentare, eseguire e recitare quest'opera
        di modificare quest'opera
    Alle seguenti condizioni:
        Attribuzione — Devi attribuire la paternità dell'opera agli autori
         originali e in modo tale da non suggerire che essi avallino te o il modo in
         cui tu usi l'opera.
        Non commerciale — Non puoi usare quest'opera per fini
         commerciali.
        Condividi allo stesso modo — Se alteri o trasformi quest'opera, o se
         la usi per crearne un'altra, puoi distribuire l'opera risultante solo con una
         licenza identica o equivalente a questa.
    http://creativecommons.org/licenses/by-nc-sa/3.0/
    72                                   Tecniche di programmazione   A.A. 2012/2013

Weitere ähnliche Inhalte

Was ist angesagt?

Let's Practice What We Preach: Likelihood Methods for Monte Carlo Data
Let's Practice What We Preach: Likelihood Methods for Monte Carlo DataLet's Practice What We Preach: Likelihood Methods for Monte Carlo Data
Let's Practice What We Preach: Likelihood Methods for Monte Carlo DataChristian Robert
 
Reasoning about laziness
Reasoning about lazinessReasoning about laziness
Reasoning about lazinessJohan Tibell
 
Gremlin's Graph Traversal Machinery
Gremlin's Graph Traversal MachineryGremlin's Graph Traversal Machinery
Gremlin's Graph Traversal MachineryMarko Rodriguez
 
(chapter 9) A Concise and Practical Introduction to Programming Algorithms in...
(chapter 9) A Concise and Practical Introduction to Programming Algorithms in...(chapter 9) A Concise and Practical Introduction to Programming Algorithms in...
(chapter 9) A Concise and Practical Introduction to Programming Algorithms in...Frank Nielsen
 
Java puzzlers scraping_the_bottom_of_the_barrel
Java puzzlers scraping_the_bottom_of_the_barrelJava puzzlers scraping_the_bottom_of_the_barrel
Java puzzlers scraping_the_bottom_of_the_barrelBharani Arul
 
Java8 stream
Java8 streamJava8 stream
Java8 streamkoji lin
 
The Functional Programming Triad of Folding, Scanning and Iteration - a first...
The Functional Programming Triad of Folding, Scanning and Iteration - a first...The Functional Programming Triad of Folding, Scanning and Iteration - a first...
The Functional Programming Triad of Folding, Scanning and Iteration - a first...Philip Schwarz
 
5.2 primitive recursive functions
5.2 primitive recursive functions5.2 primitive recursive functions
5.2 primitive recursive functionsSampath Kumar S
 
Algebra(03)_160311_02
Algebra(03)_160311_02Algebra(03)_160311_02
Algebra(03)_160311_02Art Traynor
 
A Review of Proximal Methods, with a New One
A Review of Proximal Methods, with a New OneA Review of Proximal Methods, with a New One
A Review of Proximal Methods, with a New OneGabriel Peyré
 
Conditional neural processes
Conditional neural processesConditional neural processes
Conditional neural processesKazuki Fujikawa
 
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationAccelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationFeynman Liang
 
Data Exchange over RDF
Data Exchange over RDFData Exchange over RDF
Data Exchange over RDFnet2-project
 

Was ist angesagt? (20)

Let's Practice What We Preach: Likelihood Methods for Monte Carlo Data
Let's Practice What We Preach: Likelihood Methods for Monte Carlo DataLet's Practice What We Preach: Likelihood Methods for Monte Carlo Data
Let's Practice What We Preach: Likelihood Methods for Monte Carlo Data
 
Reasoning about laziness
Reasoning about lazinessReasoning about laziness
Reasoning about laziness
 
Gremlin's Graph Traversal Machinery
Gremlin's Graph Traversal MachineryGremlin's Graph Traversal Machinery
Gremlin's Graph Traversal Machinery
 
(chapter 9) A Concise and Practical Introduction to Programming Algorithms in...
(chapter 9) A Concise and Practical Introduction to Programming Algorithms in...(chapter 9) A Concise and Practical Introduction to Programming Algorithms in...
(chapter 9) A Concise and Practical Introduction to Programming Algorithms in...
 
Java puzzlers scraping_the_bottom_of_the_barrel
Java puzzlers scraping_the_bottom_of_the_barrelJava puzzlers scraping_the_bottom_of_the_barrel
Java puzzlers scraping_the_bottom_of_the_barrel
 
Scilab
Scilab Scilab
Scilab
 
Java8 stream
Java8 streamJava8 stream
Java8 stream
 
The Functional Programming Triad of Folding, Scanning and Iteration - a first...
The Functional Programming Triad of Folding, Scanning and Iteration - a first...The Functional Programming Triad of Folding, Scanning and Iteration - a first...
The Functional Programming Triad of Folding, Scanning and Iteration - a first...
 
5.2 primitive recursive functions
5.2 primitive recursive functions5.2 primitive recursive functions
5.2 primitive recursive functions
 
Algebra(03)_160311_02
Algebra(03)_160311_02Algebra(03)_160311_02
Algebra(03)_160311_02
 
A Review of Proximal Methods, with a New One
A Review of Proximal Methods, with a New OneA Review of Proximal Methods, with a New One
A Review of Proximal Methods, with a New One
 
Maps&hash tables
Maps&hash tablesMaps&hash tables
Maps&hash tables
 
Conditional neural processes
Conditional neural processesConditional neural processes
Conditional neural processes
 
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationAccelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference Compilation
 
04. haskell handling
04. haskell handling04. haskell handling
04. haskell handling
 
MUMS Undergraduate Workshop - Parameter Selection and Model Calibration for a...
MUMS Undergraduate Workshop - Parameter Selection and Model Calibration for a...MUMS Undergraduate Workshop - Parameter Selection and Model Calibration for a...
MUMS Undergraduate Workshop - Parameter Selection and Model Calibration for a...
 
Advance java
Advance javaAdvance java
Advance java
 
Algorithms - "quicksort"
Algorithms - "quicksort"Algorithms - "quicksort"
Algorithms - "quicksort"
 
Wi13 otm
Wi13 otmWi13 otm
Wi13 otm
 
Data Exchange over RDF
Data Exchange over RDFData Exchange over RDF
Data Exchange over RDF
 

Ähnlich wie Sets, maps and hash tables (Java Collections)

Algorithm chapter 7
Algorithm chapter 7Algorithm chapter 7
Algorithm chapter 7chidabdu
 
18. Dictionaries, Hash-Tables and Set
18. Dictionaries, Hash-Tables and Set18. Dictionaries, Hash-Tables and Set
18. Dictionaries, Hash-Tables and SetIntro C# Book
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data ManagementAlbert Bifet
 
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)Matthew Lease
 
Learn basics of Clojure/script and Reagent
Learn basics of Clojure/script and ReagentLearn basics of Clojure/script and Reagent
Learn basics of Clojure/script and ReagentMaty Fedak
 
Monads and Monoids by Oleksiy Dyagilev
Monads and Monoids by Oleksiy DyagilevMonads and Monoids by Oleksiy Dyagilev
Monads and Monoids by Oleksiy DyagilevJavaDayUA
 
Good functional programming is good programming
Good functional programming is good programmingGood functional programming is good programming
Good functional programming is good programmingkenbot
 
11 - Programming languages
11 - Programming languages11 - Programming languages
11 - Programming languagesTudor Girba
 

Ähnlich wie Sets, maps and hash tables (Java Collections) (20)

03.01 hash tables
03.01 hash tables03.01 hash tables
03.01 hash tables
 
Algorithm chapter 7
Algorithm chapter 7Algorithm chapter 7
Algorithm chapter 7
 
Randamization.pdf
Randamization.pdfRandamization.pdf
Randamization.pdf
 
Hashing
HashingHashing
Hashing
 
Lec5
Lec5Lec5
Lec5
 
R Language Introduction
R Language IntroductionR Language Introduction
R Language Introduction
 
13-hashing.ppt
13-hashing.ppt13-hashing.ppt
13-hashing.ppt
 
Algorithm Assignment Help
Algorithm Assignment HelpAlgorithm Assignment Help
Algorithm Assignment Help
 
18. Dictionaries, Hash-Tables and Set
18. Dictionaries, Hash-Tables and Set18. Dictionaries, Hash-Tables and Set
18. Dictionaries, Hash-Tables and Set
 
Programming Hp33s talk v3
Programming Hp33s talk v3Programming Hp33s talk v3
Programming Hp33s talk v3
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data Management
 
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
Lecture 2: Data-Intensive Computing for Text Analysis (Fall 2011)
 
Learn basics of Clojure/script and Reagent
Learn basics of Clojure/script and ReagentLearn basics of Clojure/script and Reagent
Learn basics of Clojure/script and Reagent
 
Algorithms Design Homework Help
Algorithms Design Homework HelpAlgorithms Design Homework Help
Algorithms Design Homework Help
 
Lec5
Lec5Lec5
Lec5
 
String-based Multi-adjoint Lattices for Tracing Fuzzy Logic Computations
String-based Multi-adjoint Lattices for Tracing Fuzzy Logic ComputationsString-based Multi-adjoint Lattices for Tracing Fuzzy Logic Computations
String-based Multi-adjoint Lattices for Tracing Fuzzy Logic Computations
 
Monads and Monoids by Oleksiy Dyagilev
Monads and Monoids by Oleksiy DyagilevMonads and Monoids by Oleksiy Dyagilev
Monads and Monoids by Oleksiy Dyagilev
 
Good functional programming is good programming
Good functional programming is good programmingGood functional programming is good programming
Good functional programming is good programming
 
11 - Programming languages
11 - Programming languages11 - Programming languages
11 - Programming languages
 
Hashing PPT
Hashing PPTHashing PPT
Hashing PPT
 

Kürzlich hochgeladen

Diploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfDiploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfMohonDas
 
Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.EnglishCEIPdeSigeiro
 
Clinical Pharmacy Introduction to Clinical Pharmacy, Concept of clinical pptx
Clinical Pharmacy  Introduction to Clinical Pharmacy, Concept of clinical pptxClinical Pharmacy  Introduction to Clinical Pharmacy, Concept of clinical pptx
Clinical Pharmacy Introduction to Clinical Pharmacy, Concept of clinical pptxraviapr7
 
How to Add a many2many Relational Field in Odoo 17
How to Add a many2many Relational Field in Odoo 17How to Add a many2many Relational Field in Odoo 17
How to Add a many2many Relational Field in Odoo 17Celine George
 
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptxSandy Millin
 
The basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxThe basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxheathfieldcps1
 
UKCGE Parental Leave Discussion March 2024
UKCGE Parental Leave Discussion March 2024UKCGE Parental Leave Discussion March 2024
UKCGE Parental Leave Discussion March 2024UKCGE
 
The Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George WellsThe Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George WellsEugene Lysak
 
Presentation on the Basics of Writing. Writing a Paragraph
Presentation on the Basics of Writing. Writing a ParagraphPresentation on the Basics of Writing. Writing a Paragraph
Presentation on the Basics of Writing. Writing a ParagraphNetziValdelomar1
 
AUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptxAUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptxiammrhaywood
 
Philosophy of Education and Educational Philosophy
Philosophy of Education  and Educational PhilosophyPhilosophy of Education  and Educational Philosophy
Philosophy of Education and Educational PhilosophyShuvankar Madhu
 
5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...CaraSkikne1
 
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...Nguyen Thanh Tu Collection
 
Prescribed medication order and communication skills.pptx
Prescribed medication order and communication skills.pptxPrescribed medication order and communication skills.pptx
Prescribed medication order and communication skills.pptxraviapr7
 
CapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptxCapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptxCapitolTechU
 
Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...raviapr7
 
How to Show Error_Warning Messages in Odoo 17
How to Show Error_Warning Messages in Odoo 17How to Show Error_Warning Messages in Odoo 17
How to Show Error_Warning Messages in Odoo 17Celine George
 
Practical Research 1: Lesson 8 Writing the Thesis Statement.pptx
Practical Research 1: Lesson 8 Writing the Thesis Statement.pptxPractical Research 1: Lesson 8 Writing the Thesis Statement.pptx
Practical Research 1: Lesson 8 Writing the Thesis Statement.pptxKatherine Villaluna
 
How to Manage Cross-Selling in Odoo 17 Sales
How to Manage Cross-Selling in Odoo 17 SalesHow to Manage Cross-Selling in Odoo 17 Sales
How to Manage Cross-Selling in Odoo 17 SalesCeline George
 
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdfP4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdfYu Kanazawa / Osaka University
 

Kürzlich hochgeladen (20)

Diploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfDiploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdf
 
Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.
 
Clinical Pharmacy Introduction to Clinical Pharmacy, Concept of clinical pptx
Clinical Pharmacy  Introduction to Clinical Pharmacy, Concept of clinical pptxClinical Pharmacy  Introduction to Clinical Pharmacy, Concept of clinical pptx
Clinical Pharmacy Introduction to Clinical Pharmacy, Concept of clinical pptx
 
How to Add a many2many Relational Field in Odoo 17
How to Add a many2many Relational Field in Odoo 17How to Add a many2many Relational Field in Odoo 17
How to Add a many2many Relational Field in Odoo 17
 
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
 
The basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxThe basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptx
 
UKCGE Parental Leave Discussion March 2024
UKCGE Parental Leave Discussion March 2024UKCGE Parental Leave Discussion March 2024
UKCGE Parental Leave Discussion March 2024
 
The Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George WellsThe Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George Wells
 
Presentation on the Basics of Writing. Writing a Paragraph
Presentation on the Basics of Writing. Writing a ParagraphPresentation on the Basics of Writing. Writing a Paragraph
Presentation on the Basics of Writing. Writing a Paragraph
 
AUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptxAUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptx
 
Philosophy of Education and Educational Philosophy
Philosophy of Education  and Educational PhilosophyPhilosophy of Education  and Educational Philosophy
Philosophy of Education and Educational Philosophy
 
5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...
 
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
 
Prescribed medication order and communication skills.pptx
Prescribed medication order and communication skills.pptxPrescribed medication order and communication skills.pptx
Prescribed medication order and communication skills.pptx
 
CapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptxCapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptx
 
Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...
 
How to Show Error_Warning Messages in Odoo 17
How to Show Error_Warning Messages in Odoo 17How to Show Error_Warning Messages in Odoo 17
How to Show Error_Warning Messages in Odoo 17
 
Practical Research 1: Lesson 8 Writing the Thesis Statement.pptx
Practical Research 1: Lesson 8 Writing the Thesis Statement.pptxPractical Research 1: Lesson 8 Writing the Thesis Statement.pptx
Practical Research 1: Lesson 8 Writing the Thesis Statement.pptx
 
How to Manage Cross-Selling in Odoo 17 Sales
How to Manage Cross-Selling in Odoo 17 SalesHow to Manage Cross-Selling in Odoo 17 Sales
How to Manage Cross-Selling in Odoo 17 Sales
 
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdfP4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
 

Sets, maps and hash tables (Java Collections)

  • 1. Sets, maps and hash tables
  • 2. Long way to Ruzzle… 2 Tecniche di programmazione A.A. 2012/2013
  • 3. Sets Collection that cannot contain duplicate elements.
  • 4. Collection Family Tree 4 Tecniche di programmazione A.A. 2012/2013
  • 5. ArrayList vs. LinkedList ArrayList LinkedList add(element) IMMEDIATE IMMEDIATE remove(object) SLUGGISH IMMEDIATE get(index) IMMEDIATE SLUGGISH set(index, element) IMMEDIATE SLUGGISH add(index, element) SLUGGISH SLUGGISH remove(index) SLUGGISH SLUGGISH contains(object) SLUGGISH SLUGGISH indexOf(object) SLUGGISH SLUGGISH 5 Tecniche di programmazione A.A. 2012/2013
  • 6. Collection Family Tree 6 Tecniche di programmazione A.A. 2012/2013
  • 7. Set interface  Add/remove elements  boolean add(element)  boolean remove(object)  Search  boolean contains(object)  No positional Access! 7 Tecniche di programmazione A.A. 2012/2013
  • 8. Lists vs. Sets ArrayList LinkedList Set add(element) O(1) O(1) O(1) remove(object) O(n) + O(n) O(n) + O(1) O(1) get(index) O(1) O(n) n.a. set(index, elem) O(1) O(n) + O(1) n.a. add(index, elem) O(1) + O(n) O(n) + O(1) n.a. remove(index) O(n) O(n) + O(1) n.a. contains(object) O(n) O(n) O(1) indexOf(object) O(n) O(n) n.a. 8 Tecniche di programmazione A.A. 2012/2013
  • 9. Hash Tables A data structure implementing an associative array
  • 10. Notation  A set stores keys  U – Universe of all possible keys  K – Set of keys actually stored U K 10 Tecniche di programmazione A.A. 2012/2013
  • 11. Hash Table  Devise a function to transform each key into an index  Use an array K T[0..m] k h(∙) h(k) 11 Tecniche di programmazione A.A. 2012/2013
  • 12. Hash Function  Mapping from U to the slots of a hash table T[0…m–1] h : U  {0,1,…, m–1}  h(k) is the “hash value” of key k  “Any key should be equally likely to hash into any of the m slots, independent of where any other key hashes to” (Simple uniform hashing)  Compression/expansion  h N : U  N+ h(k) = hN(k) mod m  hR : U  [0, 1[  R h(k) =  hR(k) ∙ m  12 Tecniche di programmazione A.A. 2012/2013
  • 13. Hash Function - Complexity  Usually, h(k) = O(length(k))  length(k) « N  h(k) = O(1) 13 Tecniche di programmazione A.A. 2012/2013
  • 14. A simple hash function  h : A  N+  [0, m-1]  Split the key into its “component”, then sum their integer representation 𝑛  .hN 𝑘 = hN (𝑥0 𝑥1 𝑥2 … 𝑥 𝑛 ) = 𝑥𝑖 𝑖=0  h(k) = hN(k) % m 14 Tecniche di programmazione A.A. 2012/2013
  • 15. A simple hash (problems)  Problems  hN(“NOTE”) = 78+79+84+69 = 310  hN(“TONE”) = 310  hN(“STOP”) = 83+84+79+80 = 326  hN(“SPOT”) = 326  Problems (m = 173)  h(74,778) = 42  h(16,823) = 42  h(1,611,883) = 42 15 Tecniche di programmazione A.A. 2012/2013
  • 16. Collisions 0 U (universe of keys) h(k1) h(k4) K k1 k4 (actual k2 collision h(k2)=h(k5) keys) k5 k3 h(k3) m–1 16 Tecniche di programmazione A.A. 2012/2013
  • 17. Collisions  Collisions are possible!  Multiple keys can hash to the same slot  Design hash functions such that collisions are minimized  But avoiding collisions is impossible.  Design collision-resolution techniques  Search will cost Ө(n) time in the worst case  However, all operations can be made to have an expected complexity of Ө(1). 17 Tecniche di programmazione A.A. 2012/2013
  • 18. Hash functions  Simple uniform hashing  Hash value should be independent of any patterns that might exist in the data  No funneling 18 Tecniche di programmazione A.A. 2012/2013
  • 19. Natural numbers  An hash function may assume that the keys are natural numbers  When they are not, have to “interpret” them as natural numbers 19 Tecniche di programmazione A.A. 2012/2013
  • 20. Natural numbers hashing  Division Method (compression) h(k) = k mod m  Pros  Fast, since requires just one division operation  Cons  Have to avoid certain values of m  Good choice for m (recipe)  Prime  Not “too close” to powers of 2  Not “too close” to powers of 10 20 Tecniche di programmazione A.A. 2012/2013
  • 21. Natural numbers hashing  Multiplication Method I hR(k) = < k ∙ A > = (k ∙ A – k ∙ A) h(k) =  m ∙ hR(k)   Pros  Value of m is not critical (typically m=2p)  Cons  Value of A is critical  Good choice for A (Donald Knuth) 5−1  A== 2 21 Tecniche di programmazione A.A. 2012/2013
  • 22. Natural numbers hashing  Multiplication Method II h(k) = k ∙ 2,654,435,761  Pros  Works well for addresses  Caveat (Donald Knuth)  2,654,435,761 = 232 ∙  22 Tecniche di programmazione A.A. 2012/2013
  • 23. Resolution of collisions  Open Addressing  When collisions occur, use a systematic (consistent) procedure to store elements in free slots of the table  “Double hashing”, “linear probing”, …  Chaining 0 k1 k4  Store all elements that hash to the same slot in a linked list k5 k2 k6 k7 k3 k8 m–1 23 Tecniche di programmazione A.A. 2012/2013
  • 24. Chaining 0 U (universe of keys) h(k1)=h(k4) X k1 k4 K (actual k2 k6 X k5 h(k2)=h(k5)=h(k6) keys) k8 k7 k3 X h(k3)=h(k7) h(k8) m–1 24 Tecniche di programmazione A.A. 2012/2013
  • 25. Chaining 0 U (universe of keys) k1 k4 k1 k4 K (actual k2 k6 keys) k5 k5 k2 k6 k8 k7 k3 k7 k3 k8 m–1 25 Tecniche di programmazione A.A. 2012/2013
  • 26. Chaining (analysis)  Load factor α = n/m = average keys per slot  n – number of elements stored in the hash table  m – number of slots 26 Tecniche di programmazione A.A. 2012/2013
  • 27. Chaining (analysis)  Worst-case complexity: (n) ( + time to compute h(k) ) 27 Tecniche di programmazione A.A. 2012/2013
  • 28. Chaining (analysis)  Average depends on how h(∙) distributes keys among m slots  Let assume  Any key is equally likely to hash into any of the m slots  h(k) = O(1)  Expected length of a linked list = load factor = α = n/m  Search(x) = O(α) + O(1) ≈ O(1) 28 Tecniche di programmazione A.A. 2012/2013
  • 29. A note on iterators  Collection extends Iterable  An Iterator is an object that enables you to traverse through a collection (and to remove elements from the collection selectively)  You get an Iterator for a collection by calling its iterator() method public interface Iterator<E> { boolean hasNext(); E next(); void remove(); //optional } 29 Tecniche di programmazione A.A. 2012/2013
  • 30. Collection Family Tree 30 Tecniche di programmazione A.A. 2012/2013
  • 31. HashSet  Add/remove elements  boolean add(element)  boolean remove(object)  Search  boolean contains(object)  No positional Access  Unpredictable iteration order! 31 Tecniche di programmazione A.A. 2012/2013
  • 32. Collection Family Tree 32 Tecniche di programmazione A.A. 2012/2013
  • 33. LinkedHashSet  Add/remove elements  boolean add(element)  boolean remove(object)  Search  boolean contains(object)  No positional Access  Predictable iteration order 33 Tecniche di programmazione A.A. 2012/2013
  • 34. Costructors  public HashSet()  public HashSet(Collection<? extends E> c)  HashSet(int initialCapacity)  HashSet(int initialCapacity, float loadFactor) 34 Tecniche di programmazione A.A. 2012/2013
  • 35. Costructors  public HashSet()  public HashSet(Collection<? extends E> c)  HashSet(int initialCapacity)  HashSet(int initialCapacity, float loadFactor) 75% 35 Tecniche di programmazione A.A. 2012/2013
  • 36. JCF’s HashSet  Built-in hash function  Dynamic hash table resize  Smoothly handles collisions (chaining)  (1) operations (well, usually)  Take it easy! 36 Tecniche di programmazione A.A. 2012/2013
  • 37. Default hash function in Java  In Java every class must provide a hashCode() method which digests the data stored in an instance of the class into a single 32-bit value  In Java 1.2, Joshua Bloch implemented the java.lang.String hashCode() using a product sum over the entire text of the string 𝑛−1 ℎ 𝑠 = 𝑖=0 𝑠[𝑖] ∙ 31 𝑛−1−𝑖  But the basic Object’s hashCode() is implemented by converting the internal address of the object into an integer! 37 Tecniche di programmazione A.A. 2012/2013
  • 38. Understanding hash in Java public class MyData { public String name; public String surname; int age; } 38 Tecniche di programmazione A.A. 2012/2013
  • 39. Understanding hash in Java MyData foo = new MyData(); MyData bar = new MyData(); if(foo.hashCode() == bar.hashCode()) { System.out.println("FLICK"); } else { System.out.println("FLOCK"); } 39 Tecniche di programmazione A.A. 2012/2013
  • 40. Understanding hash in Java MyData foo = new MyData(); MyData bar = new MyData(); foo.name = "Stephane"; foo.surname = "Hessel"; foo.age = 95; bar.name = "Stephane"; bar.surname = "Hessel"; bar.age = 95; if(foo.hashCode() == bar.hashCode()) { System.out.println("FLICK"); } else { System.out.println("FLOCK"); } 40 Tecniche di programmazione A.A. 2012/2013
  • 41. Default hash function in Java public boolean equals(Object obj); public int hashCode();  If two objects are equal according to the equals() method, then hashCode() must produce the same result  If two objects are not equal according to the equals() method, performances are better whether the hashCode() produces different results 41 Tecniche di programmazione A.A. 2012/2013
  • 42. Hash functions in Java public boolean equals(Object obj); public int hashCode(); hashCode() and equals() should always be defined together 42 Tecniche di programmazione A.A. 2012/2013
  • 43. public class MyData public String name; public String surname; int age; public MyData() { } public MyData(String n, String s, int a) { name = n; surname = s; age = a; } […] 43 Tecniche di programmazione A.A. 2012/2013
  • 44. public class MyData @Override public boolean equals(Object obj) { if (obj == this) { return true; // quite obvious ;-) } if (obj == null || obj instanceof MyData == false) { return false; // not even comparable } // the real check! if(name.equals(((MyData)obj).name) == false || surname.equals(((MyData)obj).surname) == false) { return false; } return true; } […] 44 Tecniche di programmazione A.A. 2012/2013
  • 45. public class MyData @Override public boolean equals(Object obj) { if (obj == this) { return true; // quite obvious ;-) } if (obj == null || obj instanceof MyData == false) { The annotation @Override signals the compiler return false; // not even comparable } that overriding is expected, // the real has to fail if an override does not occur and that it check! if(name.equals(((MyData)obj).name) == false || surname.equals(((MyData)obj).surname) == false) { return false; } return true; } […] 45 Tecniche di programmazione A.A. 2012/2013
  • 46. public class MyData @Override public int hashCode() { String tmp = name+":"+surname; return tmp.hashCode(); } tmp will be “null:null” if MyData has not been initialized 46 Tecniche di programmazione A.A. 2012/2013
  • 47. Implementing your own hash functions  Grab your hash function from a professional 47 Tecniche di programmazione A.A. 2012/2013
  • 48. Trivial Hash Function This hash function helps creating predictable collisions (e.g., “ape” and “pea”) public long TrivialHash(String str) { long hash = 0; for(int i = 0; i < str.length(); i++) { hash = hash + str.charAt(i); } return hash; } 48 Tecniche di programmazione A.A. 2012/2013
  • 49. BKDR Hash Function This hash function comes from Brian Kernighan and Dennis Ritchie's book "The C Programming Language". It is a simple hash function using a strange set of possible seeds which all constitute a pattern of 31....31...31 etc, it seems to be very similar to the DJB hash function. public long BKDRHash(String str) { long seed = 131; // 31 131 1313 13131 131313 etc.. long hash = 0; for(int i = 0; i < str.length(); i++) { hash = (hash * seed) + str.charAt(i); } return hash; } 49 Tecniche di programmazione A.A. 2012/2013
  • 50. RS Hash Function A simple hash function from Robert Sedgwicks Algorithms in C book public long RSHash(String str) { int b = 378551; int a = 63689; long hash = 0; for(int i = 0; i < str.length(); i++) { hash = hash * a + str.charAt(i); a = a * b; } return hash; } 50 Tecniche di programmazione A.A. 2012/2013
  • 51. DJB Hash Function An algorithm produced by Professor Daniel J. Bernstein and shown first to the world on the usenet newsgroup comp.lang.c. It is one of the most efficient hash functions ever published public long DJBHash(String str) { long hash = 5381; for(int i = 0; i < str.length(); i++) { hash = hash * 33 + str.charAt(i); } return hash; } 51 Tecniche di programmazione A.A. 2012/2013
  • 52. JS Hash Function A bitwise hash function written by Justin Sobel public long JSHash(String str) { long hash = 1315423911; for(int i = 0; i < str.length(); i++) { hash ^= ((hash << 5) + str.charAt(i) + (hash >> 2)); } return hash; } 52 Tecniche di programmazione A.A. 2012/2013
  • 53. SDBM Hash Function This is the algorithm of choice which is used in the open source SDBM project. The hash function seems to have a good over-all distribution for many different data sets. It seems to work well in situations where there is a high variance in the MSBs of the elements in a data set. public long SDBMHash(String str) { long hash = 0; for(int i = 0; i < str.length(); i++) { hash = str.charAt(i) + (hash << 6) + (hash << 16) - hash; } return hash; } 53 Tecniche di programmazione A.A. 2012/2013
  • 54. DEK Hash Function An algorithm proposed by Donald E. Knuth in The Art Of Computer Programming (Volume 3), under the topic of sorting and search chapter 6.4. public long DEKHash(String str) { long hash = str.length(); for(int i = 0; i < str.length(); i++) { hash = ((hash << 5) ^ (hash >> 27)) ^ str.charAt(i); } return hash; } 54 Tecniche di programmazione A.A. 2012/2013
  • 55. DJB Hash Function The algorithm by Professor Daniel J. Bernstein (alternative take) public long DJBHash(String str) { long hash = 5381; for(int i = 0; i < str.length(); i++) { hash = ((hash << 5) + hash) ^ str.charAt(i); } return hash; } 55 Tecniche di programmazione A.A. 2012/2013
  • 56. PJW Hash Function This hash algorithm is based on work by Peter J. Weinberger of AT&T Bell Labs. The book Compilers (Principles, Techniques and Tools) by Aho, Sethi and Ulman, recommends the use of hash functions that employ the hashing methodology found in this particular algorithm public long PJWHash(String str) { long BitsInUnsigned = (long)(4 * 8); long ThreeQuarters = (long)((BitsInUnsigned *3) /4); long OneEighth = (long)(BitsInUnsigned / 8); long HighBits = (long)(0xFFFFFFFF) << (BitsInUnsigned - OneEighth); long hash = 0; long test = 0; […] 56 Tecniche di programmazione A.A. 2012/2013
  • 57. PJW Hash Function […] for(int i = 0; i < str.length(); i++) { hash = (hash << OneEighth) + str.charAt(i); if((test = hash & HighBits) != 0) { hash = (( hash ^ (test >> ThreeQuarters)) & (~HighBits)); } } return hash; } 57 Tecniche di programmazione A.A. 2012/2013
  • 58. ELF Hash Function Similar to the PJW Hash function, but tweaked for 32-bit processors. Its the hash function widely used on most UNIX systems public long ELFHash(String str) { long hash = 0, x = 0; for(int i = 0; i < str.length(); i++) { hash = (hash << 4) + str.charAt(i); if((x = hash & 0xF0000000L) != 0) hash ^= (x >> 24); hash &= ~x; } return hash; } 58 Tecniche di programmazione A.A. 2012/2013
  • 59. 59 Tecniche di programmazione A.A. 2012/2013
  • 60. Maps a.k.a, associative array, map, or dictionary
  • 61. Definition  In computer science, an associative array, map, or dictionary is an abstract data type composed of (key, value) pairs, such that each possible key appears at most once  Native support in most modern languages (perl, python, ruby, go, …). E.g., V1[42] = “h2g2” V2[“h2g2”] = 42  Implemented through hash tables 61 Tecniche di programmazione A.A. 2012/2013
  • 62. Java Collection Framework 62 Tecniche di programmazione A.A. 2012/2013
  • 63. Collection Family Tree 63 Tecniche di programmazione A.A. 2012/2013
  • 64. Map interface  Map<K,V>  K: the type of keys maintained by this map  V: the type of mapped values  Add/remove elements  value put(key, value)  value remove(key)  Search  boolean containsKey(key)  boolean containsValue(value) 64 Tecniche di programmazione A.A. 2012/2013
  • 65. Map interface containsValue() will probably  Map<K,V> require time linear in the map size  K: the type of keys maintained by this map for most implementations of the  V: the type of mapped values Map interface – ie. it is O(N)  Add/remove elements  value put(key, value)  value remove(key)  Search  boolean containsKey(key)  boolean containsValue(value) 65 Tecniche di programmazione A.A. 2012/2013
  • 66. Map interface (cont)  Nested Class  Map.Entry<K,V>  A map entry (key-value pair).  Set<Map.Entry<K,V>> entrySet()  Returns a Set view of the mappings contained in this map  Set<K> keySet()  Returns a Set view of the keys contained in this map  Collection<V> values()  Returns a Collection view of the values contained in this map 66 Tecniche di programmazione A.A. 2012/2013
  • 67. Collection Family Tree 67 Tecniche di programmazione A.A. 2012/2013
  • 68. LinkedHashMap  Maintains a doubly-linked list running through all of its entries  This linked list defines the iteration ordering (normally insertion-order)  Insertion order is not affected if a key is re-inserted 68 Tecniche di programmazione A.A. 2012/2013
  • 70. Java Collection Framework 70 Tecniche di programmazione A.A. 2012/2013
  • 71. 71 Tecniche di programmazione A.A. 2012/2013
  • 72. Licenza d’uso  Queste diapositive sono distribuite con licenza Creative Commons “Attribuzione - Non commerciale - Condividi allo stesso modo (CC BY-NC-SA)”  Sei libero:  di riprodurre, distribuire, comunicare al pubblico, esporre in pubblico, rappresentare, eseguire e recitare quest'opera  di modificare quest'opera  Alle seguenti condizioni:  Attribuzione — Devi attribuire la paternità dell'opera agli autori originali e in modo tale da non suggerire che essi avallino te o il modo in cui tu usi l'opera.  Non commerciale — Non puoi usare quest'opera per fini commerciali.  Condividi allo stesso modo — Se alteri o trasformi quest'opera, o se la usi per crearne un'altra, puoi distribuire l'opera risultante solo con una licenza identica o equivalente a questa.  http://creativecommons.org/licenses/by-nc-sa/3.0/ 72 Tecniche di programmazione A.A. 2012/2013