CloudFundoo 2012

       Distributed Hash Tables and Consistent Hashing
A DHT (Distributed Hash Table) is one of the fundamental building
blocks of scalable distributed systems; it is used in web caching, P2P
systems, distributed file systems, and so on.
The first step in understanding a DHT is the hash table. A hash table
needs a key, a value, and a hash function, where the hash function maps
the key to the location where the value is stored.

[Figure: keys Key1 to Key4 pass through the hash function to the locations of their stored values; labelled value = hashfunc(key)]

Python's dictionary data type is implemented using hashing; see the
example below.


       #!/usr/bin/env python3

       # A dict maps each key to its value through hashing.
       student = {'Name': 'Zara', 'Age': 11, 'Class': 'First'}

       student['Age'] = 12                  # update an existing entry
       student['School'] = "State School"   # add a new entry

       print("student['Age']:", student['Age'])
       print("student['School']:", student['School'])



With a good hash function we get O(1), i.e. constant-time, performance
on average from a hash table when searching for a (key, value) pair,
because the hash function distributes the keys evenly across the
table. One problem with hashing is that it requires a lot of memory (or
space) to accommodate the entire table: even if most of the table is
empty, we need to allocate memory for the whole table, so memory is
often wasted. This is called the time-space tradeoff; hashing gives
fast search at the expense of memory.
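
To make the tradeoff concrete, here is a minimal sketch of a fixed-size hash table with chaining. The class name ToyHashTable and the slot count of 16 are our own illustrative choices, not part of any library; Python's real dictionary is considerably more sophisticated.

```python
class ToyHashTable:
    """Fixed-size hash table with separate chaining (illustrative only)."""

    def __init__(self, num_slots=16):
        # All slots are allocated up front, even if most stay empty.
        self.slots = [[] for _ in range(num_slots)]

    def _index(self, key):
        # The hash function maps the key to a slot number.
        return hash(key) % len(self.slots)

    def put(self, key, value):
        chain = self.slots[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)   # overwrite an existing key
                return
        chain.append((key, value))

    def get(self, key):
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)

    def empty_slots(self):
        return sum(1 for chain in self.slots if not chain)


table = ToyHashTable(num_slots=16)
table.put('Name', 'Zara')
table.put('Age', 11)
table.put('Age', 12)
print(table.get('Age'))        # 12
print(table.empty_slots())     # at least 14 of the 16 slots are unused
```

The lookup touches a single slot no matter how large the table is, while almost all of the allocated slots hold nothing.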
When we want to accommodate a large number of keys (millions and
millions, say, in a cloud storage system), we have to divide the keys
into subsets and map each subset to a bucket; each bucket can reside
on a separate machine/node. You can think of a bucket as a separate
hash table.

Distributed Hash Table
     Using buckets to distribute the (key, value) pairs is called a DHT.
A simple scheme to implement a DHT is to use the modulus operation
on the key, i.e. your hash function is key mod n, where n is the number
of buckets you have.
[Figure: the key space K1 to Kn, split at Kn/3 and K2n/3 into three ranges that map to Bucket 1, Bucket 2 and Bucket 3]

If you have 6 buckets, then key = 1 goes to bucket 1 since 1 % 6 = 1,
key = 2 goes to bucket 2 since 2 % 6 = 2, and so on. We need a second
hash to find the actual (key, value) pair inside a particular bucket.
We can use two dictionaries to visualize a DHT; here each row in the
Client/Proxy dictionary is equivalent to a bucket in the DHT.
[Figure: a Client/Proxy dictionary whose rows point to separate bucket dictionaries; Bucket 0, Bucket 1 and Bucket 3 are shown]
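
The two-level lookup can be sketched in a few lines. This is only an illustration under our own assumptions: keys are integers, each bucket is a plain dict standing in for a separate node, and the names NUM_BUCKETS, bucket_for, put and get are hypothetical.

```python
NUM_BUCKETS = 6

# Each bucket is an ordinary dict; in a real system it would live on its own node.
buckets = [{} for _ in range(NUM_BUCKETS)]


def bucket_for(key):
    """First-level hash: choose the bucket with key mod n."""
    return key % NUM_BUCKETS


def put(key, value):
    # The dict performs the second-level hash inside the chosen bucket.
    buckets[bucket_for(key)][key] = value


def get(key):
    return buckets[bucket_for(key)][key]


put(1, 'one')
put(2, 'two')
put(7, 'seven')
print(bucket_for(1), bucket_for(2), bucket_for(7))   # 1 2 1
print(get(7))                                        # seven
```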

This scheme works perfectly well as long as we don't change the
number of buckets. It starts to fail when we add buckets to, or remove
buckets from, the system. Let's add one more bucket, so that the
number of buckets is now seven, i.e. n = 7. The key 7, which was
previously mapped to bucket 1 (7 % 6 = 1), now maps to bucket 0, since
7 % 7 = 0. To keep the system working we need to move data between
buckets, which is expensive in this hashing scheme.
Let's do some calculation. Consider the modulo hash function

                              h(key) = key mod n


where n is the number of buckets. When we increase the number of
buckets by one, the hash function becomes

                            h(key) = key mod (n+1)


Because of the addition of a new bucket, most of the keys will hash to
a different bucket. Let's calculate the ratio of keys that move. If the
keys are the integers 0 to K-1 and K is at most n(n+1), only the first
n keys remain in the same bucket and the other K - n keys move. So the
ratio of keys moving to a different bucket is

                               (K - n)/K = 1 - n/K

If there are 10 buckets and 100 keys, then 90% of the keys move to a
different bucket when we add another bucket. For larger key ranges the
pattern repeats every n(n+1) keys, so with 10 buckets and 1000 keys it
is again 900 keys, or 90%, that move. If we are using Python's hash()
or hashlib.md5 hashing functions, the hashed keys are spread over a
very large range, and the fraction of keys moving to another bucket is
approximately

                                       n/(n+1)
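
These ratios are easy to check by counting. The sketch below is our own illustration: the helper names moved_fraction and md5_int are hypothetical, and the MD5 digest is simply read as one large integer before taking the modulus.

```python
import hashlib


def moved_fraction(hashed_keys, n):
    """Fraction of keys whose bucket changes when n buckets grow to n + 1."""
    moved = sum(1 for h in hashed_keys if h % n != h % (n + 1))
    return moved / len(hashed_keys)


def md5_int(key):
    # Interpret the MD5 digest of the key as one large integer.
    return int(hashlib.md5(str(key).encode()).hexdigest(), 16)


n = 10
sequential = list(range(1000))
hashed = [md5_int(k) for k in range(100000)]

print(moved_fraction(range(100), n))     # 0.9: only keys 0..9 stay put
print(moved_fraction(sequential, n))     # 0.9
print(moved_fraction(hashed, n))         # close to n / (n + 1), about 0.909
```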



So we need a scheme that reduces the number of keys moving to a
different bucket; consistent hashing is such a scheme.

Consistent Hashing
A ring is the core of consistent hashing. First we hash the bucket IDs
to points on the ring.
[Figure: a ring with buckets B1 (top), B2 (right), B3 (bottom) and B4 (left)]



Then we hash the keys to the ring; the resulting ring looks like the one below.

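[Figure: the same ring with keys added: K1 between B1 and B2, K2 between B2 and B3, K3 between B3 and B4, K4 between B4 and B1]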

So to find the bucket that stores the value corresponding to a key, we
first hash the key to a point on the ring and then search clockwise
from that point for the first bucket; that bucket is the one storing
the value for the key. For key K1 the value is stored in bucket B2, for
key K2 it is stored in bucket B3, and so on.
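
A minimal sketch of this lookup, under our own assumptions: points on the ring are the MD5 digests of the names read as integers, and HashRing, ring_position and bucket_for are hypothetical names. Because the positions come from MD5, the printed assignments will generally differ from the ones drawn in the figure.

```python
import bisect
import hashlib


def ring_position(name):
    """Hash a bucket ID or key to a point on the ring (an integer)."""
    return int(hashlib.md5(name.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, bucket_ids):
        # Keep the bucket points sorted so a clockwise search is a binary search.
        self._points = sorted((ring_position(b), b) for b in bucket_ids)

    def bucket_for(self, key):
        pos = ring_position(key)
        # First bucket at or after the key's position, going clockwise.
        i = bisect.bisect_left(self._points, (pos, ''))
        # Past the last bucket, wrap around to the first one.
        return self._points[i % len(self._points)][1]


ring = HashRing(['B1', 'B2', 'B3', 'B4'])
for key in ['K1', 'K2', 'K3', 'K4']:
    print(key, '->', ring.bucket_for(key))
```

Keeping the bucket points sorted means a lookup costs O(log n) for n buckets, rather than a walk around the whole ring.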
Hashing works fine with this scheme, but we introduced it to handle
the addition and removal of buckets. Let's see how it handles them,
as illustrated in the picture below.



[Figure: the ring of buckets B1 to B4 and keys K1 to K4, used to illustrate the removal of bucket B3]




Suppose we remove bucket B3; key K2 now seems to have a problem. Let's
see how consistent hashing solves it. Key K2 still hashes to the same
point on the circle. Searching clockwise, the lookup no longer finds
B3, so it continues past B3's old position and finds bucket B4, where
the value corresponding to key K2 is now stored. For the other keys
nothing changes: key K4 stays in bucket B1, key K1 in bucket B2, and so
on. So we only need to move the contents of the removed bucket to the
clockwise adjacent bucket.
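
We can check this property on a larger set of keys. The sketch below is an illustration under our own assumptions (MD5 digests as ring points, 1000 made-up key names, hypothetical helper names); it compares each key's bucket before and after removing B3.

```python
import bisect
import hashlib


def ring_position(name):
    return int(hashlib.md5(name.encode()).hexdigest(), 16)


def bucket_for(key, bucket_ids):
    """Return the first bucket clockwise from the key's point on the ring."""
    points = sorted((ring_position(b), b) for b in bucket_ids)
    i = bisect.bisect_left(points, (ring_position(key), ''))
    return points[i % len(points)][1]


keys = ['K%d' % i for i in range(1000)]
before = {k: bucket_for(k, ['B1', 'B2', 'B3', 'B4']) for k in keys}
after = {k: bucket_for(k, ['B1', 'B2', 'B4']) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
# Every key that moved was stored in the removed bucket B3 ...
print(all(before[k] == 'B3' for k in moved))     # True
# ... and all of them landed in one bucket, B3's clockwise neighbour.
print(len({after[k] for k in moved}) <= 1)       # True
```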
Let's see what happens if we add a bucket; see the slightly modified
diagram below.



[Figure: the ring of buckets B1 to B4 with an additional key K5 placed between K4 and B1]




The additional key K5 is mapped to B1, so both keys K4 and K5 now map
to bucket B1, just as keys K2 and K3 both map to bucket B4 in the
bucket removal scenario.


[Figure: the ring after adding a new bucket B5 between keys K4 and K5]

Let's add a new bucket B5. The new bucket goes in between keys K4 and
K5; key K4, which was previously mapped to bucket B1, now goes to
bucket B5, while key K5 still maps to bucket B1. So only the keys that
lie between B4 and B5 have to be moved from B1 to B5.
On average, the fraction of keys that we need to move between buckets
when one bucket is added to the system is

                                      1/(n +1)


So by introducing consistent hashing we reduce the fraction of keys
that we need to move from n/(n+1) to 1/(n+1), which is significant.
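
The 1/(n+1) figure is an average: for a single ring the fraction depends on where the new bucket happens to land. A rough simulation, assuming buckets and keys fall at uniformly random points on a ring of circumference 1 (a stand-in for a well-mixing hash function; all names here are hypothetical), averages over many random rings.

```python
import bisect
import random


def owner(points, pos):
    """Index of the first bucket point clockwise from pos (with wrap-around)."""
    return bisect.bisect_left(points, pos) % len(points)


def moved_fraction_on_add(n, num_keys, rng):
    """Place n buckets and num_keys keys at random points, then add one bucket."""
    old = sorted(rng.random() for _ in range(n))
    new = sorted(old + [rng.random()])
    keys = [rng.random() for _ in range(num_keys)]
    # A key moves when the bucket point that owns it changes.
    moved = sum(1 for k in keys if old[owner(old, k)] != new[owner(new, k)])
    return moved / num_keys


rng = random.Random(42)   # fixed seed so the run is repeatable
n = 10
trials = [moved_fraction_on_add(n, 500, rng) for _ in range(500)]
average = sum(trials) / len(trials)
print(round(average, 3), 'vs 1/(n+1) =', round(1 / (n + 1), 3))
```

Individual trials vary widely around the average, which is one of the details of consistent hashing not covered here.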
There are many more details to consistent hashing that are not
covered here. Consistent hashing plays a major role in distributed
systems such as DNS, P2P, distributed storage, and web caching
systems. OpenStack Swift and Memcached are open source projects
that use it to achieve scalability and availability.





Copyright ©   CloudFundoo   | cloudfundoo@gmail.com, http://cloudfundoo.wordpress.com/
