Consistent Hashing (Part 1)

Photo by Crissy Jarvis on Unsplash

Consistent hashing is a well-known method for distributing stuff around multiple servers. We will implement a simple consistent hashing using Python and play a bit with it. This first part just focuses on having a way of computing hashes, we will build upon this in the later parts.

We will use a simple storage system for fixed-size objects as an example for our implementation. This storage system is composed of some servers, say \( k\), and each server \(i\) has a storage capacity \(C_i\). We will simplify our experiments by setting all \(C_i\)s to the same value.

Our implementation will receive blocks and assign them to servers, and then the assigned server stores the block. Even though the idea of using consistent hashing is to simplify the addition and removal of servers, we won’t implement that as part of the example in this post.

Our consistent hashing implementation will support two operations:

  1. Insert a new server with a given capacity.
  2. Given an object, determine to which server the object goes.

The way our implementation will work is as follows: we will assume a ring of 4B positions. When server \(i\) is inserted, we will insert \(C_i\) copies of \(i\) into the ring using a hash function (murmur3 in this case). 

When an object comes in, we hash the item into one of the 4B (billion) positions and walk from there to find the first position containing a server identifier. 

The code looks like this:

import sortedcontainers
import binascii
from collections import Counter
import mmh3

class ConsistentHash:
    def __init__(self):
        self.servers = 0
        self.ring = sortedcontainers.SortedDict()
    def hash(data):
        return mmh3.hash(data, signed=False)
    # inserts a new server with a given capacity. We make sure we
    # put capacity tokens for the server in the ring.
    def insert(self, server_id, capacity):
        k, i = 0, 0
        self.servers += 1
        while i < capacity:
            k += 1
            v = f'{server_id}:{k}'
            position = self.hash(v)
            if self.ring.get(position) is None:
                self.ring[position] = server_id
                i += 1
    # given a key, we get which server owns it (self-test, modify
    # this to return the next k servers from the location the key
    # hashes into). We will change this in part 2, as it 
    # complicates keeping things synchronized.           
    def assign(self, key):
        server_entry = self.ring.bisect_left(self.hash(f'{key}'))
        if server_entry == len(self.ring):
            return self.ring[self.ring.keys()[0]]
        return self.ring.peekitem(server_entry)[1]

If we create one sample hash with two servers as follows:

ch = ConsistentHash()
ch.insert(2, 5)
ch.insert(1, 5)

Then the ring looks like this (where tokens landed):

> SortedDict({1111226666: 2, 1268132435: 1, 1619890324: 1, 1621110414: 2, 2087267091: 1, 2340525146: 2, 2882217858: 2, 3107893247: 1, 3367333424: 1, 3926394000: 2})

Now, picking \(C_i=1\) is not a very good idea; our ranges tend to be quite biased. Let’s add a function that returns the number of slots assigned to each server to check how things balance out:

import sortedcontainers
import binascii
from collections import Counter
import mmh3

class ConsistentHash:
    def allocated_keys(self):
        counters = Counter()
        last = 0
        for p in sorted(self.ring.keys()):
            size = p - last
            counters[self.ring[p]] += size
            last = p
        counters[self.ring[self.ring.keys()[0]]] += (1 << 32) - last
        return counters

Here is how 5 servers look like with 1, 5, 10, and 15 points per server. As we add more tokens per server, we seem to balance things better.

And now, we will run the last experiment. We will plot the standard deviation over the mean (as a percentage) for the number of elements per bucket variating the capacity (the larger the capacity, the lower the standard deviation). We will consider capacities from 5 to 2,000. Note that the mean is constant, so the only variation is the standard deviation (I just find it easier to compare against the mean this way).

Be very careful with this plot, each size has 1 run! I’m just building it to build evidence for the intuition, but this is not close to being strong evidence. A good exercise is to modify the hashing function to use a seed set when building the hash, then multiple of these can be generated and used to validate properly. We will not go into that now; I want to wrap up by covering the idea of why we go through all the trouble.

Let’s complicate the setting:

  • Each server has \(c\) positions in the hash.
  • When an item comes in, we insert it into the next 3 servers in the ring. (Three is arbitrary here; you can tune it to your needs.)

If a new server comes in, we insert the server the following way: Hash into \(c\) positions. At each location we land, we know what 3 servers had objects from this new assigned range before. Get them from one of the servers and remove them from the third.

You have to be careful with edge cases like one of the hashed positions landing next to another location for the new server. But overall, it’s quite intuitive.

Now let’s say we have been removing objects for a while, and we realize we don’t need that many servers. In this case, we do the following: For each \(c\) positions for the server we are removing, send the objects to the thirds servers from them in the list.

If a server dies and we can’t recover it, we can replace it and fetch their content from the next 2 elements in the ring for each one of their tokens.

All this is quite a high level for the system using it, next part we will talk a bit more about how to maintain this table so that we are all in sync, and that we can keep serving while adding and removing servers.

This entry was posted in Programming. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.