Let's Implement Consistent Hashing From Scratch
Hello developers, I hope you are doing well. Recently, I coded a consistent hashing system, and in this post, I am going to talk about it.
What is Consistent Hashing?
Consistent hashing is a key distribution technique that ensures easy and smooth mapping of keys to servers to minimizing data movement when nodes are added or removed. Unlike traditional hashing methods, where adding or removing a server changes the hash distribution significantly, consistent hashing reduces this impact.
It goes with mapping both servers (nodes) and keys to a circular hash space. When a request comes in, the system moves clockwise along the ring to find the closest node, which will be responsible for that key. It will store key and value data and can also be used while retrieving the same.
Use Cases
Consistent hashing is widely used in distributed systems.
Load Balancing: Make sure that incoming requests are uniformly distributed among available servers so no server gets overwhelming resquests.
Distributed Caching (e.g., Memcached, Redis Cluster): Helps in mapping cache keys to specific nodes.
Database Sharding: Efficiently distributes and save database records across multiple database servers.
Why is Consistent Hashing Important?
Imagine you have a set of servers handling API requests. A traditional hash function could distribute these requests among servers, but as soon as a server is added or removed, the entire mapping breaks, and most of the data needs to be rebalanced. This causes cache misses that increases latency and unnecessary load on the system.
Consistent hashing solves this by ensuring that only a small fraction of keys need to be remapped when a server is added or removed. This makes the system highly scalable and resilient.
Implementation
Now, let’s talk about the implementation of consistent hashing in Golang. My implementation involves a ConsistentHashRing that maintains a sorted list of node hashes and efficiently assigns keys to nodes. Here’s how it works:
Base Structs:
type Node struct {
ID string
Keys map[string]string
}
type ConsistentHashRing struct {
mu sync.RWMutex
nodes map[uint32]*Node
hashes []uint32
}
func NewConsistentHashRing() *ConsistentHashRing {
return &ConsistentHashRing{
nodes: make(map[uint32]*Node),
hashes: []uint32{},
}
}
1. Hashing Function
I used Murmur3 as my hashing function because it provides better distribution and performance than FNV or MD5.
func hashFunction(key string) uint32 {
return murmur3.Sum32([]byte(key))
}
2. Adding Nodes to the Ring
When a new server is added, it is assigned a hash value and placed on the ring.
func (chr *ConsistentHashRing) AddNode(id string) {
chr.mu.Lock()
defer chr.mu.Unlock()
hash := hashFunction(id)
chr.nodes[hash] = &Node{
ID: id,
Keys: make(map[string]string),
}
chr.hashes = append(chr.hashes, hash)
slices.Sort(chr.hashes)
}
3. Finding the Nearest Node
To locate the closest node for a given key, I move clockwise along the sorted list of hashes.
func (chr *ConsistentHashRing) GetNextNodeIndex(hash uint32) int {
for i, h := range chr.hashes {
if h > hash {
return i
}
}
return 0 // Wrap around to the first node
}
4. Storing and Retrieving Data
Each node holds a set of keys. When a key-value pair is stored, it is mapped to the correct node.
func (chr *ConsistentHashRing) StoreKey(key, val string) {
node := chr.GetNode(key)
if node != nil {
node.Keys[key] = val
}
}
To retrieve a key:
func (chr *ConsistentHashRing) RetrieveKey(key string) (string, error) {
node := chr.GetNode(key)
if node == nil {
return "", errors.New("no node found")
}
val, ok := node.Keys[key]
if !ok {
return "", errors.New("key not found")
}
return val, nil
}
5. Handling Node Removal
When a server is removed, its keys must be transferred to the next available node.
func (chr *ConsistentHashRing) RemoveNode(id string) {
chr.mu.Lock()
defer chr.mu.Unlock()
hash := hashFunction(id)
node, exists := chr.nodes[hash]
if !exists {
return
}
nextNodeIndex := chr.GetNextNodeIndex(hash)
nextNode := chr.nodes[chr.hashes[nextNodeIndex]]
maps.Copy(nextNode.Keys, node.Keys)
delete(chr.nodes, hash)
for i, h := range chr.hashes {
if h == hash {
chr.hashes = slices.Delete(chr.hashes, i, i+1)
break
}
}
}
Printing Nodes Data:
func (chr *ConsistentHashRing) PrintRing() {
for _, h := range chr.hashes {
fmt.Printf("Node: %s \t\t Hash: %d \t\t Total Keys: %v\n", chr.nodes[h].ID, h, len(chr.nodes[h].Keys))
}
}
Source Code : GitHub
Also Read:
Final Thoughts
Consistent hashing is a powerful technique that improves load distribution in distributed systems. My implementation efficiently assigns keys to nodes and ensures minimal disruption when nodes are added or removed.
If you have suggestions to optimize this implementation, drop them in the comments. I am always looking to improve my code.
Thank you for reading, and don’t forget to subscribe if you want more deep-dive posts like this!