JSPYadvanced3 stages · 75 min

Build Your Own Consistent Hash Ring

Build the data structure that DynamoDB, Cassandra, and Discord use to distribute data across nodes without remapping every key when the cluster grows or shrinks -- a hash ring with virtual nodes, O(log N) lookup, and a mathematical guarantee that only 1/N keys migrate on any topology change.

Hash ring as sorted arrayBinary search for clockwise successorRing wrap-around (modular topology)Virtual nodes (vnodes)Load factor and 1/N migration guaranteeReplication factor NConsistent view without coordinationThundering herd on reshardingHot spots and uneven distributionChord protocol (distributed ring lookup)

Why build this

With naive `hash(key) % N`, adding or removing one server remaps (N-1)/N of your keys -- typically 80-90% of your entire dataset. Every remapped key is a cache miss or a database read that wasn't needed a second ago. At scale, this triggers a thundering herd: thousands of clients simultaneously recompute ownership and hammer the backing store. The result is latency spikes, cascading failures, and the kind of 3 AM outage that ends careers. Consistent hashing solves this with a mathematical guarantee. Place nodes and keys on a circular ring from 0 to 2^32. Each key belongs to the nearest clockwise node. When a node joins, it takes ownership of a contiguous arc of the ring -- only the keys in that arc migrate, roughly 1/N of the total. When a node leaves, its keys flow to the next clockwise node. No other nodes are affected. This is not an approximation or a heuristic; it is a provable bound that follows directly from the geometry of the ring. Virtual nodes (vnodes) close the final gap. With one position per physical node, the distribution depends heavily on hash collisions and the order nodes were added. With K virtual nodes per physical node (K=150 in Cassandra, K=100-200 in Memcached ketama), each physical node owns K non-contiguous arcs spread around the ring. The expected load per node converges to 1/N as K grows, and heterogeneous hardware can be modeled by giving more powerful nodes a higher K. After this track, you will understand why Cassandra's token ring, DynamoDB's consistent hashing layer, and Discord's voice channel routing all make the same architectural choice -- and you will be able to implement it from scratch in under 30 minutes.

What you'll build

A complete consistent hash ring in three stages, building from first principles to a production-ready implementation with replication. Stage 1 -- Basic Ring: The core data structure. The ring is a sorted array of { position, nodeId } entries. addNode hashes the node ID to a position and inserts in sorted order. getNode hashes the key and finds the first clockwise position with a binary search (wrapping to index 0 if the key falls past the last node). This stage makes the O(log N) lookup and 1/N migration properties concrete and testable. Stage 2 -- Virtual Nodes: Extend the ring so each physical node gets K positions, inserted as hash(nodeId + "#" + i) for i in 0..K-1. The same binary search finds the vnode owner, which maps back to the physical node. With K=100 and three nodes, distribute 1000 keys and verify that every node receives between 25% and 42% -- demonstrating that vnodes produce statistically even load even with small ring sizes. Stage 3 -- Replication: Add a replicas parameter and a getNodes(key) method that returns N distinct physical nodes -- the next N clockwise positions, deduplicated by physical node ID. This is exactly how Cassandra and DynamoDB determine which nodes store a key: the ring gives you the replication set, and the coordinator writes to all N and reads from a quorum.

How it works internally

The ring as a sorted array: The ring is not a circular linked list -- it is a flat array of { position, nodeId } entries sorted ascending by position. This makes binary search straightforward: given a key position, find the first entry with position >= keyPosition. If no such entry exists (the key falls past the last node), wrap around to index 0. The sort is maintained on every addNode call; removals use a filter. For rings with hundreds of vnodes this is more cache- friendly than a tree and faster in practice than a heap.

Binary search for the clockwise successor: Standard lower-bound binary search (lo=0, hi=ring.length-1, return lo when lo>hi). The target is the first index where ring[mid].position >= keyPosition. If the binary search falls off the end, return ring[0] (ring wrap-around). This gives O(log N) lookup where N is the total number of positions (physical nodes × vnodes).

Virtual nodes and the load factor math: With one position per node, the expected load is 1/N but variance is high -- with 3 nodes and adversarial hash inputs, one node can own 80% of the ring. With K vnodes per node, the ring has K×N entries and each node owns K non-contiguous arcs. By the law of large numbers, as K grows the per-node fraction concentrates around 1/N. Cassandra uses K=256 token positions; Memcached ketama uses K=160. In practice K=100 produces load within ±15% of 1/N for typical ring sizes.

The "consistent" in consistent hashing: The word "consistent" does not mean the hash function is consistent across calls (it is, but that is just determinism). It refers to a consistent view: every client that runs the same algorithm on the same ring state will compute the same node for a given key, without any coordination, locks, or server-side state lookup. The ring is a deterministic function, not a service. This is what makes it fast enough to call on every request in a hot path.

Replication factor N: In Cassandra and DynamoDB, a key is stored on the next N distinct physical nodes clockwise from its hash position. The coordinator node (the one that received the client request) writes to all N replicas and reads from a quorum of ceil(N/2)+1. The ring's getNodes(key) method returns exactly this list. Deduplication by physical node ID is required because with vnodes, the next N positions might map to fewer than N distinct physical nodes.

The 3 stages

Stage 1: Basic RingO(log N) key lookup with 1/N migration on topology change

Build the ring as a sorted array of { position, nodeId } entries. addNode hashes the node ID to a position and inserts in sorted order. getNode hashes the key and finds the first clockwise position with a binary search, wrapping to index 0 if the key falls past the last node. This stage makes the core guarantee concrete: removing one node only affects the keys in its arc.

Stage 2: Virtual NodesEven load distribution with K positions per physical node

Extend addNode to insert K positions per physical node using hash(nodeId + "#" + i) for i in 0..K-1. The same binary search finds the vnode owner, which maps to a physical node. With K=100 and three nodes, distributing 1000 keys should give each node between 25% and 42% -- demonstrating that vnodes eliminate the hot-spot problem of single-position rings.

Stage 3: ReplicationN-node replication set for fault tolerance

Add a replicas parameter and a getNodes(key) method that returns the next N distinct physical nodes clockwise from the key's position. This is the exact algorithm Cassandra and DynamoDB use to determine which nodes store a key. getNode(key) remains a convenience alias for getNodes(key)[0].

You'll see this in production

DynamoDB Partition Ring

DynamoDB distributes data across storage nodes using a consistent hash ring over virtual nodes. Each partition key is hashed to a position on the ring; the owning node (and its replicas) handle all reads and writes for that key. When DynamoDB adds capacity, only the keys in the new node's arc migrate -- the thundering herd problem that would occur with hash-mod-N is structurally impossible.

Cassandra Token Ring (vnodes)

Cassandra calls its ring positions "tokens." With virtual nodes enabled (num_tokens=256, the default since Cassandra 3.0), each physical node owns 256 non-contiguous token ranges. The replication factor R means Cassandra stores each row on the next R distinct nodes clockwise from its token. Your Stage 3 getNodes() is the exact algorithm Cassandra's coordinator uses to build the replica set for a write.

Memcached (ketama)

Libketama, written at Last.fm in 2007, was the first widely-adopted consistent hashing implementation for Memcached. It uses 160 virtual nodes per physical server (each one hashed with MD5). When a Memcached server is added or removed, only 1/N of keys need to be re-fetched from the origin -- without it, every cache miss during a scaling event would hit the database simultaneously.

Discord Voice Channel Routing

Discord routes voice channel sessions to voice servers using consistent hashing. When a voice server is added to handle load or removed for maintenance, only the sessions on that server's arc of the ring need to reconnect. The alternative -- a central routing table with a hash-mod-N assignment -- would require migrating all sessions on any topology change, causing a reconnect storm audible to users.

CDN Edge Server Selection

CDNs like Fastly and Cloudflare use consistent hashing to select which edge PoP caches a given URL. With hash-mod-N, adding a new PoP would invalidate cache entries across the entire CDN. With a consistent hash ring, only the URLs in the new PoP's arc need to be re-fetched from origin -- the rest stay cached at their current edge server.

Before you start

✓Hash functions: how MD5/SHA-1 map arbitrary strings to integers, and why the output is uniformly distributed
✓Modular arithmetic: understanding a circular number space (positions wrap from 2^32 back to 0)
✓Binary search: finding the first element >= a target in a sorted array in O(log N)
✓What sharding is: distributing data across multiple nodes by key, and why naive hash-mod-N breaks on resize

Stages

1Stage 1: The Basic Ring20 min 2Stage 2: Virtual Nodes25 min 3Stage 3: Replication30 min

Ready to build?

Your code runs in the browser — no setup needed.

Start Stage 1 →