Stage 3: Replication
stage 3 of 3 · ~30 min · runs in your browser
Objective:Add a replication factor so the ring returns N distinct physical nodes for each key -- the exact algorithm Cassandra and DynamoDB use to determine which nodes store a piece of data.
Extend createRing(vnodes = 100, replicas = 3) with a new method:
getNodes(key) -- returns an array of up to replicas distinct physical
node IDs for the given key. Starting from the key's hash position, walk
clockwise around the ring (wrapping to index 0 at the end), collecting node IDs.
Skip vnode entries whose physical node is already in the result set. Stop when
you have collected replicas distinct nodes OR you have walked the entire ring
(fewer physical nodes than replicas).
getNode(key) -- keep this working. It should return the first physical
node from getNodes(key) (the primary replica).
With replicas=1, getNodes(key) returns an array of length 1 -- the same
result as getNode(key) wrapped in an array.
With fewer physical nodes than replicas, getNodes(key) returns all available
physical nodes without duplicating any.
The replication set is deterministic: every client that runs the same algorithm on the same ring state computes the same N nodes for a given key.
Why this matters in production
A single storage node is a single point of failure. Distributed databases solve this by storing each key on N physical nodes, chosen by walking clockwise around the ring from the key's hash position and collecting N distinct physical nodes. This is Cassandra's "replication strategy" and DynamoDB's "consistent hashing with replication." The ring tells you exactly which N nodes are responsible for a key -- no coordination, no lookup table, just deterministic arithmetic. The coordinator node writes to all N replicas and reads from a quorum of ceil(N/2)+1 to achieve both durability and read freshness. After this stage, you will understand why adding a replica in Cassandra requires only a ring topology change, and why removing a node in DynamoDB transfers responsibility cleanly to the next node in the replication set.
tests (6)
○ getNodes returns `replicas` distinct physical nodes
○ replicas=1 returns array of length 1 matching getNode
○ with fewer physical nodes than replicas, returns all available (no duplicates)
○ different keys map to different primary nodes
○ getNodes is deterministic across multiple calls
○ getNode returns the first node from getNodes