
Scaling Right with Consistent Hashing

October 26, 2025

Scalability

Architecture

Distributed Systems

Scaling a system isn't merely about adding more machines—it's about maintaining balance as demand grows. As new nodes come online or old ones retire, traffic must be distributed wisely to prevent overloads and idle nodes. The challenge lies in ensuring that every request consistently reaches the right destination, even as the system topology changes. Achieving that harmony between flexibility and stability is what defines truly scalable architecture.

The Scalability Dilemma

When systems scale horizontally, distributing requests sounds simple in theory—just hash a key and pick a server. But reality quickly proves otherwise. As instances scale up or down, that same hash function suddenly causes chaos: old mappings break, cached data becomes invalid, and load skews unpredictably. What once worked for a handful of servers can crumble when facing constant elasticity. This imbalance is the core dilemma—how to maintain consistent traffic distribution in an environment that never stays still.

Traditional Hashing Pitfall

A simple way to distribute load is by using a straightforward hashing function. You take a key—say, an event ID—run it through a hash function, and use the remainder of dividing by the total number of servers to decide where it goes.
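A minimal sketch of that idea in Python (the hash choice and key names here are illustrative, not a prescribed implementation):

```python
import hashlib

def route(key: str, num_servers: int) -> int:
    """Map a key to a server index using simple modulo hashing."""
    # Hash the key to a stable integer, then take the remainder.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_servers

# With 3 servers, every key gets a fixed home.
print(route("event:999", 3))  # deterministic index in 0..2
```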

Traditional modulo hashing: the client routes each key to one of three nodes with hash(key) % 3.

This works fine—until the system changes. Suppose you add a new database or remove one due to maintenance. The total count in the modulo operation changes, and suddenly almost every key maps to a new location.

Suppose one node fails: routing becomes hash(key) % 2, and the data must be redistributed across the remaining instances until a new one comes online.

This means existing data must be redistributed across all nodes, not just the ones affected. As a result, caches break, lookups miss, and systems experience unnecessary churn. What seemed like a simple formula becomes a major bottleneck the moment your system tries to grow.
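To see how severe that churn is, here is a small, hypothetical experiment: hash a batch of keys against three servers and again against two, and count how many change homes.

```python
import hashlib

def route(key: str, num_servers: int) -> int:
    # Same modulo routing as in the sketch above.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_servers

keys = [f"event:{i}" for i in range(10_000)]
moved = sum(route(k, 3) != route(k, 2) for k in keys)
print(f"{moved / len(keys):.0%} of keys changed servers")  # roughly two thirds
```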

Redistribution with Consistent Hashing

Consistent hashing offers a more intuitive, clockwise way to spread data across nodes without turning every change into chaos. Instead of using a fixed modulo, it places both keys and nodes on a circular space—a hash ring—where each key simply moves clockwise until it lands on its assigned node.

Let's imagine a hash ring with 100 points and four database nodes: node 1 at position 0, node 2 at 25, node 3 at 50, and node 4 at 75. To determine which database should handle a given operation, we still hash the key, but instead of using a modulo, we locate the hash value on the ring and move clockwise to find the nearest database node.
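A minimal sketch of that lookup, assuming a 100-point ring and the four node positions from the example (the dict-plus-bisect layout is just one convenient way to express it):

```python
from bisect import bisect_left

RING_SIZE = 100
# ring position -> node name, matching the example: nodes at 0, 25, 50, 75
ring = {0: "node 1", 25: "node 2", 50: "node 3", 75: "node 4"}

def lookup(ring: dict[int, str], key_hash: int) -> str:
    """Walk clockwise from the key's position to the first node at or after it."""
    positions = sorted(ring)
    idx = bisect_left(positions, key_hash % RING_SIZE)
    if idx == len(positions):  # past the last node: wrap around to the start
        idx = 0
    return ring[positions[idx]]

print(lookup(ring, 42))  # falls between 25 and 50 -> "node 3"
print(lookup(ring, 80))  # wraps past 75 back to position 0 -> "node 1"
```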

For example, suppose an operation's key hashes to 7, say hash(999) = 7. On the ring, that value falls between 0 and 25, so the operation is assigned to node 2 at position 25.

On the 100-point hash ring, a key at position 7 moves clockwise to the next node, distributing data across nodes while minimizing reshuffling.

At first, this may seem similar to traditional hashing, but the difference appears when nodes change. Suppose we add a fifth node at position 90. Only operations that hash to positions between 75 and 90, previously handled by node 1, need to move.

Adding Node 5 affects only a small range of existing operations, keeping most data in place.

Unlike before, where nearly all data would be redistributed, consistent hashing moves only about 60% of node 1's data, roughly 15% of the total. In short, only a small portion of keys within the affected range are shifted, while most mappings remain untouched.
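Continuing the ring sketch above, we can verify this by adding the fifth node at position 90 and comparing lookups before and after; only the keys that used to wrap around to node 1 are re-homed:

```python
new_ring = dict(ring)
new_ring[90] = "node 5"  # add the fifth node at position 90

moved = [p for p in range(RING_SIZE) if lookup(ring, p) != lookup(new_ring, p)]
print(moved)  # positions 76..90: 15 of 100 points, all previously on node 1
print(f"{len(moved) / RING_SIZE:.0%} of the ring changed owners")
```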

Now flip the situation: node 3 at position 50 goes offline. Only keys within its segment (26-50) are reassigned, this time to node 4 at position 75. Everything else stays right where it belongs.

When Node 3 fails, only its assigned range (26-50) is remapped to Node 4, minimizing disruption.
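The failure case is symmetric in the same sketch: dropping node 3 from the ring re-homes only its slice, and nothing else moves.

```python
smaller_ring = dict(ring)
del smaller_ring[50]  # node 3 at position 50 goes offline

moved = [p for p in range(RING_SIZE) if lookup(ring, p) != lookup(smaller_ring, p)]
print(moved)  # positions 26..50 now resolve to node 4 at position 75
```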

That's the beauty of consistent hashing. The system bends but doesn't break. As nodes come and go, the data flow adjusts gracefully—no mass migrations, no sudden imbalances, just smooth, predictable scalability.

Spreading Evenly with Virtual Nodes

Even with consistent hashing, balance isn't always perfect. Some nodes may still end up holding more keys than others—especially when the number of nodes is small or the hash distribution isn't uniform. Over time, this imbalance can cause certain nodes to carry more load, while others sit half idle.

To fix this, consistent hashing introduces the concept of virtual nodes (or vnodes). Instead of mapping one physical node to a single point on the ring, each node is represented by multiple virtual positions. These vnodes act like smaller slices that collectively spread load more evenly across the ring.

For instance, if each database has five virtual nodes, node 1 might appear at positions 0, 20, 40, 65, and 80, while node 2 could occupy 15, 25, 45, 70, and 85, and so on.

By having several virtual positions, each physical node receives a fair mix of keys from across the entire ring. This ensures that even if the hash function's randomness isn't perfect, the overall distribution stays smooth and predictable.
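Here is a sketch of the same ring with virtual nodes. Rather than hand-picking positions as in the example above, it derives each virtual node's position by hashing a label like "node 1#0"; the label format and replica count are illustrative assumptions.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 100
VNODES_PER_NODE = 5

def position(label: str) -> int:
    """Hash a virtual-node label onto the 100-point ring."""
    return int(hashlib.md5(label.encode()).hexdigest(), 16) % RING_SIZE

# Each physical node claims several points on the ring via its virtual-node labels.
# (Position collisions simply overwrite here; ignored for brevity.)
vring = {}
for node in ["node 1", "node 2", "node 3", "node 4"]:
    for i in range(VNODES_PER_NODE):
        vring[position(f"{node}#{i}")] = node

def lookup(ring: dict[int, str], key_hash: int) -> str:
    """Clockwise lookup, wrapping around the end of the ring."""
    positions = sorted(ring)
    idx = bisect_left(positions, key_hash % RING_SIZE)
    return ring[positions[idx % len(positions)]]

print(lookup(vring, 7))  # the owner now depends on which vnode lands nearest clockwise
```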

Virtual nodes scatter each database's responsibility across the ring, reducing imbalance and hotspots.

In practice, this makes scaling out or in much more fluid. Adding a node doesn't cause large swings in data ownership, and removing one doesn't overload its neighbors. The system keeps humming along—balanced, steady, and ready to grow.

Beyond Theory

Consistent hashing powers more than databases—it shapes how modern infrastructure scales quietly beneath the surface. Content Delivery Networks (CDNs) use it to route requests to the nearest edge servers. When new servers come online or old ones are retired, only a small subset of users are remapped, preventing massive cache invalidations and keeping content delivery smooth across the globe.

The same principle anchors Apache Cassandra, which distributes data evenly across nodes using a hash ring. Each piece of data belongs to a specific token range, and when nodes join or leave, only their respective slices move to neighbors on the ring. This ensures that scaling out to handle more traffic—or recovering from a node failure—happens predictably, without reshuffling the entire dataset.

Takeaways

Consistent hashing transforms scaling from a disruptive process into a graceful one. By anchoring both data and nodes within a circular space, it ensures that growth, failure, or rebalancing only touch the minimal portion of the system—not the whole. This means faster recovery, less data shuffling, and smoother user experiences.

It also introduces fairness through virtual nodes, distributing load evenly across physical machines and preventing hotspots. Together, these properties make consistent hashing a quiet enabler of resilient distributed systems—powering databases, caches, and CDNs that scale without chaos. In essence, scaling right isn't about adding more nodes—it's about ensuring every change feels like nothing ever broke.