Client-side ring selection lets a client choose a backend in a repeatable way without handing that choice to a central balancer. The client hashes a stable request value, such as a tenant ID, session ID, account ID, or cache key, then maps it onto a ring where backends are placed through virtual points. The chosen target is the first backend clockwise from that hash position. If the backend set stays the same, the same request keeps going to the same backend. If one backend goes down, only that backend’s portion of the ring gets reassigned. That is why ring hashing is useful for sticky routing, caches, sharded reads, and similar traffic. Service discovery works nicely with that flow by giving the client an updated list of available instances so it can rebuild the ring from the active set.
Picking a Target on the Ring
Client-side ring selection starts with a stable request value and ends with one backend chosen from a shared ring. That sentence sounds compact, but a fair amount happens between those two points. A client needs a repeatable way to turn request data into a number, a repeatable way to place backend entries on the ring, and a repeatable way to walk that ring so the same request token keeps reaching the same backend while membership stays the same.
Most of the behavior people care about in a ring begins right here. Stable placement does not come from luck. It comes from feeding the same input into the same hash function, keeping the ring layout consistent across clients, and walking clockwise in the same way every time. Once those rules stay fixed, target selection becomes predictable enough to support sticky routing, cache locality, and shard-aware traffic without handing every request to a central chooser.
Request Hash Meets Ring Points
Everything starts with the request value you decide to hash. That value has to stay tied to the thing you want to keep on one backend. Session traffic usually hashes a session identifier. Tenant traffic usually hashes a tenant identifier. Cache traffic usually hashes the cache key itself. If the client hashes something that changes from one call to the next, placement changes with it, and the ring stops giving you repeatable target selection for that unit of traffic.
Each request hash is just a number until it is compared with the ring. Backends already have points placed around the circle, and the client looks for the first backend point at or after the request hash while moving clockwise. If the request hash lands near the end of the ring and no backend point appears after it, the client wraps back to the first point at the start of the ring. That wraparound step is part of the ring lookup, not a special second rule bolted on later.
Java’s NavigableMap is a handy fit for that lookup because it keeps points ordered. A request hash can be matched with ceilingEntry, and if that returns null, the client wraps to firstEntry:
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.Map;

public final class RingLookup {

    private final NavigableMap<Long, String> ring = new TreeMap<>();

    public RingLookup() {
        ring.put(102L, "backend-east-1");
        ring.put(340L, "backend-west-2");
        ring.put(615L, "backend-east-2");
        ring.put(910L, "backend-central-1");
    }

    public String pick(long requestPoint) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(requestPoint);
        if (entry != null) {
            return entry.getValue();
        }
        return ring.firstEntry().getValue();
    }
}

The lookup rule stays the same no matter how the request point was produced. What changes is the request value chosen before hashing. That part deserves care because it decides what kind of stickiness the client gets. Hashing a tenant ID keeps one tenant near one backend. Hashing a shopping cart ID keeps one cart near one backend. Hashing a per-request timestamp would make the ring behave almost randomly for that traffic, which defeats the purpose of ring-based selection in the first place.
Picking the hash input also affects how evenly request ownership spreads across the ring. If almost every request shares the same small set of identifiers, the ring may still be behaving exactly as written while traffic piles onto a narrow slice of backends. That is not a ring failure. That is a property of the request data. Good placement starts with a request token that has enough variety for the traffic you are routing.
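One way to see that effect is a small sketch that routes a skewed token stream and counts per-backend assignments. The backend names, virtual point counts, and traffic shape below are all made up for illustration; the lookup rule is the same ceiling-with-wraparound walk described above:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Sketch: routes 10,000 requests where 90% share one tenant token.
// The ring behaves correctly, yet one backend absorbs almost all traffic.
public final class TokenSkewDemo {

    static long hash32(String value) {
        CRC32 crc32 = new CRC32();
        crc32.update(value.getBytes(StandardCharsets.UTF_8));
        return crc32.getValue();
    }

    public static Map<String, Integer> routeCounts() {
        NavigableMap<Long, String> ring = new TreeMap<>();
        for (String backend : new String[] {"backend-a", "backend-b", "backend-c"}) {
            for (int i = 0; i < 50; i++) {
                ring.put(hash32(backend + "#" + i), backend);
            }
        }
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < 10_000; i++) {
            // Every tenth request carries a distinct tenant; the rest share one.
            String token = (i % 10 == 0) ? "tenant-" + i : "tenant-big";
            Map.Entry<Long, String> entry = ring.ceilingEntry(hash32(token));
            String winner = entry != null ? entry.getValue() : ring.firstEntry().getValue();
            counts.merge(winner, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(routeCounts());
    }
}
```

All 9,000 requests for the dominant tenant land on a single backend, so that backend's count dwarfs the others even though every lookup followed the rules exactly.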
This gets easier to understand with a small helper method that picks the token based on the kind of placement you want to preserve:
public final class RequestTokenSelector {

    public String selectToken(ClientRequest request) {
        if (request.sessionId() != null && !request.sessionId().isBlank()) {
            return request.sessionId();
        }
        if (request.tenantId() != null && !request.tenantId().isBlank()) {
            return request.tenantId();
        }
        return request.accountId();
    }
}

record ClientRequest(String sessionId, String tenantId, String accountId) {}

Notice what this code is really doing. It is deciding what should stay attached to one backend. That choice comes before any ring lookup happens, and it has a lasting effect on the traffic profile the client creates.
Placement on the ring itself also needs to be identical wherever the client logic runs. If one client builds backend points in a different order, hashes backend identifiers differently, or skips some points another client included, two clients can send the same request token to different backends. Ring hashing depends on shared rules, not on one lucky calculation. Same request token, same backend membership, same hashing steps, same ring walk. That is the chain that makes target selection repeatable.
Stable Placement Through Virtual Points
One ring point per backend sounds fine at first, but it usually produces rough ownership slices. Some backends end up with wide arcs, others with narrow ones, and traffic share drifts more than people expect. Virtual points fix that by placing each backend on the ring multiple times instead of just once. Each placement claims a small arc, and the combined arcs for that backend add up to its total share.
More points usually mean better balance across the circle, but that does not mean the ring becomes mathematically perfect. Hashing still spreads points based on the hash results, so some variation remains. What changes is the scale of that variation. With a thin ring, one unlucky placement can give a backend too much space. With a fuller ring, ownership is broken into smaller slices, which pulls the final share closer to what the client intended.
Weighted rings handle this by turning weight into repeated placements around the ring. If one backend is meant to carry about twice the share of another, it usually gets about twice as many virtual points. That means the client is not keeping a single backend entry and attaching extra traffic to it later. It is giving that backend more positions on the ring from the start, which gives it a larger share of ownership during lookup.
To see how that looks in code, a ring builder can place the same backend on the ring multiple times based on its weight:
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

public final class WeightedRingBuilder {

    private final NavigableMap<Long, String> ring = new TreeMap<>();

    public WeightedRingBuilder(Map<String, Integer> weights, int pointsPerWeightUnit) {
        for (Map.Entry<String, Integer> entry : weights.entrySet()) {
            String backendId = entry.getKey();
            int weight = entry.getValue();
            for (int i = 0; i < weight * pointsPerWeightUnit; i++) {
                long point = hash32(backendId + "#" + i);
                ring.put(point, backendId);
            }
        }
    }

    public NavigableMap<Long, String> ring() {
        return ring;
    }

    private long hash32(String value) {
        CRC32 crc32 = new CRC32();
        crc32.update(value.getBytes(StandardCharsets.UTF_8));
        return crc32.getValue();
    }
}

The backendId + "#" + i part is what creates a different hash input for the same backend. Without that suffix, repeated hashing of the same backend identifier would keep landing on the same point, which would defeat the whole point of virtual placement. Each numbered entry usually lands at a different location on the ring, though hash collisions can still map two entries to the same point.
Traffic ownership gets easier to follow when the ring is viewed as a long run of small ownership slices rather than a single slice for each backend. The request token hashes to one location, then the client moves clockwise until it reaches the first virtual point. That point belongs to a backend, so that backend gets the request. Distance to the backend’s next virtual point does not change that result. If the same request token comes back later, it hashes to the same location and reaches that same winner again.
To make that easier to see in code, here is an inspection helper that counts how virtual points are distributed across the ring:
import java.util.Map;
import java.util.HashMap;
import java.util.NavigableMap;

public final class RingOwnershipCounter {

    public Map<String, Integer> countPoints(NavigableMap<Long, String> ring) {
        Map<String, Integer> totals = new HashMap<>();
        for (String backendId : ring.values()) {
            totals.merge(backendId, 1, Integer::sum);
        }
        return totals;
    }
}

Counting ring entries does not tell you the exact request share by itself, but it does give a quick read on how virtual placement was distributed. If a backend was meant to carry twice the traffic of another backend, its point count should usually be about twice as large as well. From there, the actual arc lengths still depend on where those points landed after hashing.
Ring size affects how placement behaves in practice too. Sparse rings with very few virtual points can create large arcs, so small changes in point placement carry more weight. Fuller rings break ownership into smaller arcs, which keeps distribution closer to the intended shares. That is why virtual points are not just an extra tuning choice. They are part of what makes ring-based target selection usable at scale.
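Arc ownership can also be measured directly rather than inferred from point counts. The sketch below, written against a 32-bit ring such as one built from CRC32 points, sums the arc length each backend owns: a point at key k owns the arc from the previous point up to k, with the first point wrapping around past the end of the ring.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;

// Sketch: sums the arc length each backend owns on a 32-bit ring.
// A point at key k owns tokens hashing into (previousKey, k], so its arc
// length is k minus the previous key, with wraparound for the first point.
public final class ArcShareCalculator {

    private static final long RING_SIZE = 1L << 32; // CRC32 yields 32-bit points

    public static Map<String, Long> arcLengths(NavigableMap<Long, String> ring) {
        Map<String, Long> totals = new HashMap<>();
        if (ring.isEmpty()) {
            return totals;
        }
        if (ring.size() == 1) {
            // A lone point owns the entire circle.
            totals.put(ring.firstEntry().getValue(), RING_SIZE);
            return totals;
        }
        Long previous = ring.lastKey(); // first point's arc wraps around the top
        for (Map.Entry<Long, String> entry : ring.entrySet()) {
            long arc = Math.floorMod(entry.getKey() - previous, RING_SIZE);
            totals.merge(entry.getValue(), arc, Long::sum);
            previous = entry.getKey();
        }
        return totals;
    }
}
```

Dividing each total by the ring size gives the fractional share a backend would receive under uniformly distributed request hashes, which is the number to compare against the intended weights.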
Stable placement through virtual points also depends on backend identity staying consistent. If one refresh names a backend orders-7 and the next refresh names that same backend 10.1.4.23, the client hashes two different identifiers and places them at different positions. Membership may not have changed in any meaningful way, yet the ring would still be rearranged. Consistent backend identity is directly tied to consistent point placement.
Also, virtual points do not change the basic lookup rule. The request still hashes to one location, and the client still walks clockwise. What changes is the texture of the ring. Instead of a small number of wide ownership regions, the ring becomes a larger field of smaller regions that map back to backend identities in a more balanced way. That is what gives ring hashing its stable feel while keeping target selection repeatable enough for sticky traffic.
Failure Handling With Discovery
Routing on a ring gets more interesting the moment backend health starts changing. The request token still hashes to the same location, yet the client can no longer treat every backend on the ring as eligible. Discovery data and local health feedback are the two inputs that guide that decision. Discovery supplies a shared view of active membership, while local feedback lets the client react right after a connection or request fails. That split keeps the flow readable. Ring lookup still decides who owns traffic for a given token. Discovery and health status decide who is allowed to participate in that lookup at the current moment. From there, failover behavior, remap size, and refresh timing all follow from that same idea.
Failover After a Dead Node
Failover on a hash ring behaves very differently from modulo-based routing. With modulo hashing, a membership change can reshuffle a large share of placements because the divisor changes. On a ring, the token still hashes to the same location, and the client still moves clockwise from that point. What changed is the set of live backends. If a backend is gone, the client keeps moving until it reaches the next backend that is still eligible to receive traffic.
That part is what makes ring failover fairly intuitive once the lookup rule is already familiar. The token does not need to be reinterpreted during failure. No replacement hash is computed. No separate failover map is required. Traffic that used to stop at the dead backend now continues forward to the next live backend, while traffic that belonged elsewhere keeps the same destination.
Client implementations usually handle that in two broad ways. Some clients rebuild the active ring after discovery marks a backend unhealthy, then run lookup against that smaller ring. Others keep the full ring in memory for a short period and mark a failing backend as temporarily ineligible right after local errors such as connection refusal, TLS handshake failure, or repeated short timeouts. In both cases, the client still follows the same clockwise walk. The only difference is which source marked the backend unavailable.
This selector keeps that behavior visible in code:
import java.time.Instant;
import java.util.Map;
import java.util.NavigableMap;

public final class FailoverSelector {

    private final NavigableMap<Long, String> ring;
    private final Map<String, Instant> locallyBlockedUntil;

    public FailoverSelector(NavigableMap<Long, String> ring, Map<String, Instant> locallyBlockedUntil) {
        this.ring = ring;
        this.locallyBlockedUntil = locallyBlockedUntil;
    }

    public String pick(long requestPoint) {
        for (String backend : ring.tailMap(requestPoint, true).values()) {
            if (isUsable(backend)) {
                return backend;
            }
        }
        for (String backend : ring.headMap(requestPoint, false).values()) {
            if (isUsable(backend)) {
                return backend;
            }
        }
        throw new IllegalStateException("No live backend available");
    }

    private boolean isUsable(String backend) {
        Instant blockedUntil = locallyBlockedUntil.get(backend);
        return blockedUntil == null || Instant.now().isAfter(blockedUntil);
    }
}

Nothing about the ring lookup changed in that class. The client still starts from the token’s ring position and still scans clockwise with wraparound. Local blocking only changes which backend can stop that scan.
Timing is the part that makes this worth calling out. Discovery data is never perfectly instantaneous. A backend can fail after the last refresh and still remain present in the client’s current membership view for a short period. During that gap, local feedback lets the client stop routing traffic to a backend that has already started failing. Later, discovery catches up and removes that backend from the shared active set.
That local suppression window belongs to client policy rather than to the ring itself. Short windows let a backend return to traffic sooner after a brief network issue. Longer windows reduce repeated failures to the same destination, though they also keep a recovered backend out of local rotation for more time. The ring defines the walk and health policy defines which stops on that walk are currently acceptable.
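The effect of that suppression is easiest to see in a stripped-down sketch. The version below replaces timestamped windows with a plain blocked set so the before-and-after behavior stands out; the ring points and backend names are invented for illustration:

```java
import java.util.HashSet;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

// Sketch: a clockwise walk that skips backends in a local block set.
// Blocking only redirects tokens owned by the blocked backend; every
// other token keeps its original destination.
public final class BlockAwarePick {

    public static String pick(NavigableMap<Long, String> ring, Set<String> blocked, long point) {
        for (String backend : ring.tailMap(point, true).values()) {
            if (!blocked.contains(backend)) {
                return backend;
            }
        }
        for (String backend : ring.headMap(point, false).values()) {
            if (!blocked.contains(backend)) {
                return backend;
            }
        }
        throw new IllegalStateException("No live backend available");
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> ring = new TreeMap<>();
        ring.put(100L, "backend-a");
        ring.put(200L, "backend-b");
        ring.put(300L, "backend-c");
        Set<String> blocked = new HashSet<>();

        System.out.println(pick(ring, blocked, 150L)); // backend-b owns this token
        blocked.add("backend-b");                      // local quarantine kicks in
        System.out.println(pick(ring, blocked, 150L)); // walk continues to backend-c
        blocked.remove("backend-b");                   // window expires
        System.out.println(pick(ring, blocked, 150L)); // ownership returns to backend-b
    }
}
```

Note how unblocking restores the original owner without any remapping step: ownership was never transferred, the walk simply stopped one position later while the block was active.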
How Much Traffic Moves
Consistent hashing reduces remapping, but it does not remove it. If one backend disappears, the tokens that used to belong to that backend move forward to the next live backend on the ring. Tokens that mapped to other backends stay where they were. That is the practical meaning of minimal remapping in ring-based routing.
For equal-share backends on a ring with decent partitioning, removing one backend from a set of N usually moves about 1/N of the tokens. Ten equal backends means roughly one tenth of the traffic moves after one loss. With a hundred equal backends, roughly one hundredth of the traffic moves. That estimate depends on the ring being partitioned well enough that backend ownership is close to the intended shares.
Smaller rings can drift from that target. If virtual points are sparse or bunched unevenly, one backend can own more arc length than intended, which means its failure moves more traffic than the equal-share estimate would suggest. That is why virtual point count and ring size have such a strong effect on observed remap size. They do not alter the clockwise lookup rule, but they do affect how evenly ownership is spread before anything fails.
Weighted backends follow the same logic. If a backend owns a larger share of the ring, losing that backend moves a larger share of traffic. No special outage formula is needed for that case. The amount of traffic that moves follows directly from how much ring ownership that backend had before it dropped out.
The same idea appears during scale-out. When a new backend joins, it takes over part of the ring from the current backends rather than forcing a full reshuffle. That is why ring-based routing is so useful for sticky traffic. Session tokens, tenant tokens, and shard tokens keep their placement unless the slice of the ring tied to that token changes ownership.
Remapping is easiest to follow when viewed as an ownership transfer around the ring. One backend loses its region, and neighboring live regions absorb it clockwise. The client is not rebalancing every token across the full backend set after a failure. It is handing off the failed backend’s portion to the next reachable owners on the circle.
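That 1/N estimate can be checked empirically. The sketch below builds a ring twice, once with and once without one backend, routes the same token set against both, and reports the fraction of tokens whose destination changed. Backend names, point counts, and the token set are illustrative:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Sketch: measures what fraction of tokens move when one backend drops out.
// Only tokens whose ceiling point belonged to the removed backend change
// owners; every other point, and therefore every other token, stays put.
public final class RemapMeasure {

    static long hash32(String value) {
        CRC32 crc32 = new CRC32();
        crc32.update(value.getBytes(StandardCharsets.UTF_8));
        return crc32.getValue();
    }

    static NavigableMap<Long, String> buildRing(List<String> backends, int pointsEach) {
        NavigableMap<Long, String> ring = new TreeMap<>();
        for (String backend : backends) {
            for (int i = 0; i < pointsEach; i++) {
                ring.put(hash32(backend + "#" + i), backend);
            }
        }
        return ring;
    }

    static String pick(NavigableMap<Long, String> ring, long point) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(point);
        return entry != null ? entry.getValue() : ring.firstEntry().getValue();
    }

    public static double movedFraction(int backendCount, int pointsEach, int tokenCount) {
        List<String> backends = new ArrayList<>();
        for (int i = 0; i < backendCount; i++) {
            backends.add("backend-" + i);
        }
        NavigableMap<Long, String> before = buildRing(backends, pointsEach);
        // Same ring minus the first backend's points.
        NavigableMap<Long, String> after = buildRing(backends.subList(1, backends.size()), pointsEach);
        int moved = 0;
        for (int t = 0; t < tokenCount; t++) {
            long point = hash32("token-" + t);
            if (!pick(before, point).equals(pick(after, point))) {
                moved++;
            }
        }
        return (double) moved / tokenCount;
    }

    public static void main(String[] args) {
        // With ten equal backends, expect roughly a tenth of tokens to move.
        System.out.printf("moved fraction: %.3f%n", movedFraction(10, 200, 10_000));
    }
}
```

The measured fraction tracks the removed backend's arc share, so with ten equal-weight backends and a reasonably full ring it hovers near 0.1 rather than landing on it exactly.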
Discovery Feeds the Active Set
Discovery answers a different question from the ring. The ring decides where traffic should go among the current candidates. Discovery decides which candidates belong in that active set at all. Keeping those two jobs separate makes the full flow much easier to reason about.
Kubernetes is a good example of that split. Backend membership for a Service is tracked through EndpointSlice objects. Clients that read those active endpoints can rebuild their ring from the current healthy membership instead of relying on a stale address list. The ring lookup itself does not change. What changes is who gets placed on the ring before lookup starts.
Consul follows the same overall model through a different API surface. Services register there, health checks update their status, and clients can read the healthy service view to populate the active backend set. The ring does not perform those health checks by itself. It consumes the active membership that discovery and health evaluation already produced.
This adapter keeps that handoff visible:
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public final class ActiveRingFactory {

    public WeightedRingBuilder fromDiscovery(List<ServiceInstance> instances, int pointsPerWeightUnit) {
        Map<String, Integer> weights = new LinkedHashMap<>();
        for (ServiceInstance instance : instances) {
            if (instance.healthy()) {
                weights.put(instance.instanceId(), instance.weight());
            }
        }
        return new WeightedRingBuilder(weights, pointsPerWeightUnit);
    }
}

record ServiceInstance(String instanceId, boolean healthy, int weight) {}

That class does not perform discovery on its own. Its job is narrower. It takes membership data that discovery already returned, keeps only the healthy entries, and turns that active set into ring membership.
Clients that skip discovery can still build a ring, but membership then becomes fixed or manually maintained. That gets fragile in environments where instances appear and disappear, addresses change, or health status changes faster than configuration files do. Discovery without ring-based placement has the opposite weakness. The client knows who is active, but it still needs a repeatable rule for choosing one destination for a given token. Put those two pieces side by side, and the client gets live membership with stable placement.
Backend identity also needs to stay consistent across refreshes. If discovery reports the same backend under different identifiers from one update to the next, the client hashes different backend names and places them at different ring positions. Membership may be effectively the same, yet placement will still move because the ring sees those identifiers as different entries.
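A tiny sketch makes the identity problem concrete. Hashing the same virtual point suffix under two names that discovery might report for one physical backend produces unrelated ring positions; the identifiers below are illustrative:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Sketch: one physical backend reported under two different identifiers
// hashes to unrelated ring positions, so the ring sees two distinct entries.
public final class IdentityDrift {

    public static long hash32(String value) {
        CRC32 crc32 = new CRC32();
        crc32.update(value.getBytes(StandardCharsets.UTF_8));
        return crc32.getValue();
    }

    public static void main(String[] args) {
        // One refresh names the backend by service slot, the next by address.
        System.out.println(hash32("orders-7#0"));
        System.out.println(hash32("10.1.4.23#0"));
        // Different points: placement moves even though nothing really changed.
    }
}
```

Pinning ring identity to a stable field, such as an instance ID that survives address changes, avoids that silent rearrangement.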
Refresh Windows Plus Local Health
Refresh timing decides how quickly discovery changes alter the ring, while local health decides what the client does between those refreshes. Those inputs are related, but they are not the same thing.
Shared discovery data gives the client a broader membership view. Local health gives the client the earliest sign that something just failed in front of it. Healthy routing flow uses both. Discovery is useful for giving the fleet a common view of membership. Local failure feedback is useful right after an outbound call starts failing.
Let’s see how that looks with a small tracker that keeps a short local quarantine window for failing backends:
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public final class LocalHealthTracker {

    private final ConcurrentMap<String, Instant> blockedUntil = new ConcurrentHashMap<>();
    private final Duration blockTime = Duration.ofSeconds(20);

    public void markFailure(String backendId) {
        blockedUntil.put(backendId, Instant.now().plus(blockTime));
    }

    public void markSuccess(String backendId) {
        blockedUntil.remove(backendId);
    }

    public boolean isBlocked(String backendId) {
        Instant until = blockedUntil.get(backendId);
        return until != null && Instant.now().isBefore(until);
    }
}

Local quarantine does not replace discovery membership. It covers the gap between refreshes. If a backend starts failing right now, the client does not have to wait for the next registry update before reacting.
Refresh rate also affects stickiness and traffic churn. Long refresh windows can leave dead nodes on the ring longer than they should stay there, which leads to repeated local failovers before discovery catches up. Extremely short refresh windows can rebuild the ring too aggressively and move traffic more frequently than needed. The best interval depends on how quickly backend state changes, how fast the discovery source updates health, and how expensive extra remaps are for the traffic being routed.
Not every discovery update needs a rebuild either. If fresh discovery data arrives without any change to active membership, rebuilding the ring serves little purpose and still adds churn inside the client.
This small snapshot check helps keep rebuilds tied to actual membership changes:
import java.util.List;
import java.util.Objects;

public final class MembershipSnapshot {

    private final List<String> activeBackendIds;

    public MembershipSnapshot(List<String> activeBackendIds) {
        this.activeBackendIds = activeBackendIds.stream()
                .sorted()
                .toList();
    }

    public boolean sameMembership(MembershipSnapshot other) {
        return Objects.equals(this.activeBackendIds, other.activeBackendIds);
    }

    public List<String> activeBackendIds() {
        return activeBackendIds;
    }
}

Two health views are active through this whole flow. One comes from discovery and reflects the broader shared membership picture. The other comes from the client’s own request outcomes. Discovery can still report a backend as present while the client has already seen repeated failures to that same backend. Local suppression handles that near-term gap. After discovery removes the backend from the active set, the rebuilt ring no longer includes it as a candidate. That sequence fits naturally with ring routing. Local health reacts first, and discovery membership turns that temporary decision into shared ring membership on the next rebuild.
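One way to wire the rebuild-on-change rule into a refresh loop is sketched below. The discovery read is stubbed out as a method parameter, and the rebuild itself is reduced to a counter so the gating logic stays visible; a real client would read from EndpointSlice or Consul and rebuild the actual ring at that point:

```java
import java.util.List;

// Sketch: a refresh step that rebuilds only when the sorted active
// membership actually changed. The discovery input is a stand-in.
public final class RefreshStep {

    private List<String> lastMembership = List.of();
    private int rebuilds = 0;

    public void onDiscoveryUpdate(List<String> activeBackendIds) {
        List<String> sorted = activeBackendIds.stream().sorted().toList();
        if (sorted.equals(lastMembership)) {
            return; // same active set, keep the current ring untouched
        }
        lastMembership = sorted;
        rebuilds++; // stand-in for rebuilding the ring from the new set
    }

    public int rebuilds() {
        return rebuilds;
    }
}
```

Sorting before comparison means a registry that returns the same members in a different order does not trigger a rebuild, which keeps ring churn tied to real membership changes rather than response ordering.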
Conclusion
Ring-based client-side load balancing comes down to a repeatable routing flow. The client hashes a stable request value, moves clockwise to the matching backend point, and keeps sending that traffic to the same place while membership stays unchanged. Virtual points spread ownership across the ring more evenly, service discovery keeps the active backend set current, and local health checks let the client step past failing nodes before the next refresh arrives. When a backend drops out or a new one joins, only the affected slice of ring ownership is reassigned, which is why this model fits sticky routing, shard-aware traffic, and cache-heavy traffic so well.