Duplicate submits can damage API data faster than people expect. Someone taps Pay again after a timeout, a mobile client retries after the network drops, or a browser refresh sends the same form submission back to the server. Without replay protection, that same request can create two orders, charge the same card twice, or store the same record more than one time. HTTP already treats GET, HEAD, PUT, DELETE, and OPTIONS as idempotent, but POST and PATCH do not get that behavior by default. An Idempotency-Key header gives the server a way to treat repeated POST or PATCH requests as the same logical operation instead of brand-new submissions. The client sends a fresh value for a new create or update request, then sends that value again only when retrying that exact request. From there, the server keeps enough request data to tell a first arrival from a replay, then either runs the operation, returns the stored response, or rejects the request if the payload no longer matches.
How Idempotency Fits an API
At the API boundary, idempotency is really a request identity problem. We are giving a write operation a stable identifier that survives retries, refreshes, and connection loss so the server can map repeated arrivals back to the same business action. That identifier cannot live as a header value by itself. It needs caller scope, endpoint scope, a fingerprint of the business payload, and stored state that lets us distinguish a first submission, a replay of a completed request, and reuse of the same idempotency value with different input.
From there, the contract becomes mechanical. A client creates a new Idempotency-Key for a new POST or PATCH, keeps it unchanged for retries of that exact request, and stops reusing it after that operation is no longer being retried. The server stores that scoped request identity with the payload fingerprint, processing state, response data, and expiration time. With that record in place, duplicate traffic stops being a judgment call. We can return the stored result for a matching retry, reject a payload collision, or report that the first request is still in flight.
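That scoped identity can be sketched as a small value object. This is an illustrative sketch only; the names here are hypothetical and the lookup map stands in for the durable storage discussed later:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: a scoped request identity combining caller scope,
// endpoint scope, and the Idempotency-Key header value. A record gives us
// value equality, so two arrivals resolve to the same logical action only
// when all three parts match.
record RequestIdentity(String clientId, String operation, String idempotencyKey) {}

class IdentityDemo {
    public static void main(String[] args) {
        Map<RequestIdentity, String> store = new ConcurrentHashMap<>();
        RequestIdentity first = new RequestIdentity("merchant_1842", "POST /payments", "pay-123");
        RequestIdentity retry = new RequestIdentity("merchant_1842", "POST /payments", "pay-123");
        store.put(first, "stored response");
        // The retry resolves to the same entry because the whole scope matches.
        System.out.println(store.get(retry)); // prints: stored response
    }
}
```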
The Request That Should Happen One Time
Retries sit at the center of idempotency. We can send a payment request, have the server finish it, and still lose the response because the connection breaks at the wrong moment. From the client side, a lost response and a failed write can look exactly the same. If we retry with the same Idempotency-Key, the API can tie that second arrival back to the first logical action rather than treating it as a second payment.
The first submission can look like this:
POST /payments HTTP/1.1
Content-Type: application/json
Idempotency-Key: pay-7f2b91e4-9b61-4d7c-a9f8-8c5a41f20f11
{
"accountId": "acct_9031",
"amount": 125.00,
"currency": "USD"
}
That request is meant to create a single payment. If the client never receives the response, the retry must carry the same header value and the same business payload. A fresh header value tells the server this is a fresh action. Reusing the original value tells the server we are retrying the same action.
Retry traffic for that request would look like this:
POST /payments HTTP/1.1
Content-Type: application/json
Idempotency-Key: pay-7f2b91e4-9b61-4d7c-a9f8-8c5a41f20f11
{
"accountId": "acct_9031",
"amount": 125.00,
"currency": "USD"
}
What makes this useful is not the repeated transport itself. Networks fail, clients retry, and users click submit again when confirmation does not appear. Idempotency gives the API a stable way to say that both arrivals belong to the same intended write. The server does not need to trust timing, client memory, or user patience. It can rely on the repeated request identity instead.
Client behavior still needs boundaries. We should create a new Idempotency-Key value for every new write request, then keep that value attached only to retries of that same request. Reusing it for a different business action creates a collision. Sending a new value during a retry removes the link back to the original write. Both mistakes turn a replay guard into dead weight, because the server can only make a sound decision when the request identity stays stable across legitimate retries.
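Those client-side rules can be sketched in a few lines. This is a hypothetical helper, not a real client library: the `send` predicate stands in for the actual HTTP call and reports whether a response arrived, so the point is only that the key is generated once per logical write and held fixed across retries:

```java
import java.util.UUID;
import java.util.function.BiPredicate;

// Hypothetical client-side retry helper: one Idempotency-Key per logical
// write, reused unchanged on every retry of that same write.
class RetryingClient {
    // send.test(key, payload) stands in for the real HTTP call; it returns
    // true when a response was received and false when the attempt was lost.
    static String submitOnce(String payload, BiPredicate<String, String> send, int maxAttempts) {
        String key = "pay-" + UUID.randomUUID(); // fresh key for this logical action only
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (send.test(key, payload)) {
                return key; // the same key was attached to every attempt
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }
}
```

Generating the key inside the helper, rather than per attempt, is the whole guarantee: a new key on retry would sever the link back to the original write.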
The Record the Server Keeps
Stored replay data is what turns that header into actual behavior. We need enough saved information to answer a narrow question with confidence: has this caller already sent this same logical action within the active replay window? To answer that, the server usually stores the caller scope, the operation or endpoint identity, the Idempotency-Key value, a fingerprint of the request payload, a processing state such as PENDING or COMPLETED, the response status, the saved response or a reference to it, and an expiration timestamp.
Stored replay data can look like this:
{
"clientId": "merchant_1842",
"operation": "POST /payments",
"idempotencyKey": "pay-7f2b91e4-9b61-4d7c-a9f8-8c5a41f20f11",
"fingerprint": "sha256:8c13c8f77d8d5c8b2d4dcb76f19d8c8f9b7b2f2a9a0b5f2f76b61d0b5d33f6aa",
"state": "COMPLETED",
"responseStatus": 201,
"responseBody": {
"paymentId": "pmt_50192",
"status": "APPROVED"
},
"expiresAt": "2026-05-13T19:40:00Z"
}
Fingerprinting keeps the replay check honest. If the same caller sends the same Idempotency-Key again with the same payload fingerprint, we can treat that request as a replay and return the stored result. If that same header value comes back with different business data, the server should reject it instead of quietly tying two different actions to the same request identity. That keeps the header from turning into a blank permission slip for any later payload.
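One way to compute that fingerprint is a SHA-256 digest over the payload. This sketch assumes the payload has already been canonicalized into a stable string by some earlier step (field order, whitespace, and number formatting normalized), since hashing raw bytes would treat harmless formatting differences as payload collisions:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Sketch: SHA-256 fingerprint over a canonicalized payload string, in the
// "sha256:<hex>" shape used by the stored record above.
class PayloadFingerprint {
    static String of(String canonicalPayload) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(canonicalPayload.getBytes(StandardCharsets.UTF_8));
            return "sha256:" + HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is guaranteed on every JVM, so this is effectively unreachable.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

The `sha256:` prefix plus 64 hex characters is also why the entity column length of 71 fits exactly.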
Processing state fills in the rest of the story. PENDING means the first request claimed that request identity but has not finished yet. COMPLETED means the write already finished and the response can be replayed. That split lets the API tell the difference between a retry that arrived too early and a retry that arrived after the original write was done. Without saved state, every duplicate arrival would force the server to infer what stage the first request had reached.
Response status and expiration time belong in the same record for practical reasons. Stored status lets the server replay the same outcome rather than inventing a new answer on the second trip through. The expiration timestamp defines how long that protection stays active. If the replay window is too short, a late retry can slip through as a fresh write. If the window is much longer than the business need, replay rows can hang around long after they stop helping. Publishing that retention window gives callers a stable rule for how long a retry can still map back to the original request.
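The window rule itself is small enough to state as code. A minimal sketch, assuming the expiration timestamp is derived from the claim time plus the published retention window (names here are illustrative):

```java
import java.time.Duration;
import java.time.Instant;

// Sketch of the replay window: a retry maps back to the original request
// only while the stored expiration timestamp is still in the future.
class ReplayWindow {
    static Instant expiryFor(Instant claimedAt, Duration retention) {
        return claimedAt.plus(retention);
    }

    static boolean isReplayable(Instant expiresAt, Instant now) {
        return now.isBefore(expiresAt);
    }
}
```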
Status codes fit naturally into that stored-record context too. Missing required idempotency input usually maps to 400 Bad Request. Matching retry traffic that arrives while the first request is still running fits 409 Conflict. Reuse of the same Idempotency-Key value with a different payload fingerprint fits 422 Unprocessable Content. Those responses let us separate missing input, in-flight replay, and payload collision into distinct API outcomes instead of collapsing everything into a vague failure.
The Spring Boot Request Flow
Within a Spring Boot API, idempotency becomes a sequence we place at specific points in request handling. We read the header before any write runs, claim the request identity before the business mutation goes through, replay the stored outcome for matching retries, and retire expired rows after their retention window closes. Spring MVC gives us the hooks for that flow through servlet filters, controller methods, transactional services, structured error responses, and scheduled jobs.
Reading the Header Early
Near the front of request handling, we want to inspect Idempotency-Key before controller code changes anything. If only a couple of endpoints need idempotency, pulling the header in a controller method can be enough. Once that rule applies across a broader set of POST and PATCH handlers, a servlet filter gives us a single place to validate the header, attach request metadata, and stop bad traffic before domain logic runs.
For that early step, OncePerRequestFilter fits well because it runs at the servlet layer before the request reaches controller code. That gives us a stable spot to reject missing headers, keep the validated value on the request, and leave the business method free to focus on the mutation itself. If later code needs the same value, pulling it from a request attribute is cleaner than rereading the raw header at every layer.
This is a filter that checks the header and stores it for later use:
package com.example.idempotency;
import java.io.IOException;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.http.HttpMethod;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;
@Component
public class IdempotencyHeaderFilter extends OncePerRequestFilter {
public static final String ATTR = IdempotencyHeaderFilter.class.getName() + ".value";
@Override
protected boolean shouldNotFilter(HttpServletRequest request) {
String method = request.getMethod();
return !(HttpMethod.POST.matches(method) || HttpMethod.PATCH.matches(method));
}
@Override
protected void doFilterInternal(
HttpServletRequest request,
HttpServletResponse response,
FilterChain filterChain) throws ServletException, IOException {
String value = request.getHeader("Idempotency-Key");
if (value == null || value.isBlank()) {
response.sendError(HttpServletResponse.SC_BAD_REQUEST, "Missing Idempotency-Key");
return;
}
request.setAttribute(ATTR, value);
filterChain.doFilter(request, response);
}
}
That small handoff pays off later. We validate the header once, then let downstream code read a trusted request attribute instead of repeating the same check in the controller, service, and repository layers. If we also need a request fingerprint from the body, this early stage is where we usually wrap the request so the body can still be read later by Spring MVC after fingerprint logic touches it.
The filter can reject a missing Idempotency-Key before the request reaches controller code. If the API needs the same ProblemDetail body for that early rejection, the filter should write that response directly or delegate to Spring’s exception handling instead of relying on controller advice alone.
Writing the First Claim
Before the business insert, update, or outside side effect runs, we need to claim the request identity in durable storage. That claim is what stops two closely timed copies of the same request from both moving forward as if they were first. In practice, we store caller scope, endpoint scope, Idempotency-Key, payload fingerprint, processing state, saved status, saved response, and expiration time in the same row. A relational database fits naturally here because idempotency for write endpoints is tied directly to write consistency.
This entity and repository keep that replay record in one place:
package com.example.idempotency;
import java.time.Instant;
import java.util.Optional;
import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.EnumType;
import jakarta.persistence.Enumerated;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.GenerationType;
import jakarta.persistence.Id;
import jakarta.persistence.Lob;
import jakarta.persistence.Table;
import jakarta.persistence.UniqueConstraint;
import org.springframework.data.jpa.repository.JpaRepository;
@Entity
@Table(
name = "api_idempotency",
uniqueConstraints = @UniqueConstraint(
columnNames = {"client_id", "operation_name", "idempotency_key"}))
public class IdempotencyRecord {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
@Column(name = "client_id", nullable = false)
private String clientId;
@Column(name = "operation_name", nullable = false)
private String operationName;
@Column(name = "idempotency_key", nullable = false, length = 100)
private String idempotencyKey;
@Column(nullable = false, length = 71)
private String fingerprint;
@Enumerated(EnumType.STRING)
@Column(nullable = false, length = 16)
private RecordState state;
@Column(name = "response_status")
private Integer responseStatus;
@Lob
@Column(name = "response_body")
private String responseBody;
@Column(name = "expires_at", nullable = false)
private Instant expiresAt;
protected IdempotencyRecord() {
}
public static IdempotencyRecord pending(
String clientId,
String operationName,
String idempotencyKey,
String fingerprint,
Instant expiresAt) {
IdempotencyRecord record = new IdempotencyRecord();
record.clientId = clientId;
record.operationName = operationName;
record.idempotencyKey = idempotencyKey;
record.fingerprint = fingerprint;
record.state = RecordState.PENDING;
record.expiresAt = expiresAt;
return record;
}
public Long getId() {
return id;
}
public String getFingerprint() {
return fingerprint;
}
public RecordState getState() {
return state;
}
public Integer getResponseStatus() {
return responseStatus;
}
public String getResponseBody() {
return responseBody;
}
public void markCompleted(int responseStatus, String responseBody) {
this.state = RecordState.COMPLETED;
this.responseStatus = responseStatus;
this.responseBody = responseBody;
}
}
enum RecordState {
PENDING,
COMPLETED
}
interface IdempotencyRecordRepository extends JpaRepository<IdempotencyRecord, Long> {
Optional<IdempotencyRecord> findByClientIdAndOperationNameAndIdempotencyKey(
String clientId,
String operationName,
String idempotencyKey);
void deleteByStateAndExpiresAtBefore(RecordState state, Instant expiresAt);
}
The uniqueness rule is what breaks the race: if two copies of the same request arrive within a tiny timing gap, only the first insert can claim that row. The second thread can then reload the stored record and inspect its fingerprint and state instead of running the write a second time. Without that claim row, both threads could pass header validation and still create duplicate state.
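The same claim semantics can be seen in miniature with an in-memory map, where `putIfAbsent` plays the role of the unique insert. This is purely illustrative; the database constraint, not a map, is what the design above actually relies on:

```java
import java.util.concurrent.ConcurrentMap;

// In-memory analogue of the unique-constraint claim: only the first arrival
// installs a PENDING entry; later arrivals see the existing one instead.
class ClaimDemo {
    enum State { PENDING, COMPLETED }

    // Returns true only for the thread that won the claim. scopedKey stands
    // in for the (clientId, operation, idempotencyKey) composite.
    static boolean tryClaim(ConcurrentMap<String, State> store, String scopedKey) {
        return store.putIfAbsent(scopedKey, State.PENDING) == null;
    }
}
```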
We usually call this claim helper before the guarded business mutation runs. The claim insert is kept separate here so a failed unique insert can roll back before the code reloads the existing idempotency record.
This service shows that first claim step:
package com.example.idempotency;
import java.time.Duration;
import java.time.Instant;
import org.springframework.dao.DataIntegrityViolationException;
import org.springframework.stereotype.Service;
@Service
public class IdempotencyClaimService {
private final IdempotencyRecordRepository repository;
public IdempotencyClaimService(IdempotencyRecordRepository repository) {
this.repository = repository;
}
public IdempotencyRecord claim(
String clientId,
String operationName,
String idempotencyKey,
String fingerprint) {
try {
Instant expiresAt = Instant.now().plus(Duration.ofHours(24));
return repository.saveAndFlush(IdempotencyRecord.pending(
clientId, operationName, idempotencyKey, fingerprint, expiresAt));
} catch (DataIntegrityViolationException ex) {
return repository.findByClientIdAndOperationNameAndIdempotencyKey(
clientId, operationName, idempotencyKey).orElseThrow();
}
}
}
From there, the replay row becomes the gate in front of the write. The thread that claims the row first continues through the mutation. Later arrivals with that same scoped request identity have to read the stored row and react to it instead of trying the mutation again.
Returning the Saved Result
After the claim step, the request flow narrows into a small set of branches. Missing header data maps to a client error. Matching fingerprint with a PENDING row means the first request is still in flight. Matching fingerprint with a COMPLETED row means we replay the stored result. The same Idempotency-Key value with a different fingerprint means the caller reused that request identity for different input and should get a rejection. Keeping those branches explicit makes the endpoint behavior predictable when retries arrive under pressure.
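Those branches can be kept explicit in a small decision function. This is a plain-Java sketch of the control flow, deliberately separate from the Spring wiring; the enum names are illustrative:

```java
// Plain-Java sketch of the retry branches after the claim step.
class ReplayDecision {
    enum State { PENDING, COMPLETED }
    enum Outcome { RUN_WRITE, REPLAY_STORED, IN_FLIGHT_CONFLICT, FINGERPRINT_MISMATCH }

    // existingState is null when no row has claimed this request identity yet.
    static Outcome decide(State existingState, String storedFingerprint, String requestFingerprint) {
        if (existingState == null) {
            return Outcome.RUN_WRITE; // first arrival: claim the row and run the mutation
        }
        if (!storedFingerprint.equals(requestFingerprint)) {
            return Outcome.FINGERPRINT_MISMATCH; // 422: key reused for different input
        }
        return existingState == State.PENDING
                ? Outcome.IN_FLIGHT_CONFLICT // 409: the first request is still running
                : Outcome.REPLAY_STORED;     // replay the stored status and body
    }
}
```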
Spring MVC’s built-in problem response types fit this part of the flow well. We can centralize idempotency failures in one advice class instead of building those responses inline across multiple controllers.
This advice class keeps those responses in one place:
package com.example.idempotency;
import org.springframework.http.HttpStatus;
import org.springframework.http.ProblemDetail;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;
@RestControllerAdvice
public class IdempotencyErrorAdvice {
@ExceptionHandler(MissingIdempotencyHeaderException.class)
ProblemDetail missingHeader(MissingIdempotencyHeaderException ex) {
return ProblemDetail.forStatusAndDetail(
HttpStatus.BAD_REQUEST,
"Idempotency-Key is required for this endpoint.");
}
@ExceptionHandler(IdempotencyRequestInFlightException.class)
ProblemDetail requestInFlight(IdempotencyRequestInFlightException ex) {
return ProblemDetail.forStatusAndDetail(
HttpStatus.CONFLICT,
"A matching request is still being processed.");
}
@ExceptionHandler(IdempotencyFingerprintMismatchException.class)
ProblemDetail fingerprintMismatch(IdempotencyFingerprintMismatchException ex) {
return ProblemDetail.forStatusAndDetail(
HttpStatus.UNPROCESSABLE_ENTITY,
"The idempotency value was reused with a different payload.");
}
}
class MissingIdempotencyHeaderException extends RuntimeException {
}
class IdempotencyRequestInFlightException extends RuntimeException {
}
class IdempotencyFingerprintMismatchException extends RuntimeException {
}
Stored status and response content are part of replay, not side details. If the first request returned 201 Created with a body that contains the new resource identifier, then a matching retry should receive that same business outcome rather than a freshly invented response. That is why we usually save both the status and the response body, or at least save a stable reference that lets us rebuild that outward result without applying the write again.
Clearing Old Records
Replay rows only help inside the retention window we publish. After that window closes, expired rows stop contributing to replay protection and start turning into historical baggage. Spring’s scheduling support gives us a built-in way to retire them with @Scheduled after scheduling is turned on with @EnableScheduling in a configuration class or the Spring Boot application class. That fits well for background cleanup of expired entries.
Let’s see a cleanup job that removes completed rows whose expiration time has passed:
package com.example.idempotency;
import java.time.Instant;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;
@Component
public class IdempotencyCleanupJob {
private final IdempotencyRecordRepository repository;
public IdempotencyCleanupJob(IdempotencyRecordRepository repository) {
this.repository = repository;
}
@Transactional
@Scheduled(fixedDelay = 300_000)
public void purgeExpiredRecords() {
repository.deleteByStateAndExpiresAtBefore(
RecordState.COMPLETED,
Instant.now());
}
}
Completed rows are the safe first target for cleanup. PENDING rows need more care, because deleting them too early can reopen the door to duplicate writes while the original request is still active or while the API is recovering from a partial failure. If the business mutation and the idempotency row live in the same database, the completed response update should be written as part of the same local persistence flow as the guarded write. That keeps the replay record tied to the result it is meant to return.
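That safety rule reduces to a single predicate, which is worth keeping explicit. A minimal sketch, with illustrative names: only COMPLETED rows past their expiration qualify for deletion, and PENDING rows are always left alone here.

```java
import java.time.Instant;

// Sketch of the purge rule: completed and expired rows may go; pending rows
// stay, because deleting an in-flight claim would reopen the duplicate-write gap.
class PurgeRule {
    enum State { PENDING, COMPLETED }

    static boolean isPurgeable(State state, Instant expiresAt, Instant now) {
        return state == State.COMPLETED && expiresAt.isBefore(now);
    }
}
```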
Downstream calls still deserve their own thinking. If the endpoint also sends a payment request to an outside provider or publishes a message after the local database write, that outside hop does not automatically inherit the same replay guard just because the HTTP boundary is protected. What we build here protects the API boundary itself. Any later network hop that can duplicate its side effect still needs its own deduplication rule.
Conclusion
Idempotency in a Spring Boot API comes down to placing the checks in the right order. We read the Idempotency-Key before the write, claim that request identity in storage, compare the payload fingerprint on retries, replay the saved response after completion, and retire expired rows after the replay window closes. That flow keeps duplicate submits from turning into duplicate writes and gives the API a stable way to handle retries, refreshes, and lost responses.


