Async Request Processing in Spring Boot with CompletableFuture

May 27, 2026

For a Spring Boot endpoint, asynchronous request processing lets the original servlet thread step away while slow downstream I/O continues on a different executor. That pays off most in aggregate endpoints that call several services, wait for every reply, and then return one response to the client. On the current Spring MVC stack, the servlet side still supports asynchronous request handling, accepts CompletionStage as a controller return type, and resumes the request after the result is ready. CompletableFuture fits nicely here because it brings thread handoff, stage composition, timeout handling, and result joining into one Java API.

How the Request Moves

To follow the request flow, we want to track what happens from the moment Spring MVC receives the HTTP call to the moment the final response body goes out. A controller still begins on the servlet thread that accepted the request, but Spring MVC can keep the response open, release that original thread, and resume processing later after an async value completes. Because CompletableFuture<T> implements CompletionStage, a controller can return it and still finish as a normal MVC response after the async result is ready.

Servlet Thread Handoff

Processing begins on the incoming request thread just like any other Spring MVC endpoint. The change appears when the controller returns a future-style value instead of the final body itself. At that point, Spring marks the request for asynchronous handling, the servlet thread exits, and response completion pauses until a later dispatch picks the request back up. We can think about that flow in three moments. First, the request enters on the original servlet thread. Next, some other thread finishes the async result. After that, Spring dispatches the request back through MVC so response writing can finish in the normal way.

That handoff does not turn blocking I/O into non-blocking I/O by itself. If a downstream HTTP call, database call, or SDK call blocks, the worker thread still waits during that call. What changes is where that wait happens. Instead of leaving the servlet thread parked on that downstream delay, we let the servlet container return that thread to incoming traffic while the async branch continues elsewhere. For aggregate endpoints, that is the first practical reason this model helps. We spend less servlet-thread time waiting on remote calls and more of that time accepting fresh requests.
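Before looking at a controller, a plain-Java sketch can make that point concrete. It uses no Spring at all: the calling thread stands in for the servlet thread, a small fixed pool stands in for the async executor, and Thread.sleep stands in for a blocking downstream call. The class and method names (HandoffSketch, slowDownstreamCall) are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-Java analogy only: the caller plays the servlet thread,
// the fixed pool plays the MVC async executor.
class HandoffSketch {

    // Hypothetical blocking call; the sleep stands in for a slow remote request.
    static String slowDownstreamCall() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        // The worker still waited through the whole call; report who did the waiting.
        return Thread.currentThread().getName();
    }

    // Returns {callerThread, workerThread, whether the future was still pending
    // right after the handoff}.
    static String[] run() {
        ExecutorService workers =
                Executors.newFixedThreadPool(2, r -> new Thread(r, "worker-sketch"));
        try {
            String caller = Thread.currentThread().getName();
            CompletableFuture<String> pending =
                    CompletableFuture.supplyAsync(HandoffSketch::slowDownstreamCall, workers);

            // The caller gets control back immediately, while the worker is still blocked.
            boolean pendingAtHandoff = !pending.isDone();

            // join() is only here so the demo can print a result. Spring MVC does not
            // block like this; it resumes the request when the future completes.
            String worker = pending.join();
            return new String[] {caller, worker, String.valueOf(pendingAtHandoff)};
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println("caller=" + result[0]
                + ", worker=" + result[1]
                + ", pendingAtHandoff=" + result[2]);
    }
}
```

The wait did not disappear. It moved to the worker thread, which is the trade the servlet container makes for us when a controller returns a future.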

We can make the handoff visible with a small controller:

import java.util.Map;
import java.util.concurrent.CompletableFuture;

import org.springframework.core.task.AsyncTaskExecutor;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ThreadTraceController {

    private final AsyncTaskExecutor taskExecutor;

    public ThreadTraceController(AsyncTaskExecutor taskExecutor) {
        this.taskExecutor = taskExecutor;
    }

    @GetMapping("/thread-trace")
    public CompletableFuture<Map<String, String>> trace() {
        String requestThread = Thread.currentThread().getName();

        return CompletableFuture.supplyAsync(() -> Map.of(
                "requestThread", requestThread,
                "resultThread", Thread.currentThread().getName()
        ), taskExecutor);
    }
}

If we call this endpoint and inspect the response, the two thread names should usually differ in a pooled executor setup. The method enters on the servlet thread, while the CompletableFuture stage runs on a worker thread. After the future completes, Spring dispatches the request back through MVC so message conversion and response writing can finish. That resumed dispatch is part of the same HTTP request, not a second request.

That detail helps explain behavior that can look odd the first time we trace it. Logging, filters, and interceptors may appear around more than one phase of the same request. Nothing is duplicated. Spring paused the request, let async processing continue away from the servlet thread, and then resumed request handling later. Keeping that sequence in view makes the rest of the CompletableFuture flow much easier to follow.

We can also make the resumed dispatch visible in a filter:

import java.io.IOException;

import jakarta.servlet.DispatcherType;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;

import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

@Component
public class AsyncTraceFilter extends OncePerRequestFilter {

    @Override
    protected void doFilterInternal(
            HttpServletRequest request,
            HttpServletResponse response,
            FilterChain filterChain) throws ServletException, IOException {

        System.out.println(
                "dispatcher=" + request.getDispatcherType() +
                ", thread=" + Thread.currentThread().getName()
        );

        filterChain.doFilter(request, response);
    }

    @Override
    protected boolean shouldNotFilterAsyncDispatch() {
        return false;
    }
}

With shouldNotFilterAsyncDispatch() returning false, the filter can run for the resumed async dispatch too. That gives us a practical way to see that Spring did not finish everything in a single straight pass on the original request thread. Instead, the request paused, the result completed elsewhere, and MVC resumed afterward.

Joining Futures for One Response

Most aggregate endpoints fan out into more than one downstream call and then merge those replies into one payload. That is where CompletableFuture becomes useful. It is both a Future and a CompletionStage, so we can treat it as a value that arrives later and also attach follow-up stages after normal completion, after failure, or after several other stages finish.

With two independent branches, thenCombine gives us a natural merge point. We wait for both stages to complete normally, then build the merged value from both results in one place:

CompletableFuture<CustomerView> customerFuture = customerClient.fetchCustomer(customerId);
CompletableFuture<LoyaltyView> loyaltyFuture = loyaltyClient.fetchLoyalty(customerId);

CompletableFuture<CustomerCard> cardFuture =
        customerFuture.thenCombine(
                loyaltyFuture,
                (customer, loyalty) -> new CustomerCard(customer, loyalty)
        );

That reads in the same order we think about the endpoint. We fetch customer data, fetch loyalty data, and then build the combined response after both values arrive. No manual polling shows up in the code, and no shared counter is needed either. The dependency lives in the future chain itself, so the merge step stays near the data it depends on.

As the endpoint grows past two branches, CompletableFuture.allOf usually gives us a better group wait. allOf does not hold the combined data for us. It returns CompletableFuture<Void>, which means its job is to signal that all input futures have finished. After that point, we can read each completed value and build the final response:

CompletableFuture<ProfileView> profileFuture = profileClient.fetchProfile(userId);
CompletableFuture<OrderSummary> ordersFuture = orderClient.fetchRecentOrders(userId);
CompletableFuture<AlertSummary> alertsFuture = alertClient.fetchAlerts(userId);

CompletableFuture<Void> allDone =
        CompletableFuture.allOf(profileFuture, ordersFuture, alertsFuture);

CompletableFuture<DashboardView> dashboardFuture =
        allDone.thenApply(ignored -> new DashboardView(
                profileFuture.join(),
                ordersFuture.join(),
                alertsFuture.join()
        ));

join() fits well in that spot. By the time the thenApply block runs, allDone has already finished, so we are reading results from futures that already reached completion. We are not quietly turning the flow back into a serial chain of waits. Total time still stays close to the slowest branch rather than drifting toward the sum of every branch.
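The timing claim is easy to check outside Spring. The following is a minimal sketch, assuming three independent branches that take roughly 100, 200, and 300 ms. The branch names and the delayed helper are hypothetical, and Thread.sleep stands in for real downstream calls.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class FanOutTimingSketch {

    // Hypothetical downstream call that takes `millis` to answer.
    static String delayed(String value, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return value;
    }

    // Runs three branches in parallel and returns the elapsed wall-clock millis.
    static long elapsedMillis() {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            long start = System.nanoTime();

            CompletableFuture<String> profile =
                    CompletableFuture.supplyAsync(() -> delayed("profile", 100), pool);
            CompletableFuture<String> orders =
                    CompletableFuture.supplyAsync(() -> delayed("orders", 200), pool);
            CompletableFuture<String> alerts =
                    CompletableFuture.supplyAsync(() -> delayed("alerts", 300), pool);

            CompletableFuture<String> merged =
                    CompletableFuture.allOf(profile, orders, alerts)
                            // All three are complete here, so each join() returns at once.
                            .thenApply(ignored ->
                                    profile.join() + "," + orders.join() + "," + alerts.join());

            merged.join(); // demo only: wait so we can measure
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Sequential calls would need about 600 ms; the parallel fan-out
        // should land near the slowest branch instead.
        System.out.println("elapsed ms: " + elapsedMillis());
    }
}
```

The pool needs at least as many free threads as there are branches for this to hold. With fewer threads, some branches queue behind others and the total drifts back toward the sum.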

Failure handling also becomes easier to reason through after the join point is visible. If any input future completes exceptionally and we do not recover that branch earlier, the combined stage will also complete exceptionally. That lets us decide branch by branch what the endpoint requires and what it can recover from. Required data can still fail the whole response, while optional data can be converted to fallback values before we reach the final merge.

We can see that branch-level recovery in a short example:

CompletableFuture<ProfileView> profileFuture = profileClient.fetchProfile(userId);

CompletableFuture<AlertSummary> alertsFuture =
        alertClient.fetchAlerts(userId)
                .exceptionally(ex -> AlertSummary.unavailable());

CompletableFuture<AccountPage> accountPageFuture =
        profileFuture.thenCombine(
                alertsFuture,
                (profile, alerts) -> new AccountPage(profile, alerts)
        );

In that flow, profile data still has to arrive normally, but alerts can fall back to a replacement object if that call fails. That keeps the recovery rule close to the branch it belongs to, which makes the final merge step easier to read. After we reach the controller response, the endpoint is dealing with a single combined future rather than a scattered set of unfinished pieces.
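To watch both outcomes without any downstream services, here is a small self-contained sketch. It swaps the view types for plain strings and uses already-completed or already-failed futures, so the class name, the method names, and the messages are all placeholders.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

class BranchRecoverySketch {

    // Optional branch fails, but exceptionally() swaps in a fallback before the merge.
    static String mergeWithRecoveredOptionalBranch() {
        CompletableFuture<String> profile = CompletableFuture.completedFuture("profile");

        CompletableFuture<String> alerts =
                CompletableFuture.<String>failedFuture(new IllegalStateException("alerts down"))
                        .exceptionally(ex -> "alerts-unavailable");

        return profile.thenCombine(alerts, (p, a) -> p + "+" + a).join();
    }

    // Required branch fails with no recovery, so the merged future fails too.
    static String mergeWithFailedRequiredBranch() {
        CompletableFuture<String> profile =
                CompletableFuture.failedFuture(new IllegalStateException("profile down"));
        CompletableFuture<String> alerts = CompletableFuture.completedFuture("alerts");

        try {
            return profile.thenCombine(alerts, (p, a) -> p + "+" + a).join();
        } catch (CompletionException ex) {
            // join() wraps the branch failure; the original exception is the cause.
            return "failed: " + ex.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(mergeWithRecoveredOptionalBranch()); // profile+alerts-unavailable
        System.out.println(mergeWithFailedRequiredBranch());    // failed: profile down
    }
}
```

The merge function never runs in the second case. That is the behavior we want for required data, and it is also why recovery has to be attached to the branch before the merge rather than after it.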

Building the Endpoint

Once we move past request flow, the next step is to decide where async branches run, how a service hands a future back to MVC, and where the results merge into a single object. Spring Boot gives us MVC async handling and @Async, while Java gives us CompletableFuture stage methods for waiting on several downstream calls before we return one response body.

Picking the Executor

Thread handoff only helps if we send that handoff to the right place. For MVC async requests, the executor is not a side detail. It controls where deferred controller branches run, which thread names appear in logs, how much concurrency the pool can absorb, and how queueing behaves when traffic rises.

Spring Boot also has a specific expectation for this. MVC async request handling and @EnableAsync can use Boot’s auto-configured AsyncTaskExecutor, and if we provide our own executor bean, Boot backs off from its default. For MVC, the custom bean that Spring Boot looks for is named applicationTaskExecutor, and it needs to be an AsyncTaskExecutor. That naming point is easy to miss, but it decides which executor MVC async processing will actually pick up.

Type matters too. Spring’s thread execution support includes SimpleAsyncTaskExecutor, which starts a new thread for each submitted unit of execution, and ThreadPoolTaskExecutor, which wraps a reusable pool. For HTTP request traffic, a pooled executor is usually the better match because request-driven async branches keep arriving over the life of the application. Reusing threads, bounding the queue, and giving the pool a visible name prefix all help when we need to read logs or inspect thread dumps.
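Bounding the queue has a visible consequence under load, and it helps to see it once in isolation. The sketch below uses the JDK's ThreadPoolExecutor directly rather than Spring's ThreadPoolTaskExecutor, with deliberately tiny numbers (one core thread, two max threads, a queue of one) that are assumptions chosen only to make the effect easy to count.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedPoolSketch {

    // Submits `submitted` tasks that all block until released, and returns how
    // many the pool accepted (1 core thread, 2 max threads, queue capacity 1).
    static int acceptedTasks(int submitted) {
        AtomicInteger threadIds = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 30, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1),
                // Named threads, in the spirit of setThreadNamePrefix.
                r -> new Thread(r, "mvc-async-" + threadIds.incrementAndGet()));

        CountDownLatch release = new CountDownLatch(1);
        int accepted = 0;
        try {
            for (int i = 0; i < submitted; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // stands in for a slow downstream call
                        } catch (InterruptedException ex) {
                            Thread.currentThread().interrupt();
                        }
                    });
                    accepted++;
                } catch (RejectedExecutionException ex) {
                    // Both threads are busy and the queue is full, so the
                    // default abort policy turns the task away.
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return accepted;
    }

    public static void main(String[] args) {
        // One task runs on the core thread, one waits in the queue,
        // one runs on the extra thread, and the rest are rejected.
        System.out.println("accepted: " + acceptedTasks(5));
    }
}
```

The order matters: the pool fills its core threads first, then the queue, and only then grows toward the max size. A large queue therefore delays both extra threads and rejections, which is worth keeping in mind when picking real values.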

MVC async handling has its own timeout setting as well. That timeout begins after the main request thread exits and runs until the request comes back through the resumed dispatch. Putting the executor and the MVC timeout in the same configuration keeps those two decisions close to each other, which makes the overall request behavior much easier to reason through later.

We can set both in one place:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.AsyncTaskExecutor;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.web.servlet.config.annotation.AsyncSupportConfigurer;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration(proxyBeanMethods = false)
@EnableAsync
public class AsyncConfig {

    @Bean("applicationTaskExecutor")
    AsyncTaskExecutor applicationTaskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(8);
        executor.setMaxPoolSize(16);
        executor.setQueueCapacity(100);
        executor.setThreadNamePrefix("mvc-async-");
        executor.initialize();
        return executor;
    }

    @Bean
    WebMvcConfigurer webMvcConfigurer(AsyncTaskExecutor applicationTaskExecutor) {
        return new WebMvcConfigurer() {
            @Override
            public void configureAsyncSupport(AsyncSupportConfigurer configurer) {
                configurer.setTaskExecutor(applicationTaskExecutor);
                configurer.setDefaultTimeout(1500); // milliseconds
            }
        };
    }
}

That configuration gives MVC async processing a named pooled executor and also sets the request timeout for servlet-side async handling. The thread name prefix pays off quickly when we trace a request across the controller entry point, the async branch, and the resumed response dispatch.

Service Methods

Controllers read better when they stay focused on HTTP input and output while service classes own downstream calls. That separation also makes async behavior more readable. The controller can return a future to MVC, and the service can focus on fetching data and wrapping the result in the future type that the controller expects.

Spring’s @Async support allows a method to return CompletableFuture, which fits well here because we can pass the future upstream and still attach follow-up stages later. There is a small but important detail in that contract. The proxied @Async method returns the actual asynchronous handle to the caller, while the method body still returns a value that matches its declared signature. That is why an @Async service method commonly returns CompletableFuture.completedFuture(result) after the downstream call finishes, or CompletableFuture.failedFuture(ex) if that call throws.

We can write a client service like this:

import java.util.concurrent.CompletableFuture;

import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestClient;

@Service
public class PricingClient {

    private final RestClient restClient;

    public PricingClient(RestClient.Builder builder) {
        this.restClient = builder.baseUrl("http://pricing-service").build();
    }

    @Async("applicationTaskExecutor")
    public CompletableFuture<PricingView> fetchPricing(long skuId) {
        try {
            PricingView body = restClient.get()
                    .uri("/pricing/{skuId}", skuId)
                    .retrieve()
                    .body(PricingView.class);

            return CompletableFuture.completedFuture(body);
        }
        catch (Exception ex) {
            return CompletableFuture.failedFuture(ex);
        }
    }
}

That method reads like a normal service call, but the proxy around it moves execution onto the named executor before the controller ever sees the result. The return type stays in the Java concurrency model the whole way, which keeps later composition steps natural.

Method boundaries matter here too. Spring’s default async support is proxy-based, so a direct call from one method to another method in the same class does not pass through the async proxy. If a bean calls its own @Async method directly, the annotation is skipped and the call runs on the current thread. Keeping async methods in a separate bean avoids that mismatch between what the code says and what actually happens at runtime.
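The self-invocation gap is a property of proxies in general, so we can reproduce it with a plain JDK dynamic proxy. This is an analogy rather than Spring's actual implementation. ReportService, its two methods, and the recording handler are all hypothetical, and the handler simply records calls where Spring's interceptor would submit them to an executor.

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class SelfInvocationSketch {

    interface ReportService {
        String loadReport();   // calls buildSection() internally
        String buildSection(); // imagine this one carries @Async
    }

    static class PlainReportService implements ReportService {
        @Override
        public String loadReport() {
            // Direct call on `this`: it never passes through the proxy.
            return "report[" + buildSection() + "]";
        }

        @Override
        public String buildSection() {
            return "section";
        }
    }

    // Returns the names of the methods the proxy intercepted, in call order.
    static List<String> interceptedCalls() {
        List<String> intercepted = new ArrayList<>();
        ReportService target = new PlainReportService();

        ReportService proxy = (ReportService) Proxy.newProxyInstance(
                ReportService.class.getClassLoader(),
                new Class<?>[] {ReportService.class},
                (self, method, args) -> {
                    // An async interceptor would hand the call to an executor here.
                    intercepted.add(method.getName());
                    return method.invoke(target, args);
                });

        proxy.loadReport();   // intercepted once; the inner buildSection() call is not
        proxy.buildSection(); // intercepted, because the call enters through the proxy
        return intercepted;
    }

    public static void main(String[] args) {
        System.out.println(interceptedCalls()); // [loadReport, buildSection]
    }
}
```

Only two calls are recorded even though buildSection() ran twice. The call made from inside loadReport() went straight to the target object, which is the same path an in-class call to an @Async method takes.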

Java’s CompletableFuture factory methods bring a related choice into view. If we call supplyAsync or runAsync without passing an executor, Java falls back to the common pool. In a Spring MVC application, that is usually not the thread pool we want for request-driven branches. Passing the application executor keeps thread ownership explicit and keeps request-related async flow on the pool we chose for the application.
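A quick way to see the difference is to print the thread name from both forms. This sketch uses a plain fixed pool as a stand-in for the application executor, and the thread name app-pool-worker is an assumption for the demo. The name printed for the no-executor form depends on the JDK and the machine, so the sketch does not promise a specific value for it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ExecutorChoiceSketch {

    // No executor argument: the JDK chooses its default async pool.
    static String threadWithoutExecutor() {
        return CompletableFuture
                .supplyAsync(() -> Thread.currentThread().getName())
                .join();
    }

    // Explicit executor: the stage runs on a thread we own and named.
    static String threadWithExecutor() {
        ExecutorService appPool =
                Executors.newFixedThreadPool(2, r -> new Thread(r, "app-pool-worker"));
        try {
            return CompletableFuture
                    .supplyAsync(() -> Thread.currentThread().getName(), appPool)
                    .join();
        } finally {
            appPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("default:  " + threadWithoutExecutor());
        System.out.println("explicit: " + threadWithExecutor());
    }
}
```

The same rule applies to the async variants of stage methods such as thenApplyAsync and thenCombineAsync, which also accept an executor as a final argument.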

Aggregate Endpoints

Aggregate endpoints are where CompletableFuture starts to pay off. We fan out to several downstream calls, put timeout rules close to each branch, and then merge the completed values into one response object. Total response time can then move closer to the slowest branch rather than drifting toward the sum of every branch.

Java 9 added two timeout methods that look alike but lead to different results. orTimeout completes the future exceptionally with TimeoutException, while completeOnTimeout completes it normally with a fallback value. That lets us make a branch required or optional right where that branch is defined. Neither method interrupts the work behind the future, so a blocked worker thread stays busy until the downstream call returns or hits its own client-level timeout.
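The two outcomes are easy to compare side by side. In this sketch, each branch is a future that never completes on its own, which stands in for a downstream call that hangs. The 100 ms limit and the fallback string are arbitrary values picked for the demo.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutRulesSketch {

    // Required branch: orTimeout fails the future once the limit passes.
    static String requiredBranchOutcome() {
        CompletableFuture<String> hanging = new CompletableFuture<>();
        try {
            return hanging.orTimeout(100, TimeUnit.MILLISECONDS).join();
        } catch (CompletionException ex) {
            // join() wraps the TimeoutException as the cause.
            return ex.getCause() instanceof TimeoutException ? "timed out" : "other failure";
        }
    }

    // Optional branch: completeOnTimeout supplies a normal value instead.
    static String optionalBranchOutcome() {
        CompletableFuture<String> hanging = new CompletableFuture<>();
        return hanging.completeOnTimeout("fallback", 100, TimeUnit.MILLISECONDS).join();
    }

    public static void main(String[] args) {
        System.out.println("required: " + requiredBranchOutcome()); // timed out
        System.out.println("optional: " + optionalBranchOutcome()); // fallback
    }
}
```

Both methods act on the future they are called on and return that same future, so the timeout rule travels with the branch wherever it is passed next.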

We can put those branch rules directly into the service:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.springframework.stereotype.Service;

@Service
public class ItemPageService {

    private final ItemClient itemClient;
    private final StockClient stockClient;
    private final RatingClient ratingClient;

    public ItemPageService(
            ItemClient itemClient,
            StockClient stockClient,
            RatingClient ratingClient) {
        this.itemClient = itemClient;
        this.stockClient = stockClient;
        this.ratingClient = ratingClient;
    }

    public CompletableFuture<ItemPageResponse> loadPage(long itemId) {
        CompletableFuture<ItemView> item =
                itemClient.fetchItem(itemId)
                        .orTimeout(800, TimeUnit.MILLISECONDS);

        CompletableFuture<StockView> stock =
                stockClient.fetchStock(itemId)
                        .completeOnTimeout(StockView.unknown(), 250, TimeUnit.MILLISECONDS);

        CompletableFuture<RatingView> ratings =
                ratingClient.fetchRatings(itemId)
                        .completeOnTimeout(RatingView.unavailable(), 250, TimeUnit.MILLISECONDS);

        return CompletableFuture.allOf(item, stock, ratings)
                .thenApply(ignored -> new ItemPageResponse(
                        item.join(),
                        stock.join(),
                        ratings.join()
                ));
    }
}

Two mechanics in that method are worth reading closely. As before, CompletableFuture.allOf() returns CompletableFuture<Void>, so it acts as the group wait rather than the carrier of the merged data. After that stage completes, we can safely read each branch value and build the final response object.

join() is safe in that spot for the same reason as in the earlier dashboard example: the thenApply() block runs only after the group wait completes normally, so every input future has already finished with a real value or a fallback. If a required branch completes exceptionally, CompletableFuture.allOf() completes exceptionally too, the thenApply() block is skipped, and the returned future carries that failure forward through the normal CompletableFuture exception flow. If an optional branch already recovered through completeOnTimeout(), its fallback value flows into the response just like any other completed value. Note that completeOnTimeout() covers only the timeout case. An optional branch that fails outright before its deadline still completes exceptionally and fails the whole page unless we also attach exceptionally() to it.
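One gap is worth sketching here: completeOnTimeout() only handles a branch that is slow, not one that fails outright. If an optional branch should fall back in both cases, the two recovery steps can be chained. The optional helper below is a hypothetical convenience, not part of the JDK or Spring, and the strings stand in for view objects such as StockView.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class OptionalBranchSketch {

    // Hypothetical helper: fall back on timeout and on failure alike.
    static <T> CompletableFuture<T> optional(
            CompletableFuture<T> branch, T fallback, long timeoutMillis) {
        return branch
                .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS)
                .exceptionally(ex -> fallback);
    }

    // Branch that never answers: the timeout fallback applies.
    static String slowBranch() {
        return optional(new CompletableFuture<String>(), "stock-unknown", 100).join();
    }

    // Branch that fails right away: exceptionally() supplies the fallback.
    static String failingBranch() {
        CompletableFuture<String> failed =
                CompletableFuture.failedFuture(new IllegalStateException("stock service down"));
        return optional(failed, "stock-unknown", 100).join();
    }

    // Branch that answers in time: the real value passes through untouched.
    static String healthyBranch() {
        return optional(CompletableFuture.completedFuture("in-stock"), "stock-unknown", 100).join();
    }

    public static void main(String[] args) {
        System.out.println(slowBranch());    // stock-unknown
        System.out.println(failingBranch()); // stock-unknown
        System.out.println(healthyBranch()); // in-stock
    }
}
```

Whether a failing optional branch should fall back silently is a product decision. Some endpoints may prefer to log the failure inside exceptionally() before returning the fallback.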

The controller can stay very small:

import java.util.concurrent.CompletableFuture;

import org.springframework.http.ResponseEntity;
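import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;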

@RestController
@RequestMapping("/items")
public class ItemPageController {

    private final ItemPageService itemPageService;

    public ItemPageController(ItemPageService itemPageService) {
        this.itemPageService = itemPageService;
    }

    @GetMapping("/{id}/page")
    public CompletableFuture<ResponseEntity<ItemPageResponse>> page(@PathVariable long id) {
        return itemPageService.loadPage(id)
                .thenApply(ResponseEntity::ok);
    }
}

That layout keeps branch timeouts, fallback rules, and result merging inside the service, while the controller simply hands the future back to MVC. Reading the code from top to bottom, we can see which downstream data must arrive, which data can fall back, and where the final response object is built.

Other Choices

CompletableFuture is not the only async return style in Spring MVC. It fits well when we want one eventual value, Java-stage composition, and servlet-based request handling, but Spring MVC has a few other options that fit different completion flows.

Callable is the most direct option when the controller itself owns the deferred computation and wants Spring MVC to run that unit on an async executor. WebAsyncTask wraps that same idea and adds per-request timeout and executor settings around the Callable. Those types stay very close to MVC itself, so they can be a good match when the controller body owns the whole deferred branch and there is no wider future chain to compose.

DeferredResult fits a different case. Instead of returning a future that runs inside the controller’s own flow, we return a holder and complete it later from some other thread. That maps well to events that come from outside the controller call stack, such as a message listener or scheduled callback. In that model, the controller hands back the pending result container, and some later part of the application fills it in.

Mono belongs to Reactor’s reactive model and represents a publisher that emits at most one item or one error. Spring MVC can adapt that single-value reactive type, but MVC still runs on the servlet stack. If the application is still centered on blocking clients and servlet request handling, CompletableFuture stays a natural fit for fan-out endpoints that merge into one response. If the application moves toward a full reactive stack with non-blocking I/O throughout, Mono and WebFlux reach farther in that direction.

As a group, these options differ mainly in where completion happens and how far the async model reaches through the request flow. For a servlet endpoint that fans out to a few downstream calls and then joins those replies into one payload, CompletableFuture keeps the merge logic close to the service layer while still fitting neatly into Spring MVC’s async request handling.

Conclusion

With CompletableFuture, we can follow the full request from servlet-thread handoff to the final merged response without breaking the flow into unrelated pieces. Spring MVC releases the original request thread, downstream calls move onto an executor, timeout rules stay attached to the futures they belong to, and thenCombine() or CompletableFuture.allOf() brings the finished values back into one response. That lets an aggregate endpoint wait on several calls at the same time instead of stacking them in order, which can lower total response time and keep the endpoint logic easier to read.

Spring MVC Asynchronous Requests
Spring Boot Task Execution and Scheduling
Spring Framework @Async Javadoc
Java CompletableFuture Javadoc
Spring Framework AsyncSupportConfigurer Javadoc
Project Reactor Mono Javadoc
