Cleaning Up Old Files with Scheduled Tasks in Spring Boot
Automatically remove stale files without manual work
Applications that handle uploads, logs, or temporary files tend to collect old data that no one touches for a while. If left alone, that can slowly eat up disk space and create problems you don’t want to deal with later. Setting up background tasks in Spring Boot to take care of old files helps keep things from piling up, and you don’t have to babysit the process once it’s running.
How Scheduled Tasks Work in Spring Boot
Spring Boot makes it possible to run background logic without any manual trigger. Once the scheduler is active, it quietly runs your logic in the background based on a time pattern you define. The benefit is that this happens without user input or request traffic. It just ticks away on its own thread, watching the clock and calling the method when it’s time. This is built on top of the scheduling support from the Spring Framework, which uses annotations and a task scheduler in the background.
You won’t need to do much to get started. Scheduling in Spring Boot becomes available the moment you annotate a configuration class with @EnableScheduling. That tells the framework to scan for methods marked with @Scheduled and set up the scheduler in the background. From there, it’s just a matter of writing a method with the logic you want to run, giving it the right time expression, and letting Spring take care of calling it.
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;

@Configuration
@EnableScheduling
public class SchedulerConfig {
}

import java.time.LocalDateTime;

import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class CleanupWorker {

    @Scheduled(fixedRate = 60000)
    public void runPeriodicCleanup() {
        System.out.println("Running cleanup check at " + LocalDateTime.now());
        // cleanup logic
    }
}
The fixedRate option in @Scheduled schedules the method every 60 seconds, measured from the start of the previous run. If the task finishes before its 60-second mark, the next call begins right on that boundary. If it overruns, the next execution is queued and fires as soon as the current run ends, so the scheduler can catch up; on the default single-thread scheduler, a task that consistently takes longer than its period will cause executions to stack up. You can also use cron expressions for more control. That lets you schedule something like “every night at 3:30 AM” without using fixed intervals.
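For the 3:30 AM case, a cron-based schedule might look like the sketch below. The method name is illustrative; note that Spring uses a six-field cron format with seconds first, which differs from classic Unix cron.

```java
import org.springframework.scheduling.annotation.Scheduled;

// Spring's six-field cron format: second, minute, hour, day of month,
// month, day of week. This fires every night at 3:30:00 AM server time.
@Scheduled(cron = "0 30 3 * * *")
public void nightlyCleanup() {
    // cleanup logic
}
```

A time zone can also be pinned with the annotation’s zone attribute if the server clock isn’t the one you care about.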
If you prefer that the delay starts after the previous run finishes, use fixedDelay instead. That adds a gap between the end of one run and the beginning of the next.
@Scheduled(fixedDelay = 120000)
public void cleanWithDelay() {
    System.out.println("Delayed cleanup run at " + LocalDateTime.now());
}
That kind of timing gives you better pacing when the task varies in how long it takes, especially if you don’t want it overlapping.
You can also hold off on running a task right after startup by setting an initial delay (initialDelay). That helps if other parts of the app need a moment to warm up first.
@Scheduled(initialDelay = 30000, fixedDelay = 120000)
public void delayFirstRun() {
    // wait 30 seconds after startup, then run every 2 minutes
}
This gives the rest of the system time to get going before background jobs begin.
Making a Custom Executor for Cleanup Tasks
By default, Spring uses a single thread for all scheduled methods, running every scheduled task in turn. If one method runs long, it can delay the next. That’s not always a problem, but if you start stacking multiple jobs, like one for log rotation and another for file expiration, you may not want them taking turns on the same thread.
To change that, you can define your own scheduler bean. That gives you control over how tasks run concurrently and what their thread names look like in logs. It also helps you isolate long-running jobs from smaller ones that finish quickly.
Here’s how you set up a custom scheduler:
@Bean(destroyMethod = "close")
public TaskScheduler taskScheduler(SimpleAsyncTaskSchedulerBuilder builder) {
    return builder
            .threadNamePrefix("bg-task-")
            .virtualThreads(true)
            .build();
}
After this bean is active, Spring switches to your scheduler for handling all @Scheduled methods. Each task still runs on its schedule, but now multiple tasks can run in parallel if needed. Each thread the scheduler creates carries the name prefix you set, which makes it easier to spot these threads in logs or thread dumps if you’re ever debugging timing issues.
This makes it much safer to add more background jobs later, without worrying about them stepping on each other or blocking important tasks. And if anything goes wrong, your cleanup thread can fail quietly without crashing the main app.
From Spring Boot 3.2 onward you can set spring.threads.virtual.enabled=true (while running on JDK 21+) and Boot will auto-configure a SimpleAsyncTaskScheduler that already uses virtual threads.
Filtering Based on Last Modified Or Access Time
Java gives you a few different ways to check how old a file is. One of the most common is last modified time, which tells you when a file was last changed. There’s also access time, which tells you the last time a file was read. Not all systems track access time by default, but it’s still available on many setups where the filesystem supports it.
To work with these timestamps, you can use the BasicFileAttributes interface from the java.nio.file.attribute package. It lets you read metadata for each file without having to open its contents.
Here’s how you could scan through a folder and delete anything older than a given number of days:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.time.Duration;
import java.time.Instant;
import java.util.stream.Stream;

public class FileCleaner {

    public void removeExpiredFiles(String basePath, long daysOld) throws IOException {
        Instant cutoff = Instant.now().minus(Duration.ofDays(daysOld));
        Path root = Paths.get(basePath);
        try (Stream<Path> files = Files.walk(root)) {
            files.filter(Files::isRegularFile)
                 .filter(path -> isStale(path, cutoff))
                 .forEach(this::deleteSafely);
        }
    }

    private boolean isStale(Path file, Instant threshold) {
        try {
            BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
            return attrs.lastModifiedTime().toInstant().isBefore(threshold);
        } catch (IOException e) {
            return false;
        }
    }

    private void deleteSafely(Path file) {
        try {
            Files.deleteIfExists(file);
        } catch (IOException e) {
            // log and move on
        }
    }
}
This runs through the directory tree and looks for regular files. If a file was modified before the cutoff time, it gets passed to the delete method. If reading the attributes fails for any reason, the method skips that file. That avoids throwing errors in the middle of the run.
If you want to rely on access time instead of modified time, you can call lastAccessTime() instead of lastModifiedTime(). Just keep in mind that not all filesystems update access time reliably, and on some platforms it’s disabled by default to improve disk performance. You’ll want to test this before assuming it works across the board.
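Swapping the timestamp is a one-method change to the staleness check. The sketch below assumes the same structure as the cleaner above, with the helper pulled out as a static method for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.time.Instant;

public class AccessTimeCheck {

    // Variant of the staleness check that uses last access time instead of
    // last modified time. Files whose attributes can't be read are skipped.
    static boolean isStaleByAccess(Path file, Instant threshold) {
        try {
            BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
            return attrs.lastAccessTime().toInstant().isBefore(threshold);
        } catch (IOException e) {
            return false;
        }
    }
}
```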
You can also exclude files based on extensions or file size, depending on how strict the cleanup rules need to be. That part is easy to add with another filter step before calling the delete function.
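As a sketch of such a rule, the predicate below keeps only files matching an extension and a minimum size; the ".tmp" extension and the byte threshold are illustrative assumptions, not rules from the cleanup job above:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CleanupRules {

    // Illustrative predicate: only consider files with a ".tmp" extension
    // that are at least minSizeBytes large. Unreadable files are skipped.
    static boolean matchesRules(Path file, long minSizeBytes) {
        try {
            return file.getFileName().toString().endsWith(".tmp")
                    && Files.size(file) >= minSizeBytes;
        } catch (IOException e) {
            return false;
        }
    }
}
```

In the stream pipeline, a check like this slots in as one more .filter(...) call ahead of the forEach that performs the delete.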
Security Considerations When Cleaning Files
Any task that deletes files deserves a safety net. Mistakes in the logic or bugs in directory scanning can have real consequences. So before you start deleting anything, it helps to put in a few guardrails.
First, lock down the path your cleanup job will touch. The base path should be fixed in configuration and never come from outside input. If you accept path changes through a web endpoint or a dynamic value, that opens the door to dangerous behavior like deleting files outside your target folder. To make sure your job stays inside the intended folder, resolve each file path to its real location and check that it still falls within the base directory. This keeps symbolic links or relative paths from escaping your cleanup zone.
Here’s how you can do that:
private static final Path ROOT_PATH = Paths.get("/data/app/uploads");

private boolean isSafe(Path file) {
    try {
        Path resolved = file.toRealPath();
        return resolved.startsWith(ROOT_PATH);
    } catch (IOException e) {
        return false;
    }
}
You’d call this before deleting any file, and only move forward if the result is true. That’s one of the easiest ways to stop deletion from drifting into unintended areas.
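A reusable variant might take the base directory as a parameter instead of a constant. This is a hypothetical helper, not part of the original job; note it also resolves the base path itself, so a symlinked base directory doesn’t produce false negatives:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PathGuard {

    // Returns true only if candidate resolves to a real location inside base.
    // Paths that can't be resolved (missing file, broken link) count as unsafe.
    static boolean isWithin(Path base, Path candidate) {
        try {
            return candidate.toRealPath().startsWith(base.toRealPath());
        } catch (IOException e) {
            return false;
        }
    }
}
```

Treating unresolvable paths as unsafe is a deliberate fail-closed choice: when in doubt, the file is left alone.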
You should also avoid following symbolic links if possible. By default, Files.walk does not follow directory symbolic links; it just returns the link itself. If you want to traverse through symbolic links, you have to opt in with FileVisitOption.FOLLOW_LINKS.
try (Stream<Path> files = Files.walk(root, FileVisitOption.FOLLOW_LINKS)) {
    // logic
}
If you know you don’t need to support them, it’s safer to ignore them altogether. Don’t forget to log each file you delete. That log becomes the trail you can follow if something goes wrong. If a file vanishes that shouldn’t have, you’ll want to know what the cleanup process did and when.
Another thing to avoid is running the cleanup process with elevated system permissions. If your app has access to delete outside the expected folder, and your filters don’t work properly, a small bug could have a large reach. Keep the file system permissions narrow and scoped to just the folder your task is meant to clean.
Separating Authentication From Background Cleanup
Scheduled tasks don’t run in the context of a logged-in user. They don’t carry a session, a token, or any of the usual things that would tie an action back to a person. That makes them good for background work, but you have to be careful with how authentication is handled if your cleanup logic ever connects to other services.
For example, if the cleanup task sends a message to another service to report that a file was removed, that service call should use a fixed API key or a machine credential. These are usually scoped to a single purpose and don’t represent a user. That’s safer and keeps your cleanup task separate from user logic. You should never pass a user’s token or login context into a background job like this. That kind of shortcut can lead to confused audit logs or security gaps where automation ends up with access to things it shouldn’t. Background jobs don’t need user-level access, and tying them to user accounts usually leads to problems over time.
If your application does both system-level tasks and user-triggered ones, make sure you separate how each one authenticates. Scheduled jobs should act as the system itself, not as a user. API calls made by those jobs should carry headers or tokens that reflect their system origin. Keep the authentication short-lived if possible, and don’t store long-term credentials in the code. Put them in secure configuration like environment variables or an encrypted secrets store.
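As a sketch of that separation, the request builder below attaches a machine credential as a header. The service URL, header name, and payload shape are assumptions for illustration; the point is that the credential comes from configuration, never from a user session:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class CleanupReporter {

    // Hypothetical request builder: reports a deleted file to another
    // service using a machine credential passed in from configuration.
    static HttpRequest buildReport(String fileName, String apiKey) {
        return HttpRequest.newBuilder(URI.create("https://reporting.internal/cleanup-events"))
                .header("X-Api-Key", apiKey) // machine credential, never a user token
                .POST(HttpRequest.BodyPublishers.ofString("{\"deleted\":\"" + fileName + "\"}"))
                .build();
    }
}
```

The key itself would come from something like an environment variable or an encrypted secrets store, not a hardcoded string in the source.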
The less your cleanup logic knows about user data or identity, the safer it will be. Treat it as its own thread of work that lives outside the user lifecycle and has no awareness of sessions, accounts, or permissions that belong to a person. That separation helps keep your background task focused and contained.
Conclusion
The real work here happens in the background, quietly tied to time and system threads that don’t rely on anyone clicking a button. Spring gives you a way to schedule that logic, isolate it from your main flow, and keep it consistent. File filters make decisions based on real timestamps, not guesswork. The scheduler calls the method when it should, and your code takes care of the rest. With the right checks in place and a narrow scope, this kind of cleanup stays predictable and safe.
