Mo's Algorithm for Range Queries in Java

May 29, 2026

Range query problems can get expensive when every request scans its own slice of the array from scratch. Mo’s Algorithm handles that by processing all queries offline, then sorting them so nearby ranges are answered near one another. The algorithm keeps one active range with a left edge, a right edge, and saved state for the values currently inside that range. As the sorted queries move forward, indexes that enter the range are added, and indexes that leave are removed. For a problem like counting distinct values in a subarray, that means we update only the changed positions instead of rebuilding the count for the full range every time. The array stays fixed the whole time, so this method fits static range queries rather than problems where values keep changing.

Why Mo’s Algorithm Reorders Range Queries

Query order controls how far the active range has to travel between answers. If requests stay in input order, two neighboring entries in the query list can point to very different parts of the array, so the left and right boundaries may jump across large distances. Reordering the requests gives those boundaries a more controlled route. Ranges with nearby left endpoints are grouped first, then their right endpoints are arranged so the active window does less backtracking. That is why the algorithm spends less time repeating array updates while still placing every answer back into the query’s original position.

The Cost of Direct Scanning

Direct scanning starts with the most natural idea. For every query, we read from left through right and compute the answer for that range only. For a distinct-count query, we can store every value we see in a HashSet, then return the size of that set after the loop finishes.

The helper below answers one inclusive range by scanning the requested indexes:

import java.util.HashSet;
import java.util.Set;

public static int countDistinctByScanning(int[] values, int left, int right) {
    Set<Integer> seen = new HashSet<>();

    for (int i = left; i <= right; i++) {
        seen.add(values[i]);
    }

    return seen.size();
}

We can read the method from top to bottom. The HashSet stores only unique values. The loop walks every index in the requested range, and seen.add(values[i]) records the current value without adding duplicates. After the loop, seen.size() gives the number of distinct values in that range.

For this input:

int[] values = {1, 2, 1, 3, 2, 4, 1};

That query [1, 4] reads 2, 1, 3, and 2. The distinct values are 1, 2, and 3, so the answer is 3.

The issue appears when we repeat the scan for every query. The range [1, 4] reads four values. The range [2, 5] also reads four values, and most of those indexes overlap with the earlier range. We still create fresh counting state and read the requested slice again, even though part of the needed information was already seen.

The full direct version can call the helper for every query:

public static int[] answerQueriesByScanning(int[] values, int[][] queries) {
    int[] answers = new int[queries.length];

    for (int i = 0; i < queries.length; i++) {
        int left = queries[i][0];
        int right = queries[i][1];

        answers[i] = countDistinctByScanning(values, left, right);
    }

    return answers;
}

We store one answer for each query, then read the left and right endpoints from the current row of the query array. After that, we call the range helper and put the result into the answer slot that matches the original query position.

This version is correct, but the time cost can grow quickly. If the array has n values and there are q queries, a single query can read up to n values. Repeating that for all queries gives O(n * q) time in the worst case. The HashSet used for one distinct-count query can hold up to n values, so its helper space is O(n). The answer array takes O(q) space because it stores one result per query.

Mo’s Algorithm targets the repeated scan cost. We still answer the same queries, but we sort them first and carry an active range from one query to the next. Values that remain inside the new range do not need to be counted from scratch.

Block Ordering

Mo’s Algorithm groups queries by blocks of array indexes. The block for a query comes from its left endpoint. The usual block size is close to sqrt(n), where n is the array length.

For an array of length 16, a block size of 4 divides the indexes like this:

Indexes 0  1  2  3     block 0
Indexes 4  5  6  7     block 1
Indexes 8  9 10 11     block 2
Indexes 12 13 14 15    block 3

The query [2, 9] belongs to block 0 because its left endpoint is 2. The query [6, 12] belongs to block 1 because its left endpoint is 6. The right endpoint still affects sorting, but the first grouping comes from the block containing left.

We get the block number with integer division:

public static int leftBlock(int left, int blockSize) {
    return left / blockSize;
}

With blockSize set to 4, indexes 0 through 3 return 0, indexes 4 through 7 return 1, and the same rule continues across the array. That gives us a cheap way to group queries before they are answered.

After grouping by left block, queries inside the same block are sorted by their right endpoint. We can represent that ordering rule with a small query type:

static class Query {
    int left;
    int right;
    int originalIndex;

    Query(int left, int right, int originalIndex) {
        this.left = left;
        this.right = right;
        this.originalIndex = originalIndex;
    }
}

The originalIndex field is not part of the sorting rule itself. It exists because sorting changes the query order. Later, after a query has been answered, the result can still be placed back into the answer array position that matches the original input.

The sorting rule can compare left blocks first, then right endpoints:

int blockSize = (int) Math.sqrt(values.length) + 1;

Arrays.sort(queries, (first, second) -> {
    int firstBlock = first.left / blockSize;
    int secondBlock = second.left / blockSize;

    if (firstBlock != secondBlock) {
        return Integer.compare(firstBlock, secondBlock);
    }

    return Integer.compare(first.right, second.right);
});

We first compute the block for each left endpoint. If the blocks differ, the lower block comes first. If both queries start in the same block, the query with the smaller right endpoint comes first.

This sorting order helps reduce boundary movement. Moving a range boundary by one position means we need one update to the saved range state. If the range grows, a value enters that state. If the range shrinks, a value leaves it. The total running time depends heavily on how far those boundaries travel while processing all queries.

Without this reorder, query order can jump across the array. The active range could move from a far-left range to a far-right range, then back again. With block ordering, queries with nearby left endpoints are handled near one another. Inside a block, the right endpoint has a more controlled progression, so the active range has more chances to reuse values that were already counted for the previous query.

The range movement part of Mo’s Algorithm is commonly O((n + q) * sqrt(n)) when adding or removing a value costs O(1). Sorting the queries adds O(q log q) time. A fuller bound is O(q log q + (n + q) * sqrt(n)). The answer array needs O(q) space, while the saved frequency state commonly uses O(n) space after value compression or O(maxValue) space when the input values can directly index a frequency array.

Block ordering only chooses the processing order. The actual answer still comes from the active range state that gets updated as query boundaries move. That reorder is the first mechanical step in Mo’s Algorithm because it gives the later add and remove operations less repeated array reading.

Building the Active Window in Java

The active window is the range currently loaded into our counting state. In a distinct-count version of Mo’s Algorithm, that state usually contains a frequency array and a running distinct total. As we move from one sorted query to the next, we adjust the left and right boundaries until the window matches the requested range. Every boundary move updates the saved state right away, so the window contents and the current answer stay aligned.

Query Storage

Before sorting begins, we need a small object that keeps the query endpoints and the query’s original position. Sorting changes the order in which queries are processed, but it should not change the order in which answers are returned. If the third query gets answered first after sorting, its answer still belongs in the third slot of the output array.

The query object can stay small:

private static class Query {
    int left;
    int right;
    int index;

    Query(int left, int right, int index) {
        this.left = left;
        this.right = right;
        this.index = index;
    }
}

We treat left and right as inclusive indexes. That means a query of [2, 5] covers indexes 2, 3, 4, and 5. The index field stores where the query came from in the original input, which becomes useful after sorting changes the processing order.

We can build the query array from an int[][] range input like this:

Query[] queries = new Query[ranges.length];

for (int i = 0; i < ranges.length; i++) {
    int left = ranges[i][0];
    int right = ranges[i][1];

    queries[i] = new Query(left, right, i);
}

We read the endpoints from each row, then attach the current loop index as the original answer position. Later, after a query is answered, we write the result into answers[query.index]. That small index field lets us sort freely without losing the original output order.

For distinct-count queries, we also keep state outside the query object. The query tells us which range we want. The frequency array and the running count tell us what the current window contains. Keeping those pieces separate makes the flow easier to follow because the query data stays fixed while the active window state changes during processing.

Sorting the Queries

The block size is usually based on the square root of the array length. In Java, Math.sqrt() returns a double, so we cast the result back to int. Adding 1 keeps the block size from becoming 0 for very small input.

int blockSize = (int) Math.sqrt(values.length) + 1;

We sort queries by the block containing their left endpoint first, then by their right endpoint inside that block. This order keeps queries with nearby left endpoints near each other in the processing order, which reduces how far the active window has to travel.

The sort can be written with a Comparator:

Arrays.sort(queries, new Comparator<Query>() {
    @Override
    public int compare(Query first, Query second) {
        int firstBlock = first.left / blockSize;
        int secondBlock = second.left / blockSize;

        if (firstBlock != secondBlock) {
            return Integer.compare(firstBlock, secondBlock);
        }

        return Integer.compare(first.right, second.right);
    }
});

We first calculate the left block for both queries. If the block numbers differ, the smaller block number comes first. If both queries belong to the same left block, we order them by the right endpoint. That second comparison gives the right boundary a more controlled travel order inside each block.

The same sorting rule can also be written as a lambda when the shorter form fits the surrounding code:

Arrays.sort(queries, (first, second) -> {
    int firstBlock = first.left / blockSize;
    int secondBlock = second.left / blockSize;

    if (firstBlock != secondBlock) {
        return Integer.compare(firstBlock, secondBlock);
    }

    return Integer.compare(first.right, second.right);
});

Both versions produce the same order. The longer version gives the comparison method its own named block, which can help while we are reading the steps for the first time. The lambda version keeps the sort closer to the query setup while still following the same two-part rule.

Sorting costs O(q log q) time for q queries. After sorting, the boundary movement gives Mo’s Algorithm its usual O((n + q) * sqrt(n)) behavior when each add or remove operation costs O(1).

Moving the Boundaries

The active window starts empty. We can represent that by placing the left boundary at 0 and the right boundary at -1.

int currentLeft = 0;
int currentRight = -1;

That pair means no valid index is currently inside the range. Before answering a query, we move both boundaries until the active window matches the query’s inclusive [left, right] range.

The boundary update logic has four cases:

while (currentLeft > query.left) {
    currentLeft--;
    add(compressedValues[currentLeft]);
}

while (currentRight < query.right) {
    currentRight++;
    add(compressedValues[currentRight]);
}

while (currentLeft < query.left) {
    remove(compressedValues[currentLeft]);
    currentLeft++;
}

while (currentRight > query.right) {
    remove(compressedValues[currentRight]);
    currentRight--;
}

We add a value when an index enters the active range. We remove a value when an index leaves the active range. The pointer movement and the update call have to match the direction of travel. When expanding left, we move the pointer first because the new index becomes part of the range. When shrinking left, we remove the current left value first because that index leaves before the pointer advances.

The right side follows the same idea. Expanding right moves the pointer forward, then adds that new value. Shrinking right removes the current right value, then moves the pointer backward. After these loops finish, [currentLeft, currentRight] matches [query.left, query.right], so the saved answer state belongs to the current query.

For distinct counting, add and remove can update a frequency array and a running total:

private int distinctCount;
private int[] frequency;

private void add(int valueId) {
    if (frequency[valueId] == 0) {
        distinctCount++;
    }

    frequency[valueId]++;
}

private void remove(int valueId) {
    frequency[valueId]--;

    if (frequency[valueId] == 0) {
        distinctCount--;
    }
}

We increase distinctCount only when the value was absent before the add. After that, the frequency rises by one. During removal, we decrement the frequency first. If the count becomes 0, that value no longer appears in the active window, so the distinct total falls by one.

This state update takes O(1) time per boundary move when values have already been converted to compact ids. Large or negative values do not fit directly into a normal frequency array index, so the full version below maps input values before answering queries.

Complete Java Example

The full version below answers distinct-count range queries. It compresses input values first, sorts the queries in Mo order, then moves the active window while storing answers back in the original query order.

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class MosAlgorithmDistinct {

    private static class Query {
        int left;
        int right;
        int index;

        Query(int left, int right, int index) {
            this.left = left;
            this.right = right;
            this.index = index;
        }
    }

    private int[] compressedValues;
    private int[] frequency;
    private int distinctCount;

    public int[] countDistinctInRanges(int[] values, int[][] ranges) {
        compressedValues = compress(values);

        Query[] queries = new Query[ranges.length];

        for (int i = 0; i < ranges.length; i++) {
            queries[i] = new Query(ranges[i][0], ranges[i][1], i);
        }

        int blockSize = (int) Math.sqrt(values.length) + 1;

        Arrays.sort(queries, (first, second) -> {
            int firstBlock = first.left / blockSize;
            int secondBlock = second.left / blockSize;

            if (firstBlock != secondBlock) {
                return Integer.compare(firstBlock, secondBlock);
            }

            return Integer.compare(first.right, second.right);
        });

        frequency = new int[values.length];
        distinctCount = 0;

        int[] answers = new int[queries.length];
        int currentLeft = 0;
        int currentRight = -1;

        for (Query query : queries) {
            while (currentLeft > query.left) {
                currentLeft--;
                add(compressedValues[currentLeft]);
            }

            while (currentRight < query.right) {
                currentRight++;
                add(compressedValues[currentRight]);
            }

            while (currentLeft < query.left) {
                remove(compressedValues[currentLeft]);
                currentLeft++;
            }

            while (currentRight > query.right) {
                remove(compressedValues[currentRight]);
                currentRight--;
            }

            answers[query.index] = distinctCount;
        }

        return answers;
    }

    private void add(int valueId) {
        if (frequency[valueId] == 0) {
            distinctCount++;
        }

        frequency[valueId]++;
    }

    private void remove(int valueId) {
        frequency[valueId]--;

        if (frequency[valueId] == 0) {
            distinctCount--;
        }
    }

    private int[] compress(int[] values) {
        int[] sorted = values.clone();
        Arrays.sort(sorted);

        Map<Integer, Integer> ids = new HashMap<>();
        int nextId = 0;

        for (int value : sorted) {
            if (!ids.containsKey(value)) {
                ids.put(value, nextId);
                nextId++;
            }
        }

        int[] result = new int[values.length];

        for (int i = 0; i < values.length; i++) {
            result[i] = ids.get(values[i]);
        }

        return result;
    }

    public static void main(String[] args) {
        int[] values = {5, -2, 5, 7, -2, 9, 5};

        int[][] ranges = {
            {0, 2},
            {1, 4},
            {2, 6},
            {3, 5}
        };

        MosAlgorithmDistinct solver = new MosAlgorithmDistinct();
        int[] answers = solver.countDistinctInRanges(values, ranges);

        System.out.println(Arrays.toString(answers));
    }
}

The output is:

[2, 3, 4, 3]

We can read the sample ranges directly. The range [0, 2] contains 5, -2, and 5, so it has 2 distinct values. The range [1, 4] contains -2, 5, 7, and -2, giving 3 distinct values. The range [2, 6] contains 5, 7, -2, 9, and 5, giving 4 distinct values. The range [3, 5] contains 7, -2, and 9, giving 3 distinct values.

The compression method keeps the frequency array compact. We clone and sort the input values, assign each distinct value a small id, then replace every original value with its id. For the sample input, the distinct values are -2, 5, 7, and 9, so they can be stored as ids from 0 through 3. The frequency array counts those ids instead of the original numbers.

The total time has three main parts. Compression sorts the values, so it costs O(n log n) time. Query sorting costs O(q log q) time. The active-window movement costs O((n + q) * sqrt(n)) when add and remove run in O(1) time. That gives a full time bound of O(n log n + q log q + (n + q) * sqrt(n)). The extra space is O(n + q) for the compressed values, frequency array, query storage, and answer array.

Conclusion

Mo’s Algorithm works by sorting static range queries into an order that keeps the active window from jumping across the array more than needed. The left and right boundaries move toward each requested range, while add and remove update the saved frequency state as values enter or leave. For distinct counts, that means the answer comes from the current frequency table instead of a fresh scan for every query. With value compression included, the full cost is O(n log n + q log q + (n + q) * sqrt(n)) time and O(n + q) extra space.

Share Alexander Obregon's Substack

Alexander Obregon's Substack

Discussion about this post

Ready for more?