Hopcroft Karp Matching in Java

May 22, 2026

Graph matching appears in job assignment, course registration, pairing engines, and other cases where one side needs to connect to the other without reusing the same node. In a bipartite graph, every edge crosses from a left partition to a right partition, which makes matching more manageable than it is in a general graph. Hopcroft-Karp fits this graph type because it searches in phases instead of chasing augmenting paths one at a time. It builds layers with BFS, then sends DFS through only the shortest valid routes found during that phase. That reduces the running time to O(E * sqrt(V)), where V is the total number of vertices and E is the number of edges. The space cost is O(V + E) for the adjacency lists, pairing arrays, and distance array. In Java, the algorithm fits naturally with adjacency lists, primitive arrays, and a queue, which makes it useful for interviews and production-style matching problems.

Bipartite Matching Basics

The matching idea starts with ownership. We choose edges from a graph, but no chosen edge can reuse a vertex that already belongs to a chosen edge. That rule turns the graph into a pairing problem rather than a reachability problem. For Hopcroft-Karp, the graph must be bipartite, meaning the vertices are split into two partitions and every valid edge crosses between them. From there, we improve the current matching by finding an alternating route that starts from a free left vertex and ends at a free right vertex, then flipping which edges are selected along that route.

What Makes a Graph Bipartite

Bipartite graphs have two separate vertex groups. We can call them the left side and the right side. Every edge crosses from one side to the other, so the graph has no left-to-left edges and no right-to-right edges. That split gives matching algorithms a structure they can read directly from the input.

For matching problems, the left side usually represents one category, while the right side represents a second category. In a course registration problem, the left side could hold students and the right side could hold lab sections. In a job assignment problem, the left side could hold workers and the right side could hold shifts. An edge means that pairing is allowed. If left vertex u has an edge to right vertex v, then u can be matched with v.

The matching itself is a group of chosen edges where no vertex appears more than a single time. If u1 is matched to v2, then u1 cannot also be matched to v3, and v2 cannot also be matched to u4. This is why a matching represents final pairings rather than a list of every available option.

We can represent the left-to-right edge list in Java without storing the right side as a second full adjacency structure:

import java.util.ArrayList;
import java.util.List;

public class BipartiteChoices {
    private final List<List<Integer>> edgesFromLeft;

    public BipartiteChoices(int leftCount) {
        edgesFromLeft = new ArrayList<>();

        for (int i = 0; i <= leftCount; i++) {
            edgesFromLeft.add(new ArrayList<>());
        }
    }

    public void addAllowedPair(int leftVertex, int rightVertex) {
        edgesFromLeft.get(leftVertex).add(rightVertex);
    }

    public List<Integer> rightChoicesFor(int leftVertex) {
        return edgesFromLeft.get(leftVertex);
    }
}

With this class, we store only edges that start from the left side. That is enough for a bipartite matching algorithm because every value inside an inner list points to a right-side vertex. The graph still has two partitions, but the adjacency list stays compact and easy to scan.

Vertex numbering in matching code commonly starts at 1, with 0 reserved as a marker for no current match. That convention helps later because an array value of 0 can mean the vertex is free. The graph itself does not require this numbering plan, but it keeps the matching state small by letting us store partner ids in primitive int arrays.

We can also write a small check for the matching rule itself. This does not find the best matching. It only checks that the chosen pairs do not reuse any left or right vertex:

public static boolean isValidMatching(int leftCount, int rightCount, int[][] chosenPairs) {
    boolean[] usedLeft = new boolean[leftCount + 1];
    boolean[] usedRight = new boolean[rightCount + 1];

    for (int[] pair : chosenPairs) {
        int left = pair[0];
        int right = pair[1];

        if (usedLeft[left] || usedRight[right]) {
            return false;
        }

        usedLeft[left] = true;
        usedRight[right] = true;
    }

    return true;
}

A pair list such as { {1, 2}, {2, 3}, {3, 1} } is valid if all left vertices and all right vertices appear at most a single time. The pair list { {1, 2}, {1, 3} } fails because left vertex 1 is reused. The pair list { {1, 2}, {3, 2} } fails because right vertex 2 is reused. This validity check runs in O(M) time, where M is the number of chosen pairs, because we scan the chosen edges a single time. The space cost is O(L + R) for the two boolean arrays, where L is the number of left vertices and R is the number of right vertices.

Maximum matching asks for the largest valid set of pairs. We cannot just accept the first valid pairs we find, because an early pair can block a better final result later. Augmentation gives us a legal way to revise earlier choices while growing the matching.

What Augmentation Changes

An augmenting route starts at a free vertex on the left side and ends at a free vertex on the right side. Free means the vertex does not belong to the current matching. The route alternates between edges that are not currently matched and edges that are currently matched. That alternating order is what lets the matching grow by a single pair.

Think of the current matching as a set of commitments. An augmenting route finds a chain where a free left vertex can take an unmatched edge, then an already matched right vertex gives up its current left partner, then that left partner moves to a different right vertex. The chain keeps following that alternating rule until it reaches a free right vertex. After that, we flip the selected status of the edges along the route. Edges that were outside the matching become selected, and edges that were selected are removed.

The result grows by exactly a single pair because the route starts and ends at free vertices. Along the middle of the route, selected and unselected edges cancel each other out. The two ends create the gain.

This small Java example tracks the flip on one augmenting route. The route is written as alternating left and right vertex ids. The current matching has left 2 matched to right 1, while left 1 and right 2 are free before the flip:

public class AugmentingRouteFlip {
    public static void main(String[] args) {
        int[] pairU = new int[4];
        int[] pairV = new int[4];

        pairU[2] = 1;
        pairV[1] = 2;

        int[] leftRoute = {1, 2};
        int[] rightRoute = {1, 2};

        for (int i = 0; i < leftRoute.length; i++) {
            int left = leftRoute[i];
            int right = rightRoute[i];

            pairU[left] = right;
            pairV[right] = left;
        }

        System.out.println(pairU[1]);
        System.out.println(pairU[2]);
    }
}

We can read the route behind that code as left 1 to right 1, then through the current match from right 1 back to left 2, then from left 2 to right 2. After the flip, left 1 is matched to right 1, and left 2 is matched to right 2. The previous match between left 2 and right 1 disappears because right 1 now belongs to left 1.

The code assigns the new pairs in route order, which is enough for this tiny case because the old middle match gets overwritten. A full matching algorithm has to find the augmenting route before it can flip it, but the change itself follows the same idea. We gain one more edge while still obeying the no-reuse rule on both sides.

The reason this is safe comes from the alternating layout. Every internal vertex on the route touches one old matched edge and one new matched edge after the flip. That means no internal vertex ends up with two matches. The first left vertex was free before the flip, so gaining a match is valid. The last right vertex was free before the flip, so gaining a match is valid as well.

If no augmenting route exists, the current matching is already maximum. This is the main theorem behind augmenting-route matching algorithms. Hopcroft-Karp builds on that fact by searching for several shortest augmenting routes in phases, but the basic operation stays the same. We find a valid alternating route from a free left vertex to a free right vertex, flip the selected edges along that route, and increase the matching size by one.

Building Hopcroft Karp in Java

The Java version becomes much easier to follow when we separate the algorithm state from the search steps. The graph tells us which left-side vertices can connect to which right-side vertices. The pairing arrays tell us the current matching. The distance array belongs to the current BFS phase, and it tells DFS which routes are still allowed during that phase.

Data Layout for the Left Side

We store the graph from the left side only. Each left vertex owns a list of right vertices it can reach, so graph.get(u) gives every right-side option for left vertex u. That fits the bipartite graph layout because every edge crosses from left to right. We do not need a second adjacency list for the right side to run Hopcroft-Karp. Numbering the vertices from 1 keeps the matching arrays readable. Index 0 can stay reserved as NIL, meaning no partner. If pairU[u] == 0, then left vertex u is unmatched. If pairV[v] == 0, then right vertex v is unmatched. That convention also lets the code use primitive int arrays instead of wrapper objects or nullable values.

We can start with the fields that hold the graph and matching state:

private static final int NIL = 0;
private static final int INF = Integer.MAX_VALUE;

private final int leftSize;
private final int rightSize;
private final List<List<Integer>> graph;
private final int[] pairU;
private final int[] pairV;
private final int[] dist;

The pairU array maps a left vertex to its matched right vertex. The pairV array maps a right vertex back to its matched left vertex. Keeping both directions matters because BFS and DFS need to move from a right vertex back to the left vertex that currently owns it.

The constructor sets up one list for every left vertex, including the unused 0 slot. That extra slot avoids constant index adjustments and keeps the vertex ids the same in the input and in the arrays:

public HopcroftKarp(int leftSize, int rightSize) {
    this.leftSize = leftSize;
    this.rightSize = rightSize;
    this.graph = new ArrayList<>(leftSize + 1);

    for (int u = 0; u <= leftSize; u++) {
        graph.add(new ArrayList<>());
    }

    this.pairU = new int[leftSize + 1];
    this.pairV = new int[rightSize + 1];
    this.dist = new int[leftSize + 1];
}

The adjacency list takes O(E) space, where E is the number of edges. The arrays take O(V) space, where V is the total number of vertices across both partitions. The total space cost is O(V + E), which is also the space bound for the full implementation.

Edge insertion should reject vertex ids outside the declared range. That keeps invalid input from turning into an array access error later during the matching search:

public void addEdge(int u, int v) {
    if (u < 1 || u > leftSize || v < 1 || v > rightSize) {
        throw new IllegalArgumentException("Vertex out of range");
    }

    graph.get(u).add(v);
}

With that storage in place, the rest of the algorithm can treat graph.get(u) as the complete set of possible right-side partners for u. The matching state then changes only through pairU and pairV, while the graph itself stays fixed.

Layered BFS

The BFS step builds the layer information for the current phase. We begin from every unmatched left vertex because those are the only places where a new augmenting route can start. Matched left vertices are not starting points, so their distance begins as INF. During the search, we scan edges from a left vertex u to right vertices v. If a right vertex is unmatched, then we have reached the end of a shortest augmenting route for this phase. If the right vertex is already matched, then we can move to its matched left partner and place that left vertex in the next layer.

The method below records the first distance where an unmatched right vertex is reached. The queue still processes left vertices already admitted to valid layers, but it avoids expanding beyond the shortest augmenting route length for the phase:

private boolean bfs() {
    Queue<Integer> queue = new ArrayDeque<>();
    Arrays.fill(dist, INF);

    for (int u = 1; u <= leftSize; u++) {
        if (pairU[u] == NIL) {
            dist[u] = 0;
            queue.offer(u);
        }
    }

    while (!queue.isEmpty()) {
        int u = queue.poll();

        if (dist[u] < dist[NIL]) {
            for (int v : graph.get(u)) {
                int matchedLeft = pairV[v];

                if (dist[matchedLeft] == INF) {
                    dist[matchedLeft] = dist[u] + 1;

                    if (matchedLeft != NIL) {
                        queue.offer(matchedLeft);
                    }
                }
            }
        }
    }

    return dist[NIL] != INF;
}

We store distances for left-side vertices and use dist[NIL] as the marker for the shortest route to an unmatched right vertex. The right vertex acts as the bridge. If v is matched to matchedLeft, then matchedLeft can be visited at dist[u] + 1. This layer rule prevents later search from roaming through every reachable area of the graph. During the current phase, DFS should only follow routes that match the shortest augmenting length discovered by BFS. That is the main reason Hopcroft-Karp is faster than repeatedly finding a single augmenting route with no phase structure.

The cost of one BFS phase is O(E). Every edge from the left-side adjacency lists can be inspected during the pass, and the queue can hold at most O(L) left vertices, where L is the number of left-side vertices. The extra space inside BFS is O(L) for the queue, while the reusable dist array already belongs to the full O(V) state.

Targeted DFS

The DFS step tries to turn the layers from BFS into actual augmenting routes. It starts from an unmatched left vertex and follows edges that respect the layer order. If it reaches an unmatched right vertex, it can flip the route by assigning new partners while the recursion returns.

The layer test is dist[matchedLeft] == dist[u] + 1. That line allows DFS to move into the next valid layer, including NIL when the edge reaches an unmatched right vertex. Without that test, DFS could spend time on routes that do not match the shortest structure found by the latest BFS.

The recursive method stays compact because the pairing arrays carry all matching state:

private boolean dfs(int u) {
    if (u == NIL) {
        return true;
    }

    for (int v : graph.get(u)) {
        int matchedLeft = pairV[v];

        if (dist[matchedLeft] == dist[u] + 1 && dfs(matchedLeft)) {
            pairU[u] = v;
            pairV[v] = u;
            return true;
        }
    }

    dist[u] = INF;
    return false;
}

We can read the assignment block as the flip step. When the method finds a free right vertex, or finds a deeper route that can end at one, pairU[u] = v and pairV[v] = u select the current edge. If v used to belong to another left vertex, that earlier call is repaired deeper in the recursion before the current assignment returns.

The final two lines also matter. When no edge from u can lead to a valid augmenting route in the current phase, dist[u] = INF marks it as dead for this round. Later DFS calls can skip that vertex because it no longer belongs to any useful route under the current layering.

One DFS call can scan several edges, but across a phase the failed layer markings keep repeated scans under control. The combined DFS effort for one phase is O(E), and the recursion depth is bounded by the number of left vertices, so the call stack can reach O(L) in the deepest case. For very large graphs, an iterative version can avoid deep recursion, but the recursive form is common for teaching and interview settings.

Full Java Implementation

The complete class brings the storage, BFS, DFS, and public matching method into one place. The algorithm repeatedly builds layers, then tries to augment from every unmatched left vertex that can still participate in the current phase. When bfs() returns false, no augmenting route remains, so the current matching is maximum.

This version uses ArrayList for adjacency lists, ArrayDeque for the BFS queue, and primitive arrays for matching state:

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class HopcroftKarp {

    private static final int NIL = 0;
    private static final int INF = Integer.MAX_VALUE;

    private final int leftSize;
    private final int rightSize;
    private final List<List<Integer>> graph;
    private final int[] pairU;
    private final int[] pairV;
    private final int[] dist;

    public HopcroftKarp(int leftSize, int rightSize) {
        this.leftSize = leftSize;
        this.rightSize = rightSize;
        this.graph = new ArrayList<>(leftSize + 1);

        for (int u = 0; u <= leftSize; u++) {
            graph.add(new ArrayList<>());
        }

        this.pairU = new int[leftSize + 1];
        this.pairV = new int[rightSize + 1];
        this.dist = new int[leftSize + 1];
    }

    public void addEdge(int u, int v) {
        if (u < 1 || u > leftSize || v < 1 || v > rightSize) {
            throw new IllegalArgumentException("Vertex out of range");
        }

        graph.get(u).add(v);
    }

    public int maximumMatching() {
        int matching = 0;

        while (bfs()) {
            for (int u = 1; u <= leftSize; u++) {
                if (pairU[u] == NIL && dfs(u)) {
                    matching++;
                }
            }
        }

        return matching;
    }

    private boolean bfs() {
        Queue<Integer> queue = new ArrayDeque<>();
        Arrays.fill(dist, INF);

        for (int u = 1; u <= leftSize; u++) {
            if (pairU[u] == NIL) {
                dist[u] = 0;
                queue.offer(u);
            }
        }

        while (!queue.isEmpty()) {
            int u = queue.poll();

            if (dist[u] < dist[NIL]) {
                for (int v : graph.get(u)) {
                    int matchedLeft = pairV[v];

                    if (dist[matchedLeft] == INF) {
                        dist[matchedLeft] = dist[u] + 1;

                        if (matchedLeft != NIL) {
                            queue.offer(matchedLeft);
                        }
                    }
                }
            }
        }

        return dist[NIL] != INF;
    }

    private boolean dfs(int u) {
        if (u == NIL) {
            return true;
        }

        for (int v : graph.get(u)) {
            int matchedLeft = pairV[v];

            if (dist[matchedLeft] == dist[u] + 1 && dfs(matchedLeft)) {
                pairU[u] = v;
                pairV[v] = u;
                return true;
            }
        }

        dist[u] = INF;
        return false;
    }

    public int matchedRightForLeft(int u) {
        if (u < 1 || u > leftSize) {
            throw new IllegalArgumentException("Left vertex out of range");
        }

        return pairU[u];
    }

    public static void main(String[] args) {
        HopcroftKarp matcher = new HopcroftKarp(4, 4);

        matcher.addEdge(1, 1);
        matcher.addEdge(1, 2);
        matcher.addEdge(2, 1);
        matcher.addEdge(2, 3);
        matcher.addEdge(3, 2);
        matcher.addEdge(3, 4);
        matcher.addEdge(4, 3);

        int maxMatching = matcher.maximumMatching();
        System.out.println("Maximum matching size = " + maxMatching);

        for (int u = 1; u <= 4; u++) {
            System.out.println("Left " + u + " is matched with Right " + matcher.matchedRightForLeft(u));
        }
    }
}

We start the example graph with four vertices on the left and four on the right. The calls to addEdge() record allowed pairings only. They do not create matches by themselves. The matches are chosen later by maximumMatching() after the algorithm has searched through augmenting routes.

Inside maximumMatching(), every successful dfs(u) call increases the matching count by one. That is safe because dfs(u) returns true only after it has changed pairU and pairV to add one more valid pair. The loop only starts DFS from unmatched left vertices, which matches the augmenting-route rule from the graph theory side.

The full running time is O(E * sqrt(V)), where E is the number of edges and V is the total number of vertices across both partitions. Each phase costs O(E) across BFS and DFS, and the number of phases is bounded by O(sqrt(V)). The space cost is O(V + E) for the adjacency lists, pairing arrays, distance array, queue, and recursion stack.

Conclusion

Hopcroft-Karp gets its speed from the way it groups matching changes into phases. The search starts from unmatched left-side vertices, uses BFS to build the shortest valid layers, then lets DFS follow only routes that fit those layers. When a route reaches an unmatched right-side vertex, the selected and unselected edges are flipped, which grows the matching by one without reusing any vertex. Repeating that phase structure gives the algorithm its O(E * sqrt(V)) time bound, while the Java version keeps the state compact with adjacency lists, pairU, pairV, dist, and a queue.

Share Alexander Obregon's Substack

Alexander Obregon's Substack

Discussion about this post

Ready for more?