Helpful context:


During World War II, the Soviet Union moved supplies - ammunition, food, fuel - along railroad lines from central cities to the front. US military planners, thinking about how to disrupt the supply chain, wanted to answer a precise question: what is the maximum tonnage of material that can move from the interior to the front per day?

And they noticed something: that number is exactly equal to a completely different quantity - the minimum number of rail lines you’d need to cut to halt all supply movement entirely.

These two problems, which seem unrelated at first, have the same answer. In 1956, L. R. Ford Jr. and D. R. Fulkerson proved this rigorously. Their max-flow min-cut theorem is one of the most elegant results in combinatorial optimization - and it has applications far beyond military logistics. It also turns out to be a special case of something much deeper: LP duality.


Flow Networks

A flow network is a directed graph $G = (V, E)$ with:

  • A capacity function $c(u, v) \geq 0$ on each edge
  • A designated source $s$ (where flow originates)
  • A designated sink $t$ (where flow terminates)

A flow is an assignment $f(u, v)$ to each edge satisfying two conditions:

Capacity constraint: $0 \leq f(u, v) \leq c(u, v)$ for all edges.

Flow conservation: For every vertex $v$ other than $s$ and $t$, the total flow in equals the total flow out: $$\sum_u f(u, v) = \sum_w f(v, w)$$

The value of the flow is the total amount leaving the source: $$|f| = \sum_v f(s, v) - \sum_v f(v, s)$$

The max-flow problem: Find a feasible flow of maximum value.

Example. Consider the following network with edge capacities labeled.

flowchart LR s(["s"]) -->|4| A(["A"]) s -->|2| B(["B"]) A -->|3| t(["t"]) A -->|1| B B -->|2| t

The network can carry at most 5 units from $s$ to $t$: route 3 units along $s \to A \to t$ (saturating $A \to t$) and 2 units along $s \to B \to t$ (saturating both $s \to B$ and $B \to t$). One unit of capacity remains on $s \to A$, but every path continuing from $A$ toward $t$ is blocked. Maximum flow is 5. We will see this confirmed by the min-cut shortly.


The Residual Graph

To reason about how to improve a flow, we use the residual graph $G_f$. For each edge $(u, v)$ with capacity $c$ and current flow $f$:

  • Add a forward edge $(u, v)$ with residual capacity $c - f$ (room to push more flow)
  • Add a backward edge $(v, u)$ with residual capacity $f$ (room to cancel existing flow)

An augmenting path is any path from $s$ to $t$ in the residual graph. The bottleneck of the path is the minimum residual capacity along it. Pushing flow along an augmenting path increases the total flow by the bottleneck amount.

After routing 3 units $s \to A \to t$ and 2 units $s \to B \to t$, the residual graph (showing only edges with nonzero residual capacity) is:

flowchart LR s(["s"]) -->|"1"| A(["A"]) A -->|"3"| s B -->|"2"| s A -->|"1"| B(["B"]) t(["t"]) -->|"3"| A t -->|"2"| B

From $s$ we can reach $A$ (1 unit of forward capacity), and from $A$ we can reach $B$ (1 unit). But $A \to t$ and $B \to t$ are both saturated: $t$ is unreachable from $s$. No augmenting path exists - the flow is optimal.

The backward edges (like $A \to s$ with capacity 3) encode how much previously routed flow could be “undone” if a future path needed to reroute it.


Ford-Fulkerson: Augmenting Paths

The Ford-Fulkerson method repeatedly finds augmenting paths and pushes flow along them until no augmenting path remains. Using BFS to find each path (the Edmonds-Karp choice) gives the polynomial $O(VE^2)$ bound:

from collections import deque, defaultdict

def ford_fulkerson(capacity, source, sink):
    # capacity[u][v] = capacity of edge u -> v
    # Build residual graph as a copy of capacity (missing edges default to 0)
    residual = defaultdict(lambda: defaultdict(int))
    for u in capacity:
        for v, cap in capacity[u].items():
            residual[u][v] += cap

    def bfs():
        # Returns parent map if augmenting path exists, else None
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if v not in parent and cap > 0:
                    parent[v] = u
                    if v == sink:
                        return parent
                    queue.append(v)
        return None

    max_flow = 0
    while True:
        parent = bfs()
        if parent is None:
            break
        # Trace path back from sink to find bottleneck capacity
        bottleneck = float('inf')
        v = sink
        while v != source:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u
        # Push bottleneck units along path, update residual graph
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck  # backward edge
            v = u
        max_flow += bottleneck

    return max_flow

When no augmenting path exists from $s$ to $t$ in the residual graph, we have found a maximum flow.

But why? This is the content of the max-flow min-cut theorem.


Max-Flow Min-Cut

Max-flow min-cut theorem. The value of a maximum flow equals the capacity of a minimum $s$-$t$ cut.

An $s$-$t$ cut $(S, T)$ partitions $V$ into $S$ (containing $s$) and $T$ (containing $t$). Its capacity is the total capacity of edges directed from $S$ to $T$:

$$\text{cap}(S, T) = \sum_{\substack{u \in S, v \in T \\ (u,v) \in E}} c(u, v)$$

Flow is always $\leq$ cut capacity. Any unit of flow traveling from $s$ to $t$ must cross the cut boundary at least once - it starts in $S$ and ends in $T$, so at some point it uses an edge from $S$ to $T$. Since every such crossing uses edge capacity, the total flow is bounded by the total capacity of edges crossing the cut. This holds for every cut: the max flow is a lower bound, every cut is an upper bound.

At the optimum they meet. When Ford-Fulkerson terminates, let $S$ = vertices reachable from $s$ in the residual graph, $T = V \setminus S$. Since $t$ is not reachable, $(S, T)$ is a valid cut. Every forward edge from $S$ to $T$ must be saturated - if any residual capacity remained, its endpoint would be in $S$. Every backward edge from $T$ to $S$ carries zero flow - otherwise the residual graph would contain a corresponding forward arc from $S$ reaching further into $T$. Therefore: the current flow value equals this cut’s capacity. Since flow $\leq$ any cut and this flow equals this cut, both are optimal simultaneously.

In the example, $S = \{s, A, B\}$, $T = \{t\}$:

flowchart LR subgraph S ["S = {s, A, B}"] s(["s"]) A(["A"]) B(["B"]) end subgraph T ["T = {t}"] t(["t"]) end s -->|4| A s -->|2| B A -.->|"3 - saturated"| t A -->|1| B B -.->|"2 - saturated"| t

The cut capacity is $c(A, t) + c(B, t) = 3 + 2 = 5$, matching the max flow exactly. The dashed edges are the ones crossing the cut - both saturated, as expected.

Complexity. With integer capacities and maximum flow $F$: each augmentation increases the flow by at least 1, so at most $F$ augmentations, each taking $O(E)$ to find a path. Total: $O(EF)$. This is pseudopolynomial - if $F$ is large, this is slow.


The LP View: Why Max Equals Min

The max-flow min-cut theorem is often presented as a clever combinatorial argument. But it is actually a consequence of LP duality - a general theorem about linear programs. Seeing the connection reveals why the equality is structurally inevitable rather than a happy coincidence.

Max-Flow as a Linear Program

The max-flow problem has a natural LP formulation. Variables are the edge flows $f(u, v)$; the objective is to maximize the net flow out of $s$:

$$\textbf{Maximize} \quad \sum_{v} f(s, v)$$

$$\textbf{subject to} \quad f(u, v) \leq c(u, v) \quad \text{for all } (u, v) \in E \tag{capacity}$$

$$\sum_{u} f(u, v) = \sum_{w} f(v, w) \quad \text{for all } v \neq s, t \tag{conservation}$$

$$f(u, v) \geq 0$$

Every constraint is linear in the flow variables. This is a valid LP with $|E|$ variables.

The Dual LP Is Min-Cut

Every LP has a companion called its dual: a new optimization problem whose variables correspond to the constraints of the original. LP strong duality says the optimal values of primal and dual always agree (when both are feasible and bounded, which they are here).

To build the dual of max-flow, assign a multiplier to each constraint:

  • $\alpha(u, v) \geq 0$ for each capacity constraint - think of this as “how much of edge $(u, v)$ is in the cut”
  • $d(v)$ free for each conservation constraint - a “potential” at vertex $v$, normalized so $d(s) = 1$ and $d(t) = 0$

Working through the standard dual construction, the dual LP is:

$$\textbf{Minimize} \quad \sum_{(u,v) \in E} c(u, v) \cdot \alpha(u, v)$$

$$\textbf{subject to} \quad \alpha(u, v) \geq d(u) - d(v) \quad \text{for all } (u, v) \in E$$

$$d(s) = 1, \quad d(t) = 0, \quad \alpha(u, v) \geq 0, \quad d(v) \geq 0$$

Interpreting the dual variables. Think of $d(v)$ as measuring how firmly $v$ belongs to the source side: $d(s) = 1$ is fully source-side, $d(t) = 0$ is fully sink-side. The constraint $\alpha(u, v) \geq d(u) - d(v)$ says: if edge $(u, v)$ drops in potential (crosses from the source side toward the sink side), you must “pay” at least the potential drop. The objective minimizes total payment, weighted by capacity.

In the integer case: $d(v) \in \{0, 1\}$ indicates which side of a cut $v$ belongs to, and $\alpha(u, v) = 1$ for each edge going from $S$ (where $d = 1$) to $T$ (where $d = 0$). The dual objective then equals the cut capacity exactly. So the dual LP is a fractional relaxation of the min-cut problem.

The two programs are:

  • Primal: fractional max-flow
  • Dual: fractional min-cut

Strong LP duality immediately gives: fractional max flow = fractional min cut.

Integrality via Total Unimodularity

LP duality gives the result for fractional solutions, but we want integer flows and integer cuts. This is resolved by total unimodularity.

The constraint matrix of the max-flow LP is the edge-vertex incidence matrix of a directed graph. It is totally unimodular (TU): every square submatrix has determinant $0$, $+1$, or $-1$. By a classical theorem of LP, when the constraint matrix is TU and all right-hand sides are integers, every extreme point of the feasible polyhedron is an integer vector. Since LP optima occur at extreme points:

$$\text{LP max flow} = \text{integer max flow}, \qquad \text{LP min cut} = \text{integer min cut}$$

Combined with LP strong duality:

$$\text{integer max flow} = \text{integer min cut capacity}$$

This is the max-flow min-cut theorem - derived entirely from LP structure, with no combinatorial cleverness required.

Complementary Slackness: The Ford-Fulkerson Certificate

LP duality comes with an optimality certificate: complementary slackness. A primal-dual pair $(f, (d, \alpha))$ is simultaneously optimal if and only if:

Primal CS: $\alpha(u, v) > 0 \Rightarrow f(u, v) = c(u, v)$. If an edge is “in the cut” (positive dual weight), it must be saturated.

Dual CS: $f(u, v) > 0 \Rightarrow \alpha(u, v) = d(u) - d(v)$. If an edge carries flow, the dual constraint must be tight.

Ford-Fulkerson terminates when $t$ is unreachable from $s$ in the residual. At that moment, define $S$ = reachable vertices from $s$, $T = V \setminus S$, and set $d(v) = 1$ for $v \in S$, $d(v) = 0$ for $v \in T$, $\alpha(u, v) = 1$ for edges from $S$ to $T$.

  • Every edge from $S$ to $T$ is saturated (otherwise it would have residual capacity and its endpoint would be reachable into $S$). Primal CS holds.
  • Every edge from $T$ to $S$ carries zero flow (otherwise the corresponding backward arc in the residual graph would make that vertex reachable from $s$). Dual CS holds.

Both conditions are met simultaneously - the current flow and the induced cut are both optimal. Ford-Fulkerson does not merely produce a maximum flow by luck; it terminates precisely when the LP optimality conditions are satisfied.

The Primal-Dual View of Augmenting Paths

The primal-dual method is a general LP algorithm framework: maintain dual feasibility; find augmenting directions in the “tight subgraph” (edges where the dual constraint holds with equality); augment the primal; repeat.

Ford-Fulkerson is exactly this framework applied to flow. The dual solution is the cut potential $d(v)$. Tight edges are those with remaining residual capacity. Augmenting paths are the primal improving directions. When no augmenting path exists, the dual solution certifies optimality through complementary slackness.

The same primal-dual structure underlies Dijkstra’s algorithm (dual variables are shortest-path distances), bipartite matching (dual variables are vertex covers), and many other combinatorial algorithms. What looks like a clever graph trick is, in each case, an instance of one mathematical principle.


Edmonds-Karp: Polynomial Guarantee

Edmonds and Karp fixed the inefficiency of Ford-Fulkerson by always finding augmenting paths using BFS (shortest path in terms of number of edges):

Edmonds-Karp runs in $O(VE^2)$ time, independent of flow values.

The key insight: BFS finds the shortest augmenting path. Under BFS, the distance from $s$ to any vertex in the residual graph never decreases across augmentations. Each edge can become a bottleneck at most $O(V)$ times before the $s$-to-$t$ distance strictly increases. Since distances are bounded by $V$, at most $O(VE)$ total augmentations occur. Each takes $O(E)$. Total: $O(VE^2)$.

This is a polynomial bound, making Edmonds-Karp practical for problems where flow values are large.


Bipartite Matching via Max-Flow

A bipartite graph has vertices split into two sets $U$ and $W$ with edges only between sets (never within). A matching is a set of edges with no two sharing an endpoint. A maximum matching has the largest possible number of edges.

Reduction to max-flow. Build a flow network: add source $s$ with edges $c(s, u) = 1$ for each $u \in U$; add sink $t$ with edges $c(w, t) = 1$ for each $w \in W$; set $c(u, w) = 1$ for each $(u, w) \in E$. Run max-flow. The maximum integer flow equals the size of the maximum matching.

Why does this work? Each unit of flow from $s$ to $t$ represents one matching edge: a unit flows from $s$ into $u$, then along a matching edge to some $w$, then to $t$. The capacity-1 constraints ensure no vertex is used twice.

König’s theorem. In a bipartite graph: $$\text{maximum matching size} = \text{minimum vertex cover size}$$

A vertex cover is a set of vertices touching every edge. This is a direct corollary of max-flow min-cut: a minimum cut in the flow network corresponds exactly to a minimum vertex cover in the bipartite graph. Through the LP lens, König’s theorem is LP duality applied to the bipartite matching LP - the same duality principle, a different combinatorial surface.

Discomfort check. Why do we need backward edges in the residual graph? They feel artificial - why would we ever “un-route” flow? Without backward edges, the greedy choice of the first augmenting path might be irreversible. Suppose you send flow along path $s \to A \to B \to t$, but the optimal solution needed flow along $s \to A \to C \to t$ and $s \to D \to B \to t$. After routing through $A \to B$, the greedy path has committed that edge. Backward edges let you “undo” that commitment: a later augmentation can route $s \to D \to B \to A \to C \to t$, where the $B \to A$ backward edge effectively cancels the earlier $A \to B$ routing. The residual graph is not artificial - it encodes all possible ways to reroute existing flow.


Applications

Airline scheduling. Assign aircraft to routes to maximize flights flown subject to maintenance windows, crew hours, and aircraft availability. This reduces to a flow problem on a time-expanded graph.

Image segmentation. Model each pixel as a vertex; add edges to source $s$ (foreground) and sink $t$ (background) with capacities encoding how likely each pixel belongs to each class. Add edges between adjacent pixels with capacities encoding color similarity. A minimum $s$-$t$ cut partitions pixels into foreground and background - minimizing the “cost” of separation while respecting how similar neighboring pixels are. This is the GrabCut algorithm.

Kidney exchange. Patients need kidney transplants but their willing donors are incompatible with them. Model donors and patients as vertices; add an edge if donor $D_i$’s kidney is compatible with patient $P_j$. Maximum bipartite matching finds the maximum number of transplants that can happen via compatible donor swaps.

Sports elimination. Team $i$ is mathematically eliminated if no combination of remaining game outcomes can get them to first place. This reduces to a max-flow problem: if the max flow is less than the total remaining games among a relevant subset of teams, elimination is confirmed.

Currency arbitrage. Model currencies as vertices and exchange rates as edge weights $-\log(\text{rate})$. A negative cycle in this graph means a profit-generating sequence of conversions. The same LP duality that underlies max-flow min-cut appears here too: Bellman-Ford’s dual variables are exactly the negative-cycle certificates.


Summary

Concept Definition Key result
Flow value Net flow leaving source Bounded above by every cut capacity
Residual graph Forward and backward residual edges Encodes all ways to reroute flow
Augmenting path $s$-$t$ path in residual graph Flow is max iff none exist
Max-flow min-cut Max flow = min cut capacity Follows from LP strong duality + TU
Ford-Fulkerson Augment via any $s$-$t$ path $O(EF)$ - pseudopolynomial
Edmonds-Karp Augment via BFS shortest path $O(VE^2)$ - polynomial
Bipartite matching Reduce to unit-capacity max-flow Max matching = max flow
König’s theorem Max matching = min vertex cover LP duality on bipartite matching LP
Total unimodularity Incidence matrix has det $\in \{-1, 0, 1\}$ Integer LP optima always achieved
Complementary slackness Optimality condition for primal-dual pairs Explains Ford-Fulkerson termination

Ford and Fulkerson proved their theorem thinking about Soviet supply chains. The same framework now runs in image segmentation software, organ transplant systems, and airline scheduling tools. The reason it appears everywhere is the same: they all involve routing something through a constrained network. What makes it remarkable is that the theorem is not just a combinatorial insight - it is a consequence of LP duality, and LP duality applies wherever there is a linear objective and linear constraints. The supply chain problem in 1956 was really a question about the geometry of linear programs, and the answer - max equals min - is one of the deepest structural facts in optimization.


Read next: