Divide and Conquer // Megha Bose

The Paradigm

Divide and conquer solves a problem by (1) dividing it into $a$ subproblems each of size $n/b$, (2) conquering each recursively, and (3) combining the results. The resulting recurrence is:

$$T(n) = aT!\left(\frac{n}{b}\right) + f(n)$$

where $f(n)$ is the cost of dividing and combining. Understanding what this resolves to is the job of the Master Theorem.

The Master Theorem

Let $T(n) = aT(n/b) + f(n)$ with $a \geq 1$, $b > 1$. Define the watershed function $n^{\log_b a}$, which is the total work across all leaves of the recursion tree.

Case 1. If $f(n) = O(n^{\log_b a - \varepsilon})$ for some $\varepsilon > 0$, then the leaves dominate: $$T(n) = \Theta(n^{\log_b a})$$

Case 2. If $f(n) = \Theta(n^{\log_b a} \log^k n)$ for $k \geq 0$, then all levels contribute equally: $$T(n) = \Theta(n^{\log_b a} \log^{k+1} n)$$

Case 3. If $f(n) = \Omega(n^{\log_b a + \varepsilon})$ and $af(n/b) \leq cf(n)$ for some $c < 1$ (regularity condition), then the root dominates: $$T(n) = \Theta(f(n))$$

Proof sketch. Unroll the recursion tree. At depth $d$, there are $a^d$ subproblems of size $n/b^d$, each contributing $f(n/b^d)$ work. Total work is $\sum_{d=0}^{\log_b n} a^d f(n/b^d)$. In Case 1 this sum is dominated by the last term (there are $a^{\log_b n} = n^{\log_b a}$ leaves); in Case 3 by the first term; in Case 2 the terms are equal and there are $\log n$ levels.

Merge Sort

Merge sort is the canonical divide-and-conquer algorithm.

MergeSort(A, lo, hi):
    if lo >= hi: return
    mid = (lo + hi) / 2
    MergeSort(A, lo, mid)
    MergeSort(A, mid+1, hi)
    Merge(A, lo, mid, hi)       // O(n) combine step

Recurrence: $T(n) = 2T(n/2) + O(n)$. By Master Theorem Case 2 ($a=2, b=2$, so $n^{\log_2 2} = n$, $f(n) = \Theta(n)$):

$$T(n) = \Theta(n \log n)$$

The merge step is stable and requires $O(n)$ auxiliary space. Space: $O(n)$.

Closest Pair of Points in $O(n \log n)$

Given $n$ points in the plane, find the pair with minimum Euclidean distance. Naive: $O(n^2)$.

Algorithm:

Sort points by $x$-coordinate. Let $L$ = left half, $R$ = right half, midline $x = x_m$.
Recursively find $\delta_L = \text{closest}(L)$ and $\delta_R = \text{closest}(R)$. Set $\delta = \min(\delta_L, \delta_R)$.
Strip step: Collect all points within $\delta$ of $x_m$. Sort these by $y$. For each point, check at most 7 subsequent points (classical geometric argument: at most 8 points fit in a $2\delta \times \delta$ box).

Recurrence: $T(n) = 2T(n/2) + O(n \log n)$ (due to the $y$-sort in the strip). With pre-sorting, the strip step is $O(n)$, giving $T(n) = 2T(n/2) + O(n) = O(n \log n)$.

Strassen Matrix Multiplication

Naive matrix multiplication of two $n \times n$ matrices costs $O(n^3)$ via 8 recursive multiplications on $n/2 \times n/2$ blocks:

$$T_{\text{naive}}(n) = 8T(n/2) + O(n^2) = O(n^3)$$

Strassen’s insight: algebraically combine the 8 block products into 7 products using additions:

$$T(n) = 7T(n/2) + O(n^2)$$

By Master Theorem Case 1 ($n^{\log_2 7} \approx n^{2.807}$ vs $f(n) = O(n^2)$):

$$T(n) = O(n^{\log_2 7}) \approx O(n^{2.807})$$

The constant factor is large, so Strassen is practical only for $n \gtrsim 500$.

Karatsuba Multiplication

To multiply two $n$-digit integers, write $x = x_1 \cdot B^{n/2} + x_0$ and $y = y_1 \cdot B^{n/2} + y_0$. Naive needs 4 multiplications. Karatsuba uses 3:

$$z_2 = x_1 y_1, \quad z_0 = x_0 y_0, \quad z_1 = (x_1 + x_0)(y_1 + y_0) - z_2 - z_0$$

Then $xy = z_2 B^n + z_1 B^{n/2} + z_0$.

$$T(n) = 3T(n/2) + O(n) \implies T(n) = O(n^{\log_2 3}) \approx O(n^{1.585})$$

This is strictly better than $O(n^2)$ schoolbook multiplication.

FFT as Divide and Conquer

The Discrete Fourier Transform (DFT) of a polynomial $A(x) = \sum_{k=0}^{n-1} a_k x^k$ evaluated at the $n$-th roots of unity $\omega_n^0, \ldots, \omega_n^{n-1}$ costs $O(n^2)$ naively. The Cooley-Tukey FFT splits $A$ into even and odd indexed coefficients:

$$A(x) = A_{\text{even}}(x^2) + x \cdot A_{\text{odd}}(x^2)$$

Each half is an $(n/2)$-point DFT, and combining takes $O(n)$ (the “butterfly” operations):

$$T(n) = 2T(n/2) + O(n) = O(n \log n)$$

Median of Medians (Linear-Time Selection)

To find the $k$-th smallest element in $O(n)$ worst case:

Divide $n$ elements into groups of 5; find median of each group: $O(n)$.
Recursively find median of the $\lceil n/5 \rceil$ medians: $T(n/5)$.
Partition around this “median of medians” pivot. At least $3n/10 - 6$ elements are guaranteed to be on each side.
Recurse on the correct partition: $T(7n/10 + 6)$.

$$T(n) \leq T(n/5) + T(7n/10 + 6) + O(n) = O(n)$$

(Solved by substitution: assume $T(n) \leq cn$, verify constants hold.)

Examples

Inversion counting. Count pairs $(i,j)$ with $i < j$ but $A[i] > A[j]$. Modify merge sort: when merging and an element from the right sub-array is placed before remaining left elements, add the count of remaining left elements to the total. Runs in $O(n \log n)$.

Binary search as trivial D&C. $T(n) = T(n/2) + O(1)$. By Master Theorem Case 2 ($a=1, b=2, n^{\log_2 1} = 1, f = O(1) = \Theta(1)$): $T(n) = O(\log n)$. The “combine” step is empty; all work is in comparing at the midpoint.

Read Next: Dynamic Programming