Pigeonhole Principle - Too Many Objects, Too Few Boxes
Helpful context:
- Counting & Choice - How to Count Without Counting Everything
- Proof Techniques - The Toolkit for Mathematical Certainty
The pigeonhole principle is, at first glance, embarrassingly obvious: if you have more objects than boxes, some box must hold more than one object. You could state it to a child and they would nod along. So why does it appear at the heart of some of the deepest results in combinatorics, number theory, and theoretical computer science? The answer is that the principle itself is trivial - the craft lies entirely in recognizing what the “pigeons” and “holes” should be in a given situation. This requires genuine mathematical insight, and getting it right is what separates routine problems from elegant proofs.
The principle generates results across an enormous range of scales. At one end, it resolves elementary puzzles about socks in a drawer. At the other end, it implies that any lossless compression algorithm must make some inputs longer (an information-theoretic limit that is independent of technology), that any 2-coloring of $K_6$ must contain a monochromatic triangle (the first nontrivial Ramsey number), and that for any continuous function from the 2-sphere to the plane, some pair of antipodal points must be mapped to the same value (Borsuk-Ulam). The underlying mechanism is always the same counting argument, but the identification of the right structure is the real work.
The Statement and Its Generalization
Theorem (Pigeonhole Principle). If $n+1$ objects are distributed among $n$ containers, at least one container holds at least 2 objects.
Proof. Suppose for contradiction that every container holds at most 1 object. Then the total number of objects is at most $n \cdot 1 = n$. But we have $n+1$ objects, a contradiction. $\square$
The proof is a single line. The content of the principle is entirely in its applications.
Theorem (Generalized Pigeonhole Principle). If $nm + 1$ objects are distributed among $n$ containers, at least one container holds at least $m+1$ objects.
Proof. If every container held at most $m$ objects, the total would be at most $nm$, contradicting the assumption that there are $nm+1$ objects. $\square$
Equivalently: if no container holds more than $m$ objects, the total is at most $nm$. The contrapositive: more than $nm$ objects forces at least one container to hold more than $m$.
A useful corollary: if $n$ objects are distributed among $k$ containers, some container holds at least $\lceil n/k \rceil$ objects. This ceiling form appears constantly in algorithm analysis and extremal combinatorics.
Classic Applications
Birth months. Among any 13 people, two share a birth month. The pigeons are the 13 people, the holes are the 12 months. By pigeonhole, some month contains at least 2 people.
Points on a sphere. Among any 5 points on a unit sphere, two are within distance 1 of each other. Divide the surface of the sphere into 4 regions by two great circles. Among 5 points, by pigeonhole two lie in the same region. With appropriate regions of diameter less than 1, those two points are within distance 1.
Remainders mod $n$. Among any $n+1$ integers, two have the same remainder when divided by $n$. The holes are the $n$ possible remainders $\{0, 1, \ldots, n-1\}$, and the pigeons are the $n+1$ integers. If $a$ and $b$ share a remainder, then $n \mid (a - b)$.
This last fact connects to the birthday paradox. The paradox asks: how many people must be in a room before the probability that some two share a birthday exceeds 50%? The answer (around 23) surprises people because they conflate “same birthday as me” with “same birthday as anyone.” The pigeonhole principle gives the deterministic version: with 366 people, two must share a birthday. The probabilistic question is subtler, but the structural observation is the same - the number of pairs grows quadratically in the number of people, making collisions far more likely than naive intuition suggests.
The Erdős-Szekeres Theorem
One of the most striking applications of pigeonhole to sequences:
Theorem (Erdős-Szekeres, 1935). Any sequence of more than $(r-1)(s-1)$ distinct integers contains either an increasing subsequence of length $r$ or a decreasing subsequence of length $s$.
Proof. Let the sequence be $a_1, a_2, \ldots, a_N$ where $N > (r-1)(s-1)$. For each index $i$, define:
- $\alpha_i$ = length of the longest increasing subsequence ending at $a_i$
- $\beta_i$ = length of the longest decreasing subsequence ending at $a_i$
Assign to each $a_i$ the pair $(\alpha_i, \beta_i)$.
If any $\alpha_i \geq r$, we have an increasing subsequence of length $r$ and are done. If any $\beta_i \geq s$, we have a decreasing subsequence of length $s$ and are done. Otherwise, every pair $(\alpha_i, \beta_i)$ lies in $\{1, \ldots, r-1\} \times \{1, \ldots, s-1\}$, giving at most $(r-1)(s-1)$ possible pairs.
But we have $N > (r-1)(s-1)$ indices, so by pigeonhole two indices $i < j$ satisfy $(\alpha_i, \beta_i) = (\alpha_j, \beta_j)$. Since all integers are distinct, either $a_i < a_j$ or $a_i > a_j$.
- If $a_i < a_j$: any increasing subsequence ending at $a_i$ extends to one ending at $a_j$, so $\alpha_j \geq \alpha_i + 1 > \alpha_i$. Contradiction.
- If $a_i > a_j$: any decreasing subsequence ending at $a_i$ extends to one ending at $a_j$, so $\beta_j \geq \beta_i + 1 > \beta_i$. Contradiction.
Both cases are contradictions, so our assumption that $\alpha_i < r$ and $\beta_i < s$ for all $i$ must fail. $\square$
The bound $(r-1)(s-1)$ is tight: arrange $r-1$ decreasing blocks each of length $s-1$ to produce a sequence of $(r-1)(s-1)$ terms with no increasing subsequence of length $r$ and no decreasing subsequence of length $s$.
Ramsey Theory: The Impossibility of Complete Disorder
Ramsey theory asks: in any sufficiently large combinatorial structure, some regular pattern must appear. The pigeonhole principle is the foundational tool.
Theorem. $R(3,3) = 6$: any 2-coloring of the edges of $K_6$ (using red and blue) contains a monochromatic triangle.
Proof. Pick any vertex $v$. It has 5 edges. By pigeonhole, at least $\lceil 5/2 \rceil = 3$ of these edges share a color - say red - connecting $v$ to vertices $u_1, u_2, u_3$.
Now examine the 3 edges among $\{u_1, u_2, u_3\}$.
- If any edge $u_i u_j$ is red: then $v, u_i, u_j$ form a red triangle. Done.
- If no edge among $\{u_1, u_2, u_3\}$ is red: then all three edges are blue, so $u_1, u_2, u_3$ form a blue triangle. Done.
In either case, a monochromatic triangle exists. $\square$
The bound is tight: $K_5$ can be 2-colored without any monochromatic triangle. A Ramsey number $R(r,s)$ is the minimum $n$ such that every 2-coloring of $K_n$ contains a red $K_r$ or blue $K_s$. It is known that $R(4,4) = 18$; the value of $R(5,5)$ remains unknown after decades of effort, known only to lie in $[43, 48]$. Erdős famously remarked: if an alien threatened to destroy Earth unless we computed $R(5,5)$, we should mobilize all mathematicians to work on it. But if they asked for $R(6,6)$, we should try to destroy the alien first.
The Information-Theoretic Limit on Compression
Theorem. No lossless compression algorithm can shorten every input.
More precisely: there is no injective function $f: \{0,1\}^n \to \{0,1\}^{<n}$ mapping $n$-bit strings to strings strictly shorter than $n$ bits.
Proof. The number of binary strings of length strictly less than $n$ is
$$\sum_{k=0}^{n-1} 2^k = 2^n - 1.$$
There are $2^n$ strings of length exactly $n$. By pigeonhole, any function from a set of size $2^n$ to a set of size $2^n - 1$ must fail to be injective: at least two distinct $n$-bit strings map to the same shorter string. Lossless decompression requires injectivity, so no such compression scheme exists. $\square$
This is not a statement about any particular algorithm - it is a fundamental limit on what information-theoretic compression can achieve, independent of computational power. Any lossless scheme that shrinks some files must expand others. This is why compressing an already-compressed file tends to make it slightly larger: the first compression has already pushed the file toward the informational minimum.
The Continuous Version: Borsuk-Ulam
The continuous analog of the pigeonhole principle is the Borsuk-Ulam theorem.
Theorem (Borsuk-Ulam). For any continuous function $f: S^n \to \mathbb{R}^n$, there exists a point $x \in S^n$ such that $f(x) = f(-x)$.
Here $S^n$ is the $n$-sphere (surface of the unit ball in $\mathbb{R}^{n+1}$) and $-x$ is the antipodal point.
The $n = 1$ case says any continuous function from the circle to the line has two antipodal points with the same value. Define $g(x) = f(x) - f(-x)$. Then $g(-x) = f(-x) - f(x) = -g(x)$, so $g$ changes sign somewhere (intermediate value theorem), meaning $g(x_0) = 0$, i.e., $f(x_0) = f(-x_0)$.
The $n = 2$ case has a vivid physical interpretation: at any moment, there exist two antipodal points on the surface of the Earth with exactly the same temperature and the same barometric pressure. The pair (temperature, pressure) is a continuous function from the Earth’s surface (a 2-sphere) to $\mathbb{R}^2$, and Borsuk-Ulam guarantees an antipodal pair with identical values.
The proof of the general case requires algebraic topology - specifically the nontriviality of the $n$-th cohomology group of $S^n$ - but the moral is the same as the finite pigeonhole principle: the domain is “too rich” relative to the target, and some collision is unavoidable.
Summary
| Application | Pigeons | Holes | Conclusion |
|---|---|---|---|
| 13 people, 12 months | people | months | two share a month |
| $n+1$ integers mod $n$ | remainders | $\{0, \ldots, n-1\}$ | two are congruent |
| Sequence of $(r-1)(s-1)+1$ | pairs $(\alpha_i, \beta_i)$ | $(r-1)(s-1)$ pairs | monotone subsequence exists |
| $K_6$ 2-coloring | vertex’s 5 edges | 2 colors | monochromatic triangle |
| $2^n$ inputs, compressed | inputs | $< 2^n$ shorter strings | two inputs collide |
| $f: S^2 \to \mathbb{R}^2$ continuous | antipodal pairs | codomain values | antipodal collision |
The pattern is always the same: count the pigeons, count the holes, observe that there are more of the former. The mathematics is in choosing the right objects to count.
Read next: