Inclusion-Exclusion Principle
Prerequisite:
The inclusion-exclusion principle is a counting technique that corrects for overcounting when sets overlap. It appears throughout combinatorics, probability, and number theory, and is the engine behind the famous derangement formula and Euler’s totient function.
The Two-Set Case
The simplest version of inclusion-exclusion handles two overlapping sets.
Theorem. For any two finite sets $A$ and $B$,
$$|A \cup B| = |A| + |B| - |A \cap B|.$$
Proof. Every element of $A \cup B$ falls into exactly one of three disjoint regions: $A \setminus B$, $B \setminus A$, and $A \cap B$. Counting by region:
$$|A \cup B| = |A \setminus B| + |B \setminus A| + |A \cap B|.$$
Since $|A| = |A \setminus B| + |A \cap B|$ and $|B| = |B \setminus A| + |A \cap B|$, adding these gives $|A| + |B| = |A \setminus B| + |B \setminus A| + 2|A \cap B|$. Substituting:
$$|A| + |B| - |A \cap B| = |A \setminus B| + |B \setminus A| + |A \cap B| = |A \cup B|. \quad \square$$
The Three-Set Case
With three sets the Venn diagram has seven regions:
+-------+ +-------+
| A\BC | | B\AC |
| +--+---+--+ |
| |A∩B|A∩B∩C| |
+----+--+---+--+----+
| A∩C | B∩C |
+--+--+--+--+
| C\AB |
+------+
Theorem. For finite sets $A$, $B$, $C$:
$$|A \cup B \cup C| = |A|+|B|+|C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|.$$
The logic is the same: adding the individual sizes counts the two-way intersections twice and the triple intersection three times, so we subtract pairwise intersections (removing the triple intersection too many times) and add the triple intersection back.
The General Formula
Definition. For a collection of finite sets $A_1, A_2, \ldots, A_n$, let $[n] = {1,2,\ldots,n}$. The inclusion-exclusion principle states:
$$\left|\bigcup_{i=1}^n A_i\right| = \sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|+1} \left|\bigcap_{i \in S} A_i\right|.$$
Expanding: sum over all singletons (add), all pairs (subtract), all triples (add), and so on.
Proof via indicator functions. For each element $x$ in the universe, let $I_i(x) = 1$ if $x \in A_i$ and $0$ otherwise. Define $k = \sum_{i=1}^n I_i(x)$, the number of sets containing $x$.
The right-hand side counts $x$’s contribution as
$$\sum_{\emptyset \neq S \subseteq [n]} (-1)^{|S|+1} \prod_{i \in S} I_i(x).$$
This equals $\sum_{j=1}^{k} (-1)^{j+1} \binom{k}{j}$ because there are $\binom{k}{j}$ subsets of size $j$ among the $k$ sets containing $x$.
If $k = 0$, the sum is $0 = \mathbf{1}[x \in \bigcup A_i]$. If $k \geq 1$:
$$\sum_{j=1}^{k} (-1)^{j+1}\binom{k}{j} = -\sum_{j=1}^{k}(-1)^{j}\binom{k}{j} = -\left(\sum_{j=0}^{k}(-1)^j\binom{k}{j} - \binom{k}{0}\right) = -(0-1) = 1.$$
So each $x \in \bigcup A_i$ contributes exactly $1$ to the right-hand side, and $x \notin \bigcup A_i$ contributes $0$. Summing over all $x$ gives $|\bigcup_{i=1}^n A_i|$. $\square$
Proof by Induction (Sketch)
Base case $n=2$ was proved above. Assuming the formula holds for $n-1$ sets, write $\bigcup_{i=1}^n A_i = \left(\bigcup_{i=1}^{n-1} A_i\right) \cup A_n$ and apply the two-set formula. Expand $\left|\left(\bigcup_{i=1}^{n-1} A_i\right) \cap A_n\right| = \left|\bigcup_{i=1}^{n-1}(A_i \cap A_n)\right|$ using the inductive hypothesis. The signs work out to yield the general formula.
Derangements
A derangement of ${1, 2, \ldots, n}$ is a permutation $\sigma$ such that $\sigma(i) \neq i$ for every $i$. Let $D_n$ denote the number of derangements.
Let $A_i$ be the set of permutations fixing position $i$. By inclusion-exclusion:
$$|A_1 \cup \cdots \cup A_n| = \sum_{k=1}^{n} (-1)^{k+1}\binom{n}{k}(n-k)!$$
because there are $\binom{n}{k}$ ways to choose $k$ fixed points and $(n-k)!$ ways to permute the rest. The number of permutations with at least one fixed point is this sum, so:
$$D_n = n! - |A_1 \cup \cdots \cup A_n| = \sum_{k=0}^{n}(-1)^k \frac{n!}{k!}.$$
As $n \to \infty$, $D_n / n! \to e^{-1} \approx 0.368$, so roughly $1/e$ of all permutations are derangements.
Small Values
| $n$ | $D_n$ |
|---|---|
| 1 | 0 |
| 2 | 1 |
| 3 | 2 |
| 4 | 9 |
| 5 | 44 |
The recurrence $D_n = (n-1)(D_{n-1} + D_{n-2})$ follows from a direct argument: if 1 goes to position $k \neq 1$, then either $k$ goes to position 1 (the pair swap, $D_{n-2}$ completions) or $k$ does not go to position 1 (equivalent to deranging $n-1$ elements).
Counting Surjective Functions
The number of surjective (onto) functions from an $n$-element set to a $k$-element set is given by inclusion-exclusion. Let $B_j$ be the set of functions that miss element $j$ in the codomain. A function is surjective iff it avoids every $B_j$:
$$\text{Surj}(n,k) = \sum_{j=0}^{k}(-1)^j \binom{k}{j}(k-j)^n.$$
This is because there are $\binom{k}{j}$ ways to choose $j$ missed elements and $(k-j)^n$ functions into the remaining $k-j$ elements.
The Stirling numbers of the second kind satisfy $k!, S(n,k) = \text{Surj}(n,k)$.
Examples
Password counting. How many 8-character passwords over ${a\text{-}z, 0\text{-}9}$ (36 characters) contain at least one digit and at least one letter?
Let $A$ = passwords with no digits, $B$ = passwords with no letters. Then:
$$|A| = 26^8, \quad |B| = 10^8, \quad |A \cap B| = 0.$$
By inclusion-exclusion, passwords missing a digit or a letter: $|A \cup B| = 26^8 + 10^8$. Passwords with both: $36^8 - 26^8 - 10^8 \approx 2.82 \times 10^{12}$.
Derangements in practice. In a class of 30 students, exams are shuffled and returned at random. The expected number of students who get their own exam back is $30 \cdot (1/30) = 1$ by linearity of expectation. The probability that nobody gets their own exam is $D_{30}/30! \approx e^{-1} \approx 36.8%$.
Forbidden patterns. How many integers from 1 to 1000 are divisible by at least one of 2, 3, or 5?
$$|A_2| = 500, ; |A_3| = 333, ; |A_5| = 200, ; |A_2 \cap A_3| = 166, ; |A_2 \cap A_5| = 100, ; |A_3 \cap A_5| = 66, ; |A_2 \cap A_3 \cap A_5| = 33.$$
By inclusion-exclusion: $500 + 333 + 200 - 166 - 100 - 66 + 33 = 734$.
Summary
The inclusion-exclusion principle corrects for overcounting by alternately adding and subtracting intersection sizes. Its power lies in converting a complex union into manageable intersection counts. Key applications include:
- Derangements: $D_n = n! \sum_{k=0}^{n} (-1)^k / k!$
- Surjective functions: $\text{Surj}(n,k) = \sum_{j=0}^{k}(-1)^j\binom{k}{j}(k-j)^n$
- Euler’s totient: $\phi(n) = n \prod_{p | n}(1 - 1/p)$ derived by inclusion-exclusion over prime divisors
- Probability: $P(A_1 \cup \cdots \cup A_n)$ follows the same alternating-sign formula
Read Next: