Prerequisite:


Graph colouring is one of the oldest and most visually appealing problems in combinatorics, yet it is NP-complete in its general form. The gap between easy special cases (2-colouring in linear time) and hard general cases (3-colouring is NP-complete) is a microcosm of the P vs NP question itself, and the algorithmic story touches approximation theory, structural graph theory, and a remarkable computer-assisted proof.

Formal Definitions

Definition. A proper $k$-colouring of a graph $G = (V, E)$ is a function $c : V \to {1, \ldots, k}$ such that for every edge ${u, v} \in E$, $c(u) \neq c(v)$.

Definition. The chromatic number $\chi(G)$ is the minimum $k$ for which $G$ has a proper $k$-colouring.

Definition. The maximum degree $\Delta(G) = \max_{v \in V} \deg(v)$.

Key structural observations:

  • $\chi(G) \geq \omega(G)$, where $\omega(G)$ is the clique number (size of largest complete subgraph). A $k$-clique requires $k$ distinct colours.
  • $\chi(G) \geq \lceil |V| / \alpha(G) \rceil$, where $\alpha(G)$ is the independence number.
  • The gap between $\omega(G)$ and $\chi(G)$ can be arbitrarily large (Mycielskian construction gives triangle-free graphs with arbitrarily large chromatic number).

2-Colouring: Bipartiteness Testing

Theorem. A graph $G$ is 2-colourable if and only if it is bipartite (contains no odd cycle).

Algorithm. Run BFS or DFS from any vertex. Assign the source colour 1. Assign each newly discovered vertex the opposite colour from its parent. If any edge connects two vertices of the same colour, report “not 2-colourable.”

Complexity. $O(V + E)$ time, $O(V)$ space. This is optimal since every edge must be inspected.

Correctness. In a BFS tree, all tree edges connect vertices at adjacent levels. Non-tree edges (back edges) connect vertices in the same or adjacent levels. An odd cycle exists if and only if a back edge connects vertices at the same level - exactly the same-colour conflict detected by the algorithm.

3-Colouring is NP-Complete

Theorem. 3-COL (decide if $\chi(G) \leq 3$) is NP-complete.

Membership in NP. A colouring $c : V \to {1,2,3}$ is a certificate verifiable in $O(V + E)$ time.

NP-hardness: reduction from 3-SAT. Given a 3-CNF formula $\varphi$ with variables $x_1, \ldots, x_n$ and clauses $C_1, \ldots, C_m$, construct graph $G_\varphi$ as follows.

Variable gadget. For each variable $x_i$, add three nodes: a “true” node $v_i^T$, a “false” node $v_i^F$, and a shared “palette” node $p$ (one global palette node connected to all true/false nodes of all variables). Add edges:

  • $v_i^T - v_i^F$ (the two values must get different colours).
  • $v_i^T - p$ and $v_i^F - p$ (neither value gets the palette colour).

Now the three colours play the roles of TRUE, FALSE, and BASE. The palette node gets the BASE colour; each literal node gets either TRUE or FALSE.

Clause gadget. For clause $C_j = (\ell_1 \vee \ell_2 \vee \ell_3)$, construct a small subgraph (an OR-gadget) connecting the three literal nodes such that the gadget is 3-colourable if and only if at least one of $\ell_1, \ell_2, \ell_3$ is assigned the TRUE colour. The OR-gadget uses 5 additional nodes and 7 additional edges.

Correctness. A satisfying assignment of $\varphi$ gives a proper 3-colouring of $G_\varphi$, and any proper 3-colouring of $G_\varphi$ corresponds to a satisfying assignment. The reduction runs in polynomial time, completing the proof. $\square$

The same reduction shows that 4-COL, 5-COL, … are all NP-complete (since 3-COL reduces to $k$-COL for $k \geq 3$ by adding a $(k-3)$-clique connected to all existing vertices).

Greedy Colouring

Algorithm. Fix an ordering $v_1, v_2, \ldots, v_n$ of the vertices. Process vertices left to right; assign to each $v_i$ the smallest positive integer not used by any already-coloured neighbour.

Theorem. Greedy colouring uses at most $\Delta + 1$ colours.

Proof. When we colour $v_i$, it has at most $\Delta$ already-coloured neighbours, so at most $\Delta$ colours are forbidden. Colour $\Delta + 1$ is always available. $\square$

The bound $\Delta + 1$ is achieved by the greedy algorithm on a clique or odd cycle with the wrong ordering, but is tight in the sense that every graph is greedily $(\Delta+1)$-colourable for some ordering.

Degeneracy ordering. The degeneracy $d(G)$ is the smallest $k$ such that every subgraph of $G$ has a vertex of degree $\leq k$. A degeneracy ordering processes vertices in reverse order of their removal by the peeling algorithm (repeatedly remove a minimum-degree vertex). In this ordering, each vertex has at most $d(G)$ earlier neighbours.

Corollary. Greedy on a degeneracy ordering uses at most $d(G) + 1$ colours.

Since $d(G) \leq \Delta$ and $d(G) \leq \chi(G) - 1$ for many graph families, this can be a significant improvement. For planar graphs, $d \leq 5$, giving a 6-colouring greedily; for forests, $d \leq 1$, giving a 2-colouring.

Brooks' Theorem

The bound $\Delta + 1$ from greedy is tight for complete graphs ($K_n$ requires $n = \Delta + 1$ colours) and odd cycles ($C_{2k+1}$ requires 3 = $\Delta + 1$ colours since $\Delta = 2$). Brooks' theorem says these are the only exceptions.

Theorem (Brooks, 1941). For any connected graph $G$ that is neither a complete graph nor an odd cycle:

$$\chi(G) \leq \Delta(G)$$

Proof sketch. If $G$ is not regular, a BFS ordering from a non-maximum-degree vertex gives $\Delta$-colouring greedily. If $G$ is regular and $\Delta$-connected, a more careful structural argument (finding a specific ordering using non-adjacent neighbours of the root) achieves the bound. The full proof requires a case analysis on connectivity and regularity. $\square$

Brooks' theorem is best possible: $\chi(G)$ can equal $\Delta(G)$ for many graphs (odd cycles, regular complete bipartite graphs). The theorem gives an efficient $\Delta$-colouring algorithm for all non-exceptional graphs.

The Four Colour Theorem

Theorem (Appel-Haken, 1976; Robertson-Sanders-Seymour-Thomas, 1997). Every planar graph satisfies $\chi(G) \leq 4$.

The original proof by Appel and Haken was the first major theorem whose proof relied on computer assistance: they reduced the problem to checking approximately 1,500 unavoidable configurations, each handled by a computer search for a reducible colouring. Robertson et al. gave a cleaner computer-assisted proof in 1997.

The proof structure uses two key concepts:

  • Unavoidability: a set of configurations is unavoidable if every planar graph contains at least one of them.
  • Reducibility: a configuration is reducible if, whenever it appears in a planar graph, a 4-colouring of the graph extends from a 4-colouring of the smaller graph obtained by removing it.

The Five Colour Theorem has an elegant human proof. By Euler’s formula, every planar graph has a vertex $v$ of degree $\leq 5$. Induct: 5-colour $G - v$. If $v$ has fewer than 5 neighbours, or two neighbours share a colour, extend directly. Otherwise, $v$ has five distinctly-coloured neighbours $v_1, \ldots, v_5$ in cyclic order. Run a Kempe chain argument: swap colours 1 and 3 in the component of $G - v$ reachable from $v_1$. Either this frees a colour for $v$, or it makes $v_3$ colour-1, at which point swapping colours 2 and 4 from $v_2$ frees a colour. $\square$

Approximation Hardness

For general graphs, approximating $\chi(G)$ well is also hard.

Theorem (Zuckerman, 2007; building on Hastad). For any $\varepsilon > 0$, it is NP-hard to colour a $k$-colourable graph with fewer than $n^{1-\varepsilon}$ colours.

The best known polynomial-time approximation for general graphs achieves $O(n / \log n)$ colours (Widgerson 1983 gives $O(n / \log \log n)$, later improved). For 3-colourable graphs, achieving 3 colours is NP-hard, and the best known polynomial algorithm uses $O(n^{0.199…})$ colours (Kawarabayashi-Thorup 2014, subsequent improvements).

This approximation hardness contrasts with the perfect algorithmically solvable 2-colouring case, illustrating how even a small increase in the required number of colours creates a computational phase transition.

Examples

Register allocation. In compiler optimisation, each variable has a live range - the set of program points where its value may be needed. Two variables conflict if their live ranges overlap and they cannot share a register. Build the interference graph: one vertex per variable, edge between conflicting pairs. Register allocation is graph $k$-colouring, where $k$ is the number of registers. NP-completeness means optimal allocation is hard in general; heuristic greedy approaches (with spilling for uncolourable vertices) are used in practice.

Exam scheduling. Each exam is a vertex; two exams conflict if a student is enrolled in both. Colour the graph with time slots - no student has two exams in the same slot. Minimising the number of slots is the chromatic number problem. Greedy with degeneracy ordering gives practical solutions; ILP or SAT-based exact methods handle small instances optimally.

Map colouring. The four colour theorem directly guarantees that any political map (modelled as a planar graph, one vertex per region, edges between neighbours) can be coloured with 4 colours so no two adjacent regions share a colour. The original motivation for the problem, posed by Francis Guthrie in 1852, was precisely this cartographic question.

Sudoku as exact cover. A Sudoku puzzle is a precoloured $9 \times 9$ grid where cells are coloured with digits $1$-$9$ subject to row, column, and box constraints. This maps to an exact cover problem (and thence to SAT). The constraint structure is equivalent to a graph colouring problem on a specific graph with $\chi = 9$ for the complete puzzle.


Read Next: