Helpful context:


In 1971, Stephen Cook proved that Boolean satisfiability (SAT) is NP-complete. One year later, Richard Karp published a landmark paper showing that 21 natural combinatorial problems - scheduling, graph coloring, packing, covering, partitioning - are all NP-complete, by reducing each one to SAT or to each other.

The conclusion was striking: these problems, which had been studied for decades in operations research, combinatorics, and computer science, were all interconnected in a deep way. A polynomial-time algorithm for any one of them would solve all of them. They are not separate hard problems. They are the same hard problem, in disguise.

This interconnection is built entirely from reductions.


What a Reduction Is

A polynomial-time reduction from problem $A$ to problem $B$ (written $A \leq_p B$) is a polynomial-time computable function $f$ such that:

$$x \in A \iff f(x) \in B$$

Every instance of $A$ maps to an instance of $B$, with the answer preserved. If $x$ is a “yes” instance of $A$, then $f(x)$ is a “yes” instance of $B$. If $x$ is a “no” instance, $f(x)$ is a “no” instance.

The immediate consequence: if $A \leq_p B$ and you can solve $B$ in polynomial time, you can solve $A$ in polynomial time too (transform the $A$ instance, solve $B$, report the answer). Equivalently: if $A \leq_p B$ and $A$ is NP-hard, then $B$ is at least as hard as $A$ - so $B$ is also NP-hard.

Reductions are transitive: if $A \leq_p B$ and $B \leq_p C$, then $A \leq_p C$. This is why the NP-complete class is so tight: once Cook showed SAT is NP-complete, Karp could prove 21 problems NP-complete by finding a chain of reductions from SAT.


Cook’s Theorem: SAT Is NP-Complete

Cook’s theorem is the foundation. SAT is clearly in NP (given an assignment, checking it takes linear time). The hard direction: every NP problem reduces to SAT.

Here’s the idea. Any NP problem $L$ has a polynomial-time verifier $V$: $w \in L$ iff there exists a certificate $c$ such that $V(w, c)$ accepts. The verifier’s computation on $(w, c)$ is a finite sequence of TM configurations, each of polynomial size. We can encode this entire computation as a Boolean formula.

Introduce variables for:

  • The contents of each tape cell at each time step
  • The current state at each time step
  • The head position at each time step

Write clauses enforcing:

  • The starting configuration matches $w$
  • Each step follows the TM’s transition function
  • The final state is the accept state

The resulting formula is satisfiable iff some certificate $c$ makes $V$ accept - iff $w \in L$. Every NP problem encodes into a SAT instance, so SAT is NP-hard. Since it’s also in NP, it’s NP-complete.

This construction is called the tableau proof: the computation history of the TM forms a two-dimensional tableau (time vs. tape position), and each row-to-row transition becomes a set of clauses.


The Direction of Reductions

Discomfort check. Reductions go in a direction that seems backward: to prove problem $X$ is hard, we reduce a known hard problem $A$ to $X$, not $X$ to $A$. Why?

Because we’re trying to show $X$ is hard. If the hard problem $A$ reduces to $X$, it means $X$ can do everything $A$ can do - $X$ is at least as hard as $A$. We’re arguing “if you could solve $X$, you could solve $A$, which is hard to solve, so $X$ must be hard too.” The reduction from $A$ to $X$ transforms any $A$ instance into an $X$ instance, so an $X$-solver also solves $A$.

If we instead reduced $X$ to $A$, we’d be showing $X$ is at most as hard as $A$ - i.e., $X$ is in NP, not that it’s NP-hard.


Karp’s Reductions: The Web of NP-Completeness

Let’s trace the chain from 3-SAT to Independent Set to Vertex Cover.

3-SAT $\leq_p$ Independent Set

Independent Set: given a graph $G$ and integer $k$, does $G$ have $k$ vertices with no edge between any two of them?

Given a 3-CNF formula $\phi$ with $m$ clauses, construct a graph $G$:

  • For each clause $(l_1 \vee l_2 \vee l_3)$, add a triangle (three vertices $l_1, l_2, l_3$, all edges between them)
  • For each pair of vertices labeled with complementary literals (one is $x$, the other is $\neg x$), add an edge between them

Set $k = m$.

Claim. $\phi$ is satisfiable iff $G$ has an independent set of size $m$.

Why. If $\phi$ is satisfiable, pick one true literal per clause. The corresponding $m$ vertices form an independent set: no two are in the same triangle (we took one per clause), and no two are contradictory (all true under the same assignment). Conversely, if there’s an independent set of size $m$, it must pick exactly one vertex from each triangle (three mutually connected vertices, so at most one can be in an independent set, and we need $m$ from $m$ triangles). The selected literals are all consistent (no conflict edges in the set), so setting them all true satisfies every clause.

This reduction runs in polynomial time and preserves the yes/no answer.

Independent Set $\leq_p$ Vertex Cover

Vertex Cover: given a graph $G$ and integer $k$, does $G$ have $k$ vertices such that every edge has at least one endpoint among them?

There’s a beautiful complementary relationship: $S$ is an independent set in $G$ iff $V \setminus S$ is a vertex cover.

Why. $S$ is independent means no edge has both endpoints in $S$, which means every edge has at least one endpoint in $V \setminus S$, which means $V \setminus S$ covers every edge.

So $G$ has an independent set of size $k$ iff $G$ has a vertex cover of size $|V| - k$. The reduction is: take the same graph, change the target from $k$ to $|V| - k$. No construction needed at all.

3-SAT $\leq_p$ 3-Colorability

3-Colorability: can the vertices of $G$ be colored with 3 colors so adjacent vertices get different colors?

This reduction is more intricate. Given a 3-CNF formula $\phi$:

Palette triangle. Add three special vertices: TRUE, FALSE, BASE, all connected to each other. In any valid 3-coloring, these three vertices get distinct colors. We think of the three colors as “true,” “false,” and “base.”

Variable gadgets. For each variable $x_i$, add vertices $x_i$ and $\neg x_i$, connected to each other and both connected to BASE. They can’t get the BASE color (connected to BASE), and they must get different colors from each other (connected to each other), so one gets the TRUE color and one gets the FALSE color. This exactly represents a Boolean assignment.

Clause gadgets. For each clause $(l_1 \vee l_2 \vee l_3)$, add a small gadget of 5 vertices and 7 edges that is 3-colorable iff at least one of $l_1, l_2, l_3$ has the TRUE color.

The full construction is 3-colorable iff $\phi$ is satisfiable - and both directions of the equivalence can be verified carefully.


Proving a New Problem NP-Complete

To show a new problem $\Pi$ is NP-complete, you need two things:

  1. Show $\Pi \in$ NP. Describe a certificate and a polynomial-time verifier. This is usually easy - the certificate is typically the solution itself (the coloring, the tour, the assignment), and verification is just checking it.

  2. Show $\Pi$ is NP-hard. Find a known NP-complete problem $\Lambda$ and give a polynomial-time reduction from $\Lambda$ to $\Pi$. This says: if you can solve $\Pi$, you can solve $\Lambda$, so $\Pi$ is at least as hard as $\Lambda$.

Choosing which known NP-complete problem to reduce from is an art. Some guidelines: 3-SAT is the most flexible source for combinatorial problems. Independent Set or Clique work well for graph problems involving selection. 3-Colorability is good for assignment problems with conflicts. Subset Sum works for numerical problems.


NP-Hardness Without Being in NP

NP-hard means “at least as hard as everything in NP,” but it does not require the problem to be in NP itself.

The Halting Problem is NP-hard. Every problem in NP reduces to it (if you could decide the Halting Problem, you could simulate any NP machine and decide whether it ever reaches an accept state). But the Halting Problem is not in NP - there’s no polynomial-size certificate whose validity you can check in polynomial time.

NP-hard problems that are not NP-complete sit outside NP entirely, often among the undecidable problems.


Summary

From To Key idea
SAT 3-SAT Expand long clauses with fresh variables
3-SAT Independent Set Clause triangles + conflict edges
Independent Set Vertex Cover Complement: $V \setminus S$ covers iff $S$ independent
Independent Set Clique Complement graph: clique in $\bar{G}$ iff independent in $G$
3-SAT 3-Colorability Palette triangle + variable gadgets + clause gadgets
Any NP problem SAT Cook’s tableau construction

All NP-complete problems are reducible to each other. A polynomial-time algorithm for any one of them would give polynomial-time algorithms for all of them, collapsing P and NP.


Read next: