Decidability & The Halting Problem - Questions No Program Can Answer
Helpful context:
In 1936, Turing proved something devastating: no program can exist that takes any program $P$ and any input $X$ as arguments and correctly answers “will $P(X)$ halt, or run forever?”
This is not a limitation of current hardware. It is not a gap in our current mathematics. It is not a problem that a sufficiently advanced civilization could solve. It is mathematically impossible, for any computer that will ever exist, anywhere in the universe, under any physical laws consistent with our mathematics.
This is the Halting Problem, and understanding why it’s undecidable reshapes how you think about the limits of software.
Decidable vs. Recognizable
We need two definitions.
A language $L$ is decidable if there exists a Turing machine that, on every input, eventually halts and either accepts (if the input is in $L$) or rejects (if it isn’t). A decider always gives an answer. No looping.
A language $L$ is Turing-recognizable (or semi-decidable) if there exists a TM that accepts every string in $L$ - but on strings not in $L$, the machine may reject or loop forever. A recognizer might leave you waiting indefinitely for an answer that never comes.
Every decidable language is recognizable (just ignore the distinction between rejecting and looping). But not every recognizable language is decidable.
Examples you already know: sorting is decidable (always halts), graph reachability is decidable, primality testing is decidable. These are all in the large comfortable class of problems that algorithms can fully solve.
The Halting Problem
Define:
$$HALT = \{\langle M, w \rangle : M \text{ is a TM and } M \text{ halts on input } w\}$$
where $\langle M, w \rangle$ means the pair encoded as a string (we know this is possible from the previous post - TMs can be encoded as strings).
$HALT$ is recognizable: just simulate $M$ on $w$, and if it halts, accept. If $M$ loops, the simulation loops - but that’s fine for a recognizer.
$HALT$ is not decidable. No TM can answer “halts or loops?” for every possible program-input pair.
The Diagonalization Proof
Here is the proof. It uses a self-referential argument, similar to the Liar’s Paradox (“this sentence is false”), but made mathematically rigorous.
Assume for contradiction that a TM $H$ decides $HALT$: given any $\langle M, w \rangle$, it always halts and correctly outputs “yes, $M$ halts on $w$” (accepts) or “no, $M$ loops on $w$” (rejects).
Construct a new TM $D$ that works as follows: given input $\langle M \rangle$ (the encoding of any TM $M$):
- Run $H$ on $\langle M, \langle M \rangle \rangle$ (ask $H$: does $M$ halt when given its own description as input?)
- If $H$ accepts (meaning $M$ halts on its own description): loop forever
- If $H$ rejects (meaning $M$ loops on its own description): halt
$D$ does the opposite of what $H$ predicts about $M$ applied to itself.
Now ask: what does $D$ do on input $\langle D \rangle$?
- If $D$ halts on $\langle D \rangle$: then $H$ must accept $\langle D, \langle D \rangle \rangle$ (since $H$ is a decider for HALT and $D$ halts). But step 2 says that when $H$ accepts, $D$ loops forever. Contradiction.
- If $D$ loops on $\langle D \rangle$: then $H$ must reject $\langle D, \langle D \rangle \rangle$ (since $H$ is a decider for HALT and $D$ loops). But step 3 says that when $H$ rejects, $D$ halts. Contradiction.
Either case leads to a contradiction. Therefore the assumption that $H$ exists is false. No TM can decide $HALT$.
Why Diagonalization Works
This is the same technique Georg Cantor used in 1891 to prove the real numbers are uncountable.
Cantor’s argument: suppose you could list all real numbers between 0 and 1 as $r_1, r_2, r_3, \ldots$ Construct a new real number $d$ whose $n$-th decimal digit differs from the $n$-th digit of $r_n$. Then $d$ differs from every $r_n$ at the $n$-th digit - so $d$ is not on the list. Contradiction.
Turing’s argument: suppose you could build a machine $H$ that correctly predicts every machine’s behavior. Construct $D$ to contradict $H$’s prediction about itself. Contradiction.
The abstract pattern: build a new object by explicitly differing from every object in a supposed complete enumeration, using the diagonal position. The new object can’t be in the enumeration - and yet it should be, if the enumeration were truly complete.
The connection to Cantor is not coincidental. The set of all Turing machines is countable (there are countably many finite programs). The set of all possible problems - all subsets of all possible inputs - is uncountable (same cardinality as the real numbers). Therefore most problems are undecidable: the decidable ones are a countable set in an uncountable universe of problems.
Gödel’s Incompleteness Theorems
Five years before Turing, Kurt Gödel proved two theorems (1931) that struck at the foundations of mathematics using the same diagonalization logic. Understanding the connection makes both results clearer.
Setup. A formal system $F$ is a set of axioms and inference rules - a machine for producing proofs. Arithmetic (the Peano axioms, or ZFC set theory) is the prototype. Gödel’s question: can a sufficiently powerful formal system prove every true statement about arithmetic?
Gödel numbering. Gödel’s first insight: encode every symbol, formula, and proof as a natural number. This means statements about arithmetic can refer to other statements - and even to themselves. This is the formal analogue of a program that takes its own code as input.
First Incompleteness Theorem. Any consistent formal system $F$ that can express basic arithmetic contains a sentence $G_F$ that is true (in the standard model of arithmetic) but not provable in $F$.
The construction: $G_F$ is the statement “this sentence is not provable in $F$.” If $G_F$ were provable, $F$ would prove a false statement (since $G_F$ says it’s not provable) - contradicting consistency. If $G_F$ were disprovable, $F$ would prove a false statement (since $G_F$ is in fact true) - again a contradiction. So $G_F$ is neither provable nor disprovable: $F$ is incomplete.
Second Incompleteness Theorem. No consistent formal system $F$ capable of expressing basic arithmetic can prove its own consistency. The statement “$F$ is consistent” is itself a sentence that $F$ cannot prove (assuming $F$ is consistent).
The connection to the halting problem. The two results are formally equivalent. If you could decide the halting problem, you could decide all of mathematics: to check if a statement $\phi$ is provable, run a theorem-prover that enumerates all proofs; it halts iff $\phi$ is provable. Conversely, Gödel’s incompleteness implies there exist programs whose termination cannot be proved from any fixed set of axioms - there are true “this program halts” statements that are unprovable in any consistent system. Both results use the same diagonalization: build a self-referential object (a machine, or a sentence) that does the opposite of what a supposed-complete procedure predicts about it.
The practical upshot: no proof system is both consistent (never proves false things) and complete (proves all true things) - at least, not one powerful enough to reason about arithmetic. This is not a limitation that better axioms can fix. It is a mathematical law.
Reductions: Spreading Undecidability
Once we know $HALT$ is undecidable, we can prove other problems undecidable without inventing new diagonal arguments. The tool is reduction.
A reduction from problem $A$ to problem $B$ is a computable transformation $f$ such that $x \in A \iff f(x) \in B$. If $A$ reduces to $B$ and $A$ is undecidable, then $B$ is undecidable (because if $B$ were decidable, we’d use the reduction to decide $A$).
Example: The emptiness problem for TMs.
$$E_{TM} = \{\langle M \rangle : L(M) = \emptyset\}$$
To prove $E_{TM}$ is undecidable, we reduce $A_{TM} = \{\langle M, w \rangle : M \text{ accepts } w\}$ to $\overline{E_{TM}}$.
Given $\langle M, w \rangle$, construct a new TM $M'$: on input $x$, ignore $x$, simulate $M$ on $w$, accept if $M$ accepts. Now $L(M') = \Sigma^*$ if $M$ accepts $w$, and $L(M') = \emptyset$ if $M$ doesn’t accept $w$. So $M$ accepts $w$ iff $L(M') \neq \emptyset$ iff $\langle M' \rangle \notin E_{TM}$.
If we could decide $E_{TM}$, we could decide $A_{TM}$. Since $A_{TM}$ is undecidable, $E_{TM}$ must be undecidable too.
Rice’s Theorem
Reductions like the above work for almost every interesting question you might ask about programs. This is formalized as Rice’s Theorem.
Rice’s Theorem. Let $P$ be any non-trivial property of Turing-recognizable languages. Then the language $\{\langle M \rangle : L(M) \text{ has property } P\}$ is undecidable.
“Non-trivial” means: some TMs have languages with property $P$, and some don’t. “Property of the language” (not the machine) means: if $L(M_1) = L(M_2)$, then $M_1$ has property $P$ iff $M_2$ does.
Examples of properties Rice’s theorem makes undecidable:
- Does this program accept any input at all?
- Does this program accept all inputs?
- Does this program accept the empty string?
- Does this program terminate on all inputs?
- Does this program compute the same function as some other program?
- Does this program ever print “hello”?
Every one of these is undecidable. Rice’s theorem is a brutal ceiling on static program analysis: any non-trivial semantic question about what a program computes is undecidable.
Discomfort check. If the halting problem is undecidable, how do static analysis tools like linters, type checkers, and program verifiers work? They don’t solve the halting problem - they solve approximations. They either accept false positives (flag some correct programs as potentially problematic), or false negatives (miss some real issues), or restrict to specific program classes where analysis is decidable (like simple loops, bounded integer arithmetic, or typed programs where termination is guaranteed by the type system). Perfect program analysis is impossible. Useful-enough program analysis is entirely possible - and that’s what real tools do.
Summary
| Class | Definition | Closed under complement? | Example |
|---|---|---|---|
| Decidable | TM always halts, accepts iff in $L$ | Yes | Graph reachability, primality |
| Recognizable | TM accepts iff in $L$; may loop on non-members | No | $A_{TM}$, $HALT$ |
| Co-recognizable | Complement is recognizable | No | $\overline{A_{TM}}$, $E_{TM}$ |
| Undecidable, not recognizable | Neither recognizable nor co-recognizable | N/A | $\overline{A_{TM}}$ (not recognizable) |
A language is decidable iff it is both recognizable and co-recognizable. The halting problem is recognizable but not decidable. Its complement is neither.
Read next: