Decidability & The Halting Problem
Prerequisite: Turing Machines
Decidability
A language $L$ is decidable if there exists a Turing machine $M$ that halts on every input and satisfies: $M$ accepts $w$ if and only if $w \in L$. Such a TM is called a decider. Decidable languages form the class sometimes written $\mathbf{R}$ (for recursive).
A language is Turing-recognizable if some TM accepts exactly the strings in it (but may loop on strings outside it). Every decidable language is recognizable; the converse fails.
The central question of this post: which languages are decidable, and which are not? The answer reshapes what we expect from software tools.
The Acceptance Problem
Define:
$$A_{TM} = {\langle M, w \rangle : M \text{ is a TM and } M \text{ accepts } w}$$
Theorem. $A_{TM}$ is Turing-recognizable.
Proof. The Universal Turing Machine $U$ recognizes $A_{TM}$: on input $\langle M, w \rangle$, simulate $M$ on $w$. If $M$ accepts, $U$ accepts. If $M$ rejects, $U$ rejects. If $M$ loops, $U$ loops. So $U$ accepts exactly the pairs where $M$ accepts $w$.
Theorem. $A_{TM}$ is not decidable.
Proof by diagonalization. Suppose for contradiction that TM $H$ decides $A_{TM}$:
$$H(\langle M, w \rangle) = \begin{cases} \text{accept} & \text{if } M \text{ accepts } w \ \text{reject} & \text{if } M \text{ does not accept } w \end{cases}$$
Construct TM $D$: on input $\langle M \rangle$, run $H$ on $\langle M, \langle M \rangle \rangle$ and do the opposite - accept if $H$ rejects, reject if $H$ accepts. Then ask: what does $D$ do on input $\langle D \rangle$?
- If $D$ accepts $\langle D \rangle$, then $H$ accepted $\langle D, \langle D \rangle \rangle$, meaning $D$ accepts $\langle D \rangle$… but then $D$ must reject $\langle D \rangle$ (it does the opposite of $H$). Contradiction.
- If $D$ rejects $\langle D \rangle$, then $H$ rejected $\langle D, \langle D \rangle \rangle$, meaning $D$ does not accept $\langle D \rangle$… but then $D$ must accept $\langle D \rangle$. Contradiction.
Both cases yield contradiction, so $H$ cannot exist. $A_{TM}$ is undecidable.
The Halting Problem
$$HALT_{TM} = {\langle M, w \rangle : M \text{ halts on input } w}$$
Theorem. $HALT_{TM}$ is undecidable.
Proof by reduction from $A_{TM}$. Assume $R$ decides $HALT_{TM}$. Construct a decider $S$ for $A_{TM}$ as follows: on input $\langle M, w \rangle$, first run $R$ on $\langle M, w \rangle$. If $R$ rejects (i.e., $M$ loops on $w$), reject. If $R$ accepts (i.e., $M$ halts on $w$), simulate $M$ on $w$ and accept iff $M$ accepts.
If $R$ decides $HALT_{TM}$, then $S$ decides $A_{TM}$. But $A_{TM}$ is undecidable, so $R$ cannot exist. $HALT_{TM}$ is undecidable.
Many-One Reductions
A polynomial-time many-one reduction (for computability, we use unrestricted mapping reductions) from $A$ to $B$, written $A \leq_m B$, is a computable function $f: \Sigma^\ast \to \Sigma^\ast$ such that:
$$x \in A \iff f(x) \in B$$
Key facts:
- If $A \leq_m B$ and $B$ is decidable, then $A$ is decidable.
- Contrapositive: if $A \leq_m B$ and $A$ is undecidable, then $B$ is undecidable.
- If $A \leq_m B$ and $B$ is recognizable, then $A$ is recognizable.
To show $L$ is undecidable, exhibit a reduction from $A_{TM}$ (or another known undecidable language) to $L$.
More Undecidable Problems
Emptiness problem for TMs. $$E_{TM} = {\langle M \rangle : L(M) = \emptyset}$$
Theorem. $E_{TM}$ is undecidable.
Proof. Reduce $A_{TM}$ to $\overline{E_{TM}}$. Given $\langle M, w \rangle$, construct TM $M'$ that on input $x$: ignores $x$, simulates $M$ on $w$, and accepts if $M$ accepts. Then $L(M') = \Sigma^\ast$ if $M$ accepts $w$, and $L(M') = \emptyset$ otherwise. So $M$ accepts $w$ iff $M' \notin E_{TM}$, i.e., $\langle M' \rangle \notin E_{TM}$. A decider for $E_{TM}$ would decide $A_{TM}$.
Equivalence problem for TMs. $$EQ_{TM} = {\langle M_1, M_2 \rangle : L(M_1) = L(M_2)}$$
$EQ_{TM}$ is neither decidable nor even recognizable (it is not in the arithmetical hierarchy below $\Pi_2^0$).
Rice’s Theorem
The above results are special cases of a sweeping general theorem.
Rice’s Theorem. Let $P$ be any property of Turing-recognizable languages that is non-trivial - meaning some TMs have languages with property $P$ and some do not. Then the language ${\langle M \rangle : L(M) \text{ has property } P}$ is undecidable.
Proof sketch. Suppose $P$ is non-trivial. Let $L_\emptyset$ be the empty language and $L_\text{yes}$ be some language with property $P$ (or vice versa). Given $\langle M, w \rangle$, construct $M'$ that on input $x$: simulates $M$ on $w$ for $|x|$ steps; if $M$ has accepted, simulates a fixed TM $M_\text{yes}$ recognizing $L_\text{yes}$ on $x$. If $M$ accepts $w$, then $L(M') = L_\text{yes}$ (property $P$ holds); if $M$ does not accept $w$, then $L(M') = \emptyset$ (property $P$ fails, assuming $\emptyset$ lacks $P$). Deciding whether $L(M')$ has property $P$ would decide $A_{TM}$.
Rice’s theorem is a brutal limit on static program analysis: any non-trivial question about what a program computes (not how it is written) is undecidable.
Post Correspondence Problem
The Post Correspondence Problem (PCP) asks: given a finite set of dominos ${[t_i / b_i]}$, can we select a sequence of dominos (with repetition) such that the concatenation of the top strings equals the concatenation of the bottom strings?
Theorem. PCP is undecidable.
The proof reduces $A_{TM}$ to PCP by encoding TM computation histories as domino sequences. PCP is important as an intermediate stepping stone: many problems in formal language theory (e.g., ambiguity of CFGs) are shown undecidable by reduction from PCP.
Connection to Gödel’s Incompleteness Theorems
Gödel’s first incompleteness theorem (1931) states: any consistent, sufficiently expressive formal system cannot prove all true statements about arithmetic. This is deeply connected to undecidability.
The connection: the set of provable statements in any such system is Turing-recognizable (enumerate all proofs). If it were decidable, we could decide $A_{TM}$ (Gödel encoded computation into arithmetic). Since $A_{TM}$ is undecidable, the system must be either inconsistent or incomplete. The unprovable statement is essentially a formalized version of the diagonalization: “this statement is not provable.”
Examples
Example 1: Why perfect virus detection is impossible. A virus detector that correctly classifies every program as malicious or benign would solve a non-trivial property of TM languages (does this program’s behavior match a harmful pattern?). By Rice’s theorem, no such detector exists. Real antivirus software uses heuristics and signatures - not decision procedures.
Example 2: Why compilers cannot detect all infinite loops. The halting problem shows no program can correctly determine for all inputs whether an arbitrary program halts. Compilers can warn about obvious loops (constant conditions) but cannot solve the general problem.
Example 3: Undecidability via reduction chain. $A_{TM} \leq_m HALT_{TM} \leq_m E_{TM}^c \leq_m EQ_{TM}^c$. Each link is a mapping reduction: undecidability propagates along the chain.
Read Next: P, NP, and Hardest Problems