Turing Machines // Megha Bose

Prerequisite: Automata Theory & FSMs | Context-Free Grammars & Parsing

The Formal Model

A Turing machine (TM) is the canonical model of computation. Unlike finite automata or pushdown automata, a TM has an infinite read/write tape and can move in both directions, making it powerful enough to simulate any algorithm.

Formally, a TM is a 7-tuple $M = (Q, \Sigma, \Gamma, \delta, q_0, q_\text{accept}, q_\text{reject})$ where:

$Q$ is a finite set of states
$\Sigma$ is the input alphabet (not containing the blank symbol $\sqcup$)
$\Gamma \supseteq \Sigma \cup {\sqcup}$ is the tape alphabet
$\delta: Q \setminus {q_\text{accept}, q_\text{reject}} \times \Gamma \to Q \times \Gamma \times {L, R}$ is the transition function
$q_0 \in Q$ is the start state
$q_\text{accept} \in Q$ is the accept state
$q_\text{reject} \in Q$ is the reject state, $q_\text{reject} \neq q_\text{accept}$

A transition $\delta(q, a) = (r, b, D)$ means: in state $q$ reading symbol $a$, write $b$, move the head in direction $D \in {L, R}$, and enter state $r$.

Configurations and Computation

A configuration of a TM captures its complete instantaneous description: the current state, the entire tape content, and the head position. We write a configuration as $u, q, v$ where $q$ is the current state, $u$ is the tape content to the left of the head, and $v$ is the tape content from the head rightward (with trailing blanks omitted).

Configuration $C_1$ yields $C_2$ if the TM can legally move from $C_1$ to $C_2$ in one step. The computation of $M$ on input $w$ is the sequence of configurations $C_0, C_1, C_2, \ldots$ where $C_0 = q_0 w$ is the start configuration.

A TM accepts $w$ if this sequence reaches $q_\text{accept}$, rejects if it reaches $q_\text{reject}$, and loops if it runs forever. The language recognized by $M$ is $L(M) = {w : M \text{ accepts } w}$.

Example: Deciding ${0^n 1^n : n \geq 0}$

A TM deciding this language uses the following strategy: repeatedly scan left to right, crossing off one $0$ and one $1$ per pass.

If the tape is blank (or all crossed off), accept.
Scan right for the first $0$. If none found (only $1$s remain), reject.
Cross off (overwrite with $X$) this $0$. Continue right, skipping $0$s and $X$s.
Find the first uncrossed $1$. If none found, reject. Cross it off.
Return head to leftmost position. Go to step 1.

Each pass crosses off one $0$-$1$ pair. After $n$ passes, all symbols are crossed, and the TM accepts. If the counts were unequal, the TM rejects. This runs in $O(n^2)$ steps (one $O(n)$ pass per pair).

Variants of Turing Machines

Multi-tape TMs. A $k$-tape TM has $k$ tapes, each with its own head. The transition function takes a $k$-tuple of symbols and produces a $k$-tuple of symbols and directions. Any $k$-tape TM can be simulated by a single-tape TM with at most an $O(t^2)$ overhead (store all $k$ tapes on one tape, separated by delimiters; scan the entire tape to simulate one step). With a more careful simulation, the overhead is $O(t \log t)$.

Nondeterministic TMs. A nondeterministic TM (NTM) has $\delta: Q \times \Gamma \to \mathcal{P}(Q \times \Gamma \times {L,R})$. The NTM accepts if any computation branch accepts. NTMs do not recognize more languages than deterministic TMs, but the time difference may be exponential. Simulating an NTM deterministically: use breadth-first search over the computation tree.

Random access machines. An idealized model of real computers with RAM can simulate TMs in polynomial overhead, establishing that TMs are not artificially weak compared to real hardware.

The Universal Turing Machine

A Universal Turing Machine (UTM) is a TM $U$ that takes as input a pair $\langle M, w \rangle$ - an encoding of TM $M$ and string $w$ - and simulates $M$ on $w$.

To encode a TM, number its states and alphabet symbols; represent the transition table as a string. This encoding is $O(|Q||\Gamma|)$ in length. The UTM $U$ then:

Stores $\langle M \rangle$ on tape 1, the simulated tape on tape 2, and $M$’s current state on tape 3.
At each step, scans tape 1 for the matching transition, updates tape 2 and tape 3.
Accepts or rejects when $M$ would.

The overhead of the UTM over $M$’s own runtime is $O(t \log t)$ using an efficient encoding. The existence of a UTM is the foundational insight behind the stored-program computer: hardware and software are interchangeable. Every modern computer is a physical realization of a UTM.

Decidable vs. Recognizable Languages

$L$ is Turing-recognizable (recursively enumerable) if some TM accepts every string in $L$ (and either rejects or loops on strings not in $L$).
$L$ is decidable (recursive) if some TM halts on all inputs and accepts exactly $L$.

Every decidable language is recognizable, but not vice versa. The complement of every recognizable language is co-recognizable; a language is decidable if and only if it is both recognizable and co-recognizable.

Closure properties. Decidable languages are closed under union, intersection, concatenation, complement, and Kleene star. Recognizable languages are closed under union and intersection but not complement.

The Church-Turing Thesis

The Church-Turing thesis asserts that the informal notion of “effectively computable function” coincides exactly with the class of functions computable by Turing machines. This is not a mathematical theorem (it cannot be proved, since “effectively computable” is informal), but it is supported by the following evidence:

Every other formal model of computation - lambda calculus, recursive functions, register machines, cellular automata, quantum computers - computes exactly the same class of functions as TMs (up to efficiency for quantum).
No one has ever exhibited an intuitively computable function that is not TM-computable.
Every programming language is Turing complete: given enough memory, any program can be simulated by a TM and vice versa.

A consequence of the thesis: to show a problem is undecidable, it suffices to show it cannot be solved by any TM - no need to analyze every conceivable computing device.

Encoding TMs and Diagonalization

TMs can be encoded as binary strings. Since the set of binary strings is countable and TMs can only compute over countable inputs, we can list all TMs: $M_1, M_2, M_3, \ldots$ and all binary strings $s_1, s_2, s_3, \ldots$

The diagonalization language is $L_d = {s_i : M_i \text{ does not accept } s_i}$. We claim $L_d$ is not recognizable. Suppose TM $M_j$ recognizes $L_d$. Does $M_j$ accept $s_j$?

If yes, then $s_j \in L(M_j) = L_d$, so by definition of $L_d$, $M_j$ does not accept $s_j$. Contradiction.
If no, then $s_j \notin L(M_j)$, but $s_j \in L_d$ by definition, so $M_j$ should accept $s_j$. Contradiction.

Therefore no such $M_j$ exists, and $L_d$ is not recognizable. This construction is the foundation of undecidability proofs.

Examples

Example 1: Multi-tape efficiency. Sorting on a two-tape TM takes $O(n \log n)$ steps by mergesort, whereas sorting on a single-tape TM requires $\Omega(n^2)$ steps in the worst case.

Example 2: Every programming language is Turing complete. Python, C, Java, Haskell - each can simulate any TM (given sufficient memory), and each can be simulated by a TM. This is why programming language designers do not worry about computability: all reasonable languages compute the same functions.

Example 3: Enumerators. A TM with a “print” capability (an enumerator) generates a language by printing strings one by one. A language is recognizable if and only if some enumerator generates it - providing another characterization of the class.

Read Next: Decidability & The Halting Problem