Complexity Classes
Beyond the headline act of P vs NP lies a rich ecosystem of complexity classes, each capturing a different notion of what makes a problem tractable. Understanding this landscape - the polynomial hierarchy, counting classes, randomised classes, and interactive proofs - reveals structural relationships that are surprising, deep, and practically important.
The Base Classes: P, NP, co-NP
Definition.
- $P$: languages decidable by a deterministic TM in polynomial time.
- $NP$: languages decidable by a nondeterministic TM in polynomial time, equivalently, languages with polynomial-time verifiable certificates for YES instances.
- $\text{co-}NP$: complements of NP languages - those with polynomial-time verifiable certificates for NO instances.
The central open question is whether $P = NP$. A secondary open question is whether $NP = \text{co-}NP$. If $NP \neq \text{co-}NP$, then certainly $P \neq NP$, since $P$ is closed under complement.
The class $NP \cap \text{co-}NP$. A language in this intersection has short certificates for both YES and NO answers. Integer factorisation lies here: given $N$ and $k$, does $N$ have a factor $\leq k$? A factor itself certifies YES; a complete prime factorisation (verifiable via AKS in polynomial time) certifies NO. Discrete logarithm is similarly believed to be in $NP \cap \text{co-}NP$ but not in $P$.
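Both certificates can be checked in polynomial time, which is all that membership in $NP \cap \text{co-}NP$ requires. A minimal sketch in Python, using trial division as a stand-in for the polynomial-time AKS primality test:

```python
from math import isqrt

def is_prime(n: int) -> bool:
    """Trial-division primality check (stand-in for AKS, which runs in poly time)."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))

def verify_yes(N: int, k: int, factor: int) -> bool:
    """YES certificate for 'does N have a factor <= k?': a nontrivial factor <= k."""
    return 1 < factor <= k and N % factor == 0

def verify_no(N: int, k: int, primes: list) -> bool:
    """NO certificate: the full prime factorisation of N, every prime factor > k."""
    prod = 1
    for p in primes:
        if not is_prime(p) or p <= k:
            return False
        prod *= p
    return prod == N

# 91 = 7 * 13: "does 91 have a factor <= 10?" is YES, certified by 7;
# "does 91 have a factor <= 5?" is NO, certified by the factorisation [7, 13].
print(verify_yes(91, 10, 7))      # True
print(verify_no(91, 5, [7, 13]))  # True
```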
The Polynomial Hierarchy
The polynomial hierarchy (PH) extends P and NP by stacking oracle access to NP.
Definition. Define $\Sigma_0^P = \Pi_0^P = P$, and for $k \geq 1$:
$$\Sigma_k^P = NP^{\Sigma_{k-1}^P}, \qquad \Pi_k^P = \text{co-}\Sigma_k^P$$
That is, $\Sigma_k^P$ consists of languages decidable nondeterministically in polynomial time with access to a $\Sigma_{k-1}^P$ oracle.
The first few levels:
- $\Sigma_1^P = NP$, $\Pi_1^P = \text{co-}NP$.
- $\Sigma_2^P = NP^{NP}$: problems like “does there exist a formula that is not equivalent to any formula in a given set?” A machine guesses a witness, then queries an NP oracle to verify a property of that witness.
- $\Pi_2^P = \text{co-}NP^{NP}$: e.g. “is every satisfying assignment of $\varphi$ also a satisfying assignment of $\psi$?”
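The $\exists\forall$ quantifier structure of $\Sigma_2^P$ can be made concrete by brute force over a tiny domain (exponential time, for illustration only; the predicate below is a hypothetical stand-in for the polynomial-time relation):

```python
from itertools import product

def sigma2(phi, n: int) -> bool:
    """Evaluate the Sigma_2 statement: exists x in {0,1}^n, forall y in {0,1}^n, phi(x, y).
    Brute force over both quantifiers -- exponential time, illustration only."""
    return any(
        all(phi(x, y) for y in product([0, 1], repeat=n))
        for x in product([0, 1], repeat=n)
    )

# Hypothetical predicate: "x dominates y bitwise" -- witnessed by x = (1, ..., 1).
phi = lambda x, y: all(xi >= yi for xi, yi in zip(x, y))
print(sigma2(phi, 3))  # True: x = (1,1,1) works against every y
```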
The polynomial hierarchy:
$$PH = \bigcup_{k \geq 0} \Sigma_k^P$$
Collapse theorem. If $\Sigma_k^P = \Pi_k^P$ for any $k \geq 1$, then $PH = \Sigma_k^P$ - the hierarchy collapses to the $k$-th level. In particular, if $P = NP$, then $\Sigma_1^P = \Pi_1^P$ and the entire hierarchy collapses down to $PH = P$. The widely held belief that the hierarchy is strict (does not collapse) therefore implies $P \neq NP$.
Counting Complexity: #P
NP asks whether at least one solution exists. What if we want to count all solutions?
Definition. A function $f : \{0,1\}^\ast \to \mathbb{N}$ is in $\#P$ if there exists a polynomial-time NTM $M$ such that $f(x)$ equals the number of accepting computation paths of $M$ on $x$.
Examples.
- $\#\text{SAT}$: count the number of satisfying assignments of a CNF formula.
- $\#\text{MATCH}$: count the number of perfect matchings in a bipartite graph (this is the permanent of the 0-1 biadjacency matrix).
- $\#3\text{-COL}$: count proper 3-colourings of a graph.
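The permanent correspondence can be checked directly on small instances: a brute-force permanent over all permutations counts the perfect matchings of a bipartite graph (exponential time, illustration only):

```python
from itertools import permutations
from math import prod

def permanent(A) -> int:
    """Permanent of a square 0-1 matrix by brute force over all permutations.
    For a bipartite biadjacency matrix, this counts perfect matchings."""
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))

# Complete bipartite graph K_{3,3}: every pairing is a matching, so 3! = 6.
K33 = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(permanent(K33))  # 6
```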
Valiant’s theorem (1979). $\#\text{SAT}$ is $\#P$-complete. Moreover, even $\#\text{MATCH}$ is $\#P$-complete under polynomial-time Turing reductions, despite the matching decision problem being in $P$.
This illustrates a striking phenomenon: a decision problem can be easy while its counting version is $#P$-hard. Knowing a matching exists (easy) is very different from counting how many exist.
Toda’s Theorem. The polynomial hierarchy is contained in polynomial time with a $\#P$ oracle:
$$PH \subseteq P^{\#P}$$
This means a single counting oracle, combined with polynomial-time computation, is at least as powerful as the entire alternating-quantifier hierarchy.
Randomised Complexity Classes
What if the machine can flip fair coins?
Definition.
- $BPP$ (Bounded-error Probabilistic Polynomial time): languages decidable by a probabilistic TM in polynomial time with error probability $\leq 1/3$ on all inputs (both YES and NO instances).
- $RP$ (Randomised Polynomial time): error $\leq 1/2$ on YES instances; always correct on NO instances (one-sided error).
- $\text{co-}RP$: always correct on YES instances; error $\leq 1/2$ on NO instances.
- $ZPP = RP \cap \text{co-}RP$: zero-error expected polynomial time (Las Vegas algorithms).
Amplification. The error threshold $1/3$ in BPP is not special: by running $k$ independent trials and taking majority vote, error drops to $2^{-\Omega(k)}$. This means BPP is robust to the exact constant.
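The amplification bound can be computed exactly: the majority of $k$ trials errs only when more than half the individual trials err, a binomial tail. A short illustration:

```python
from math import comb

def majority_error(p: float, k: int) -> float:
    """Probability that the majority vote of k independent trials is wrong,
    when each trial errs independently with probability p (take k odd)."""
    return sum(comb(k, i) * p**i * (1 - p)**(k - i) for i in range(k // 2 + 1, k + 1))

# Error decays exponentially in the number of trials.
for k in (1, 11, 101):
    print(k, majority_error(1/3, k))
```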
Containment. $P \subseteq ZPP \subseteq RP \subseteq BPP$ and $RP \subseteq NP$, $\text{co-}RP \subseteq \text{co-}NP$. Moreover, $BPP \subseteq \Sigma_2^P \cap \Pi_2^P$ (Sipser-Gács-Lautemann theorem), so BPP sits in the second level of the polynomial hierarchy.
The conjecture $BPP = P$. Randomness is widely believed not to add power for decision problems. Under plausible strong circuit lower bounds, pseudorandom generators in the Nisan-Wigderson style derandomise every BPP algorithm, yielding $BPP = P$. Practically, several prominent problems once solved in BPP - primality testing most famously - have since been derandomised.
Interactive Proofs and IP = PSPACE
Classical proof systems have a prover sending a single proof and a verifier checking it. What if the verifier can ask questions?
Definition. $IP$ is the class of languages with an interactive proof system: a (computationally unbounded) prover $P$ and a polynomial-time verifier $V$ exchange polynomially many messages. After the interaction, $V$ accepts or rejects. Requirements:
- Completeness. For $x \in L$, $P$ can make $V$ accept with probability $\geq 2/3$.
- Soundness. For $x \notin L$, no prover strategy makes $V$ accept with probability $> 1/3$.
Theorem (Shamir, 1992). $IP = \text{PSPACE}$.
The upper bound ($IP \subseteq \text{PSPACE}$) holds because the prover strategy can be computed optimally in polynomial space. The surprising direction is $\text{PSPACE} \subseteq IP$: the canonical PSPACE-complete problem TQBF has an interactive proof via the arithmetisation technique, where Boolean formulas are converted to low-degree polynomials and the verifier uses the Schwartz-Zippel lemma to check evaluations.
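The verifier's key subroutine - checking a claimed polynomial identity at a random point - is exactly Schwartz-Zippel polynomial identity testing. A minimal sketch over a prime field (the polynomials here are illustrative):

```python
import random

def polys_equal_probably(f, g, num_vars: int, degree: int, p: int, trials: int = 20) -> bool:
    """Schwartz-Zippel identity test over F_p: if f != g as polynomials of total
    degree <= degree, one random point catches the difference w.p. >= 1 - degree/p."""
    for _ in range(trials):
        point = [random.randrange(p) for _ in range(num_vars)]
        if f(*point) % p != g(*point) % p:
            return False  # definitely different polynomials
    return True  # probably identical

p = 10**9 + 7  # a large prime, so degree/p is tiny
f = lambda x, y: (x + y) ** 2
g = lambda x, y: x * x + 2 * x * y + y * y
h = lambda x, y: x * x + x * y + y * y   # differs from f by the term x*y
print(polys_equal_probably(f, g, 2, 2, p))  # True  (identical polynomials)
print(polys_equal_probably(f, h, 2, 2, p))  # False (caught with overwhelming probability)
```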
The theorem implies that $\text{co-}NP \subseteq IP$, since $\text{co-}NP \subseteq \text{PSPACE}$. A classic concrete example is the interactive proof for graph non-isomorphism (Goldreich-Micali-Wigderson): the verifier secretly permutes one of the two input graphs at random and asks the prover which graph it came from - a question the prover can answer reliably only when the graphs are not isomorphic.
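The graph non-isomorphism protocol can be simulated directly; the prover below is "unbounded" in the sense that it brute-forces all permutations (graphs and round count are illustrative):

```python
import random
from itertools import permutations

def canon(edges):
    """Represent an undirected graph as a frozenset of 2-element frozensets."""
    return frozenset(frozenset(e) for e in edges)

def permuted(edges, perm):
    """Apply a vertex permutation to an edge set."""
    return frozenset(frozenset({perm[u], perm[v]}) for u, v in edges)

def prover_guess(g0, challenge, n):
    """Unbounded prover: brute-force whether the challenge is a permuted copy of g0."""
    return 0 if any(permuted(g0, p) == challenge for p in permutations(range(n))) else 1

def gni_protocol(g0, g1, n, rounds=20):
    """Verifier side: accept 'not isomorphic' iff the prover names the hidden
    source graph in every round. If g0 and g1 are isomorphic, the prover can
    only guess, so it is caught with probability 1 - 2**(-rounds)."""
    g0, g1 = canon(g0), canon(g1)
    for _ in range(rounds):
        i = random.randrange(2)
        perm = list(range(n))
        random.shuffle(perm)
        challenge = permuted(g0 if i == 0 else g1, perm)
        if prover_guess(g0, challenge, n) != i:
            return False  # prover caught guessing
    return True

# Path 0-1-2 vs triangle: not isomorphic, so the prover always answers correctly.
path, tri = [(0, 1), (1, 2)], [(0, 1), (1, 2), (0, 2)]
print(gni_protocol(path, tri, 3))  # True
```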
PCPs and Approximation Hardness
Theorem (PCP Theorem, 1992). $NP = PCP(O(\log n), O(1))$.
Every NP language has a probabilistically checkable proof (PCP) that uses $O(\log n)$ random bits and reads $O(1)$ bits of the proof, with completeness 1 and soundness $\leq 1/2$.
The PCP theorem has transformative implications for inapproximability. For example, it implies that no polynomial-time algorithm approximates MAX-3-SAT to a ratio better than $7/8$ unless $P = NP$ (Håstad, 1997). This connects the structure of NP proofs to the limits of efficient approximation.
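The $7/8$ threshold is exactly what a trivial randomised algorithm achieves: a uniformly random assignment satisfies each 3-literal clause on distinct variables with probability $7/8$, since only one of the eight local settings falsifies it. This can be checked by exhaustive averaging on a toy formula (the clauses are illustrative):

```python
from itertools import product

def frac_satisfied(clauses, assignment):
    """Fraction of 3-CNF clauses satisfied. A clause is a tuple of literals;
    literal +i means variable i, -i means its negation (variables 1..n)."""
    sat = sum(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in clauses
    )
    return sat / len(clauses)

# Average the satisfied fraction over ALL assignments: each clause is falsified
# by exactly 1 of its 8 local settings, so the expectation is 7/8.
clauses = [(1, 2, 3), (-1, 2, -3), (1, -2, 3)]
n = 3
total = sum(
    frac_satisfied(clauses, dict(enumerate(bits, start=1)))
    for bits in product([False, True], repeat=n)
)
print(total / 2**n)  # 0.875
```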
Quantum Complexity: QMA
Definition. $QMA$ (Quantum Merlin-Arthur) is the quantum analogue of NP: languages where YES instances have a polynomial-size quantum state (witness) that a quantum polynomial-time verifier accepts with probability $\geq 2/3$, and NO instances have no such witness accepted with probability $> 1/3$.
QMA captures the hardness of quantum verification problems. The $k$-local Hamiltonian problem (deciding whether the ground state energy of a $k$-local Hamiltonian is below a threshold) is QMA-complete - the quantum analogue of SAT.
Ladner’s Theorem: NP-Intermediate Problems
If $P \neq NP$, the NP landscape is not simply $P$ plus NP-complete problems.
Theorem (Ladner, 1975). If $P \neq NP$, there exist languages in $NP \setminus P$ that are not NP-complete. These are called NP-intermediate problems.
The proof constructs a diagonalisation: build a language that is in NP, not in P, but reduces to SAT only slowly enough to stay below NP-completeness. The language is artificial, but the theorem motivates the search for natural NP-intermediate candidates.
Examples
Graph isomorphism. Given graphs $G$ and $H$, are they isomorphic? GI is in NP (a bijection is a witness) and in co-AM (an Arthur-Merlin protocol for non-isomorphism exists). GI is not known to be in P or NP-complete, making it a leading candidate for a natural NP-intermediate problem. Babai’s 2015 quasipolynomial-time algorithm narrowed the gap dramatically without settling which side of the divide GI falls on.
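Membership of GI in NP is plain certificate checking: given a claimed bijection, verifying that it maps edges to edges takes polynomial time, even though *finding* it may not. A minimal sketch (the graphs and mapping are illustrative):

```python
def verify_iso(g, h, perm) -> bool:
    """Verify an NP witness for graph isomorphism: perm maps vertices of g to
    vertices of h. Runs in polynomial time in the size of the graphs."""
    eg = {frozenset(e) for e in g}
    eh = {frozenset(e) for e in h}
    return {frozenset({perm[u], perm[v]}) for u, v in eg} == eh

# Two 4-cycles with different labellings: 0-1-2-3-0 vs 0-2-1-3-0.
g = [(0, 1), (1, 2), (2, 3), (3, 0)]
h = [(0, 2), (2, 1), (1, 3), (3, 0)]
print(verify_iso(g, h, {0: 0, 1: 2, 2: 1, 3: 3}))  # True
```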
Factoring and counting. While the factoring decision problem sits in $NP \cap \text{co-}NP$, counting versions of number-theoretic problems - such as counting the divisors of $N$ - are easy once the factorisation is known but appear hard without it. The broader lesson mirrors the matching example above: deciding, searching, and counting can have sharply different complexities even when they concern the same underlying structure.