Complexity Classes - Sorting Problems by How Hard They Are // Megha Bose

Helpful context:

Computer scientists have catalogued hundreds of complexity classes. P, NP, coNP, BPP, ZPP, RP, #P, PP, PSPACE, EXPTIME, PH, IP, PPAD - the list keeps growing. This isn’t taxonomy for its own sake. Each class captures a different kind of resource or computational capability. Understanding where your problem sits in this landscape tells you what tools are appropriate and, crucially, what limitations are fundamental.

The classes we’ll cover here go beyond P and NP. They involve randomness, alternating quantifiers, counting, and interaction - each axis revealing a different dimension of hardness.

A Map of the Territory

Complexity classes are defined by two choices: a resource bound (time, space, randomness, number of solutions) and a machine type (deterministic, nondeterministic, probabilistic, alternating). The most important containments we know:

$$L \subseteq NL \subseteq P \subseteq NP \subseteq \text{PSPACE} \subseteq \text{EXPTIME}$$

Every one of these is believed strict. We can prove $L \neq \text{PSPACE}$ and $P \neq \text{EXPTIME}$. Everything in the middle is open.

The Polynomial Hierarchy

NP lets you guess one witness and verify it. What if the verification itself requires guessing? What if you need to check: does there exist an $x$ such that for all $y$, there exists a $z$ satisfying some condition?

That nested quantifier structure is captured by the polynomial hierarchy (PH). Define $\Sigma_0^P = P$ and for $k \geq 1$:

$$\Sigma_k^P = NP^{\Sigma_{k-1}^P}, \qquad \Pi_k^P = \text{co-}\Sigma_k^P$$

Read $NP^{\Sigma_{k-1}^P}$ as: nondeterministic polynomial time with free access to a $\Sigma_{k-1}^P$ oracle. The first few levels:

$\Sigma_1^P = NP$, $\Pi_1^P = \text{co-}NP$
$\Sigma_2^P = NP^{NP}$: guess a witness, then call an NP oracle to verify a property of it
$\Pi_2^P = \text{co-}NP^{NP}$: universal quantifier at the top, NP oracle inside

The full hierarchy:

$$PH = \bigcup_{k \geq 0} \Sigma_k^P$$

PH lives inside PSPACE. The key structural fact: if the hierarchy collapses at any level - meaning $\Sigma_k^P = \Sigma_{k+1}^P$ for some $k$ - then $PH = \Sigma_k^P$. In particular, $P = NP$ would collapse PH all the way to $P$.

Most complexity theorists believe PH doesn’t collapse. If PH is infinite, that alone implies $P \neq NP$. The contrapositive is a useful intuition pump: if you ever prove $P = NP$, you’ve also collapsed the entire polynomial hierarchy.

Randomized Classes

What if the machine can flip fair coins?

BPP (Bounded-error Probabilistic Polynomial time): a language is in BPP if there is a randomized polynomial-time algorithm that is correct with probability at least $2/3$ on every input. The $2/3$ threshold isn’t special - run $k$ independent trials and take majority vote; the error drops to $2^{-\Omega(k)}$. BPP is robust.

RP (Randomized Polynomial time): one-sided error. Always correct on NO instances; correct on YES instances with probability at least $1/2$. Useful when false positives are catastrophic.

co-RP: the mirror. Always correct on YES; error $\leq 1/2$ on NO.

ZPP $= RP \cap \text{co-}RP$: zero error, expected polynomial time. Las Vegas algorithms.

The containment chain: $P \subseteq ZPP \subseteq RP \subseteq BPP$, and $RP \subseteq NP$. Also $BPP \subseteq \Sigma_2^P \cap \Pi_2^P$ - BPP sits in the second level of the polynomial hierarchy.

The derandomization conjecture: $BPP = P$. Randomness is widely believed not to add power for decision problems. Under plausible circuit lower bounds (the Nisan-Wigderson generator framework), $BPP = P$ follows. Most known BPP algorithms have eventually been derandomized. Primality testing lived in RP for decades before the AKS algorithm (2002) put it in P.

Counting: #P

NP asks whether a solution exists. What if you want to count all solutions?

#P is a class of functions, not languages. A function $f$ is in #P if there is a polynomial-time nondeterministic machine such that $f(x)$ equals the number of its accepting computation paths on input $x$.

Examples:

$#\text{SAT}$: count the satisfying assignments of a CNF formula
$#\text{MATCH}$: count perfect matchings in a bipartite graph (this is the permanent of a 0-1 matrix)
$#\text{3-COL}$: count proper 3-colorings of a graph

Valiant’s theorem (1979). $#\text{SAT}$ is $#P$-complete. More strikingly, $#\text{MATCH}$ is also $#P$-complete - even though deciding whether a perfect matching exists is in $P$.

This is one of the sharpest separations in complexity theory: a decision problem can be easy while its counting version is as hard as counting satisfying assignments of arbitrary Boolean formulas. Knowing a solution exists tells you almost nothing about how many exist.

Toda’s theorem. The entire polynomial hierarchy is contained in polynomial time with a #P oracle:

$$PH \subseteq P^{#P}.$$

Counting is more powerful than alternating quantifiers. If you could count solutions exactly and efficiently, you could solve everything in NP and co-NP and $\Sigma_2^P$ and higher - all at once.

Interactive Proofs and IP = PSPACE

Classical proof systems have a prover submit a proof, and a verifier check it. Interactive proofs let them have a conversation.

Definition. $IP$ is the class of languages with an interactive proof system: an all-powerful prover and a polynomial-time probabilistic verifier exchange polynomially many messages. Requirements:

Completeness: for yes-instances, the honest prover makes the verifier accept with probability $\geq 2/3$.
Soundness: for no-instances, no prover strategy makes the verifier accept with probability $> 1/3$.

Theorem (Shamir, 1992). $IP = PSPACE$.

This is astonishing. Interactive proofs are exactly as powerful as polynomial space. The prover can convince the verifier of any fact in PSPACE - including things like “this quantified Boolean formula is false” - using a polynomial-length conversation.

The proof technique is arithmetization: turn Boolean formulas into polynomials over a field. The verifier asks the prover to evaluate these polynomials at random points; the Schwartz-Zippel lemma ensures that a lying prover is caught with high probability.

One consequence: graph non-isomorphism is in IP. The verifier randomly permutes one of two graphs and asks the prover to identify which one it came from. An honest prover (who can distinguish the graphs) always answers correctly; a prover trying to fake isomorphism answers correctly only half the time. Ten rounds reduce the soundness error below $2^{-10}$.

Zero-Knowledge Proofs

Interactive proofs let you prove something. Zero-knowledge proofs let you prove something without revealing why it’s true.

Consider 3-coloring: you claim to know a valid 3-coloring of a graph, and you want to prove it to a skeptic without revealing the coloring itself. Protocol: randomly permute your color names. Cover all vertices. The verifier picks a random edge. You reveal the two endpoints' colors. They’re different. Repeat for many rounds.

After $|E|$ rounds, the verifier is convinced with high probability that a valid coloring exists. But they learn nothing about the actual coloring - every round shows two distinct colors from a random permutation, which is information-theoretically uninformative about the original assignment.

Zero-knowledge proofs are used in cryptography, blockchain systems, and privacy-preserving authentication. They rely on computational assumptions, but the proof-of-concept works for any NP problem.

PPAD and Nash Equilibria

Some problems are guaranteed to have solutions - not by algorithm, but by mathematics. Fixed-point theorems (Brouwer, Kakutani) guarantee Nash equilibria exist for every finite game. But finding them is hard.

PPAD captures this kind of problem: a total search problem where a solution provably exists but finding it is computationally difficult. PPAD stands for “Polynomial Parity Argument on a Directed graph” - it exploits the fact that in certain graphs, every source has a corresponding sink.

Theorem. Computing a Nash equilibrium of a 2-player game is PPAD-complete.

PPAD sits between P and NP but is not known to be NP-complete. This is significant: the problem always has a solution (so it’s not an NP-complete decision problem in the usual sense), yet finding it appears to require exponential time in the worst case.

The practical implication: economists can’t efficiently compute market equilibria. This isn’t just a limitation of current algorithms - PPAD-completeness suggests the hardness is fundamental.

Discomfort check. With hundreds of complexity classes, most apparently distinct from P and NP, does any of this matter practically?

Yes, significantly. PPAD-completeness of Nash equilibria explains why large-scale market simulations can’t find exact equilibria. #P-hardness of counting explains why computing the partition function in statistical mechanics is hard, and why estimating network reliability is hard. IP = PSPACE means interactive protocols in cryptography have deep theoretical roots. The landscape matters when you want to know: is there a polynomial-time algorithm, or is the hardness provably fundamental? The answer often depends on which class your problem belongs to, not just whether it’s in NP.

Summary

Class	Definition	Complete problem	Key fact
$P$	Poly-time, deterministic	Circuit value problem	Closed under complement
$NP$	Poly-time verifiable certificate	SAT	$NP \subseteq \text{PSPACE}$
$\text{co-}NP$	Complement of NP	Tautology	$P = NP \Leftrightarrow NP = \text{co-}NP$? Open
$\Sigma_2^P$	$NP^{NP}$	$\exists \forall$ SAT	Collapses if $P = NP$
$PH$	$\bigcup_k \Sigma_k^P$	-	Lives inside PSPACE
$BPP$	Randomized, bounded error	-	Conjectured $= P$
$#P$	Counting NP witnesses	$#\text{SAT}$, permanent	$PH \subseteq P^{#P}$
$IP$	Interactive proofs	-	$= \text{PSPACE}$ (Shamir)
$PPAD$	Total search, fixed-point type	Nash equilibrium	Between P and NP
$\text{PSPACE}$	Poly-space	TQBF	$= IP$, contains all of PH

The deeper lesson: complexity is not a binary hard/easy distinction. It’s a landscape, with different kinds of hardness corresponding to different computational resources and proof strategies. P vs NP is the most famous open question, but the landscape around it is already rich with answers.

Read next:

Graph Coloring - Assigning Colors So No Two Neighbors Agree