Automata Theory & FSMs // Megha Bose

Prerequisite: Think Like a Programmer

What Is a Finite Automaton?

A finite automaton is the simplest model of computation: a machine with a fixed, finite amount of memory. Despite their simplicity, finite automata underlie lexical analysis in every compiler, regular expression engines, and network packet filters. Understanding them precisely is the foundation for understanding the limits of computation.

Formally, a deterministic finite automaton (DFA) is a 5-tuple $M = (Q, \Sigma, \delta, q_0, F)$ where:

$Q$ is a finite set of states
$\Sigma$ is a finite input alphabet
$\delta: Q \times \Sigma \to Q$ is the transition function
$q_0 \in Q$ is the start state
$F \subseteq Q$ is the set of accept states

The machine reads an input string $w = w_1 w_2 \cdots w_n$ one symbol at a time. Starting in state $q_0$, after reading $w_i$ it moves to state $\delta(q, w_i)$ where $q$ is the current state. We extend $\delta$ to strings by defining $\hat{\delta}(q, \varepsilon) = q$ and $\hat{\delta}(q, wa) = \delta(\hat{\delta}(q, w), a)$. The language recognized by $M$ is:

$$L(M) = {w \in \Sigma^\ast : \hat{\delta}(q_0, w) \in F}$$

Nondeterministic Finite Automata

A nondeterministic finite automaton (NFA) relaxes the transition function to $\delta: Q \times \Sigma_\varepsilon \to \mathcal{P}(Q)$, where $\Sigma_\varepsilon = \Sigma \cup {\varepsilon}$ and $\mathcal{P}(Q)$ is the power set of $Q$. At each step the machine can move to any subset of states, and it also allows $\varepsilon$-transitions that consume no input. The NFA accepts if any computation path reaches an accept state.

This nondeterminism does not add computational power. The subset construction converts any NFA with $n$ states into an equivalent DFA.

Subset construction. Given NFA $N = (Q, \Sigma, \delta, q_0, F)$, define DFA $M = (Q', \Sigma, \delta', q_0', F')$ where:

$Q' = \mathcal{P}(Q)$ (each DFA state is a set of NFA states)
$q_0' = \varepsilon\text{-closure}({q_0})$
$\delta'(R, a) = \varepsilon\text{-closure}!\left(\bigcup_{r \in R} \delta(r, a)\right)$
$F' = {R \in Q' : R \cap F \neq \emptyset}$

The worst case produces $2^n$ DFA states, and this is tight: there exist NFAs for which every equivalent DFA requires exponentially many states.

Regular Languages

The class of languages recognized by DFAs (equivalently, NFAs) is called the regular languages. This class is closed under the following operations, each provable by NFA construction:

Union. If $L_1, L_2$ are regular, so is $L_1 \cup L_2$. Proof: create a new start state with $\varepsilon$-transitions to the start states of both automata.

Concatenation. If $L_1, L_2$ are regular, so is $L_1 \circ L_2$. Proof: add $\varepsilon$-transitions from every accept state of the first NFA to the start state of the second.

Kleene star. If $L$ is regular, so is $L^\ast$. Proof: add a new accept start state with an $\varepsilon$-transition to the original start state, and add $\varepsilon$-transitions from all accept states back to the original start state.

Complement and intersection also preserve regularity, since DFAs can be complemented by swapping accept and non-accept states, and intersection follows from De Morgan’s law.

Kleene’s theorem states that the regular languages are exactly the languages described by regular expressions, where regular expressions are built from $\emptyset$, $\varepsilon$, single symbols, union $+$, concatenation, and star $*$.

The Pumping Lemma

To prove a language is not regular, we use the Pumping Lemma.

Theorem. If $L$ is a regular language, then there exists a pumping length $p \geq 1$ such that any string $w \in L$ with $|w| \geq p$ can be written as $w = xyz$ satisfying:

$|y| \geq 1$
$|xy| \leq p$
$xy^i z \in L$ for all $i \geq 0$

Proof sketch. Let $p = |Q|$. For any $w$ with $|w| \geq p$, the DFA visits at least $p+1$ states while processing $w$, so by the pigeonhole principle some state $q_j$ repeats. Let $y$ be the portion of $w$ that drives the DFA around this loop. Pumping $y$ any number of times traverses the same loop, staying in $L$.

Example: ${0^n 1^n : n \geq 0}$ is not regular.

Suppose for contradiction it is regular with pumping length $p$. Take $w = 0^p 1^p \in L$. By the lemma, $w = xyz$ with $|xy| \leq p$ and $|y| \geq 1$. Since $|xy| \leq p$, the substring $y$ lies entirely within the block of $0$s, so $y = 0^k$ for some $k \geq 1$. Then $xy^2z = 0^{p+k} 1^p \notin L$ since $p + k \neq p$. Contradiction.

The Myhill-Nerode Theorem

The Pumping Lemma is a necessary condition for regularity. The Myhill-Nerode theorem gives a necessary and sufficient condition, and also explains DFA minimization.

For a language $L \subseteq \Sigma^\ast$, define the equivalence relation $\equiv_L$ on $\Sigma^\ast$ by:

$$x \equiv_L y \iff \forall z \in \Sigma^\ast,\ xz \in L \iff yz \in L$$

Two strings are $\equiv_L$-equivalent if no continuation distinguishes them with respect to $L$.

Myhill-Nerode Theorem. $L$ is regular if and only if $\equiv_L$ has finitely many equivalence classes. Moreover, the number of classes equals the number of states in the minimum DFA for $L$.

This gives both a non-regularity proof technique (exhibit infinitely many pairwise distinguishable strings) and a minimization algorithm: identify states in the DFA corresponding to the same $\equiv_L$ class and merge them.

Minimization procedure. Start by partitioning states into accept and non-accept groups. Iteratively refine: two states are in the same class only if they have the same transition target class for every symbol. Repeat until no refinement is possible. This runs in $O(n \log n)$ time with the Hopcroft algorithm.

Examples

Example 1: DFA for strings over ${0,1}$ ending in $01$. States: $q_0$ (start, last symbol unknown or string empty), $q_1$ (last symbol was $0$), $q_2$ (last two symbols were $01$, accept). Transitions: $\delta(q_0, 0) = q_1$, $\delta(q_0, 1) = q_0$, $\delta(q_1, 0) = q_1$, $\delta(q_1, 1) = q_2$, $\delta(q_2, 0) = q_1$, $\delta(q_2, 1) = q_0$. Accept state: ${q_2}$.

Example 2: Proving ${ww : w \in {0,1}^\ast}$ is not regular. For each $n$, consider the strings $0^n$. For distinct $i \neq j$, the string $0^i$ is distinguishable from $0^j$ by the continuation $z = 0^i$: $0^i \cdot 0^i = 0^{2i} = ww$ with $w = 0^i$ is in $L$, but $0^j \cdot 0^i$ has odd length $i + j$ if $i \neq j$ which may or may not be in $L$ - a careful argument using distinguishing suffixes shows all $0^i$ are pairwise $\equiv_L$-inequivalent. Since infinitely many classes exist, $L$ is not regular by Myhill-Nerode.

Applications

Finite automata are not just theoretical constructs:

Lexical analysis. Every compiler’s tokenizer is a DFA. Regular expressions for keywords, identifiers, and literals are compiled to a single DFA via the subset construction for fast scanning.
Network packet filtering. Firewall rules and intrusion detection systems use DFAs to match byte patterns in network streams at line speed.
String searching. The Knuth-Morris-Pratt algorithm is essentially a DFA construction for a pattern string, achieving $O(n)$ search in $O(m)$ preprocessing.
Protocol verification. Communication protocols are modeled as automata, enabling formal verification of correctness properties.

Read Next: Context-Free Grammars & Parsing