Helpful context:


In 1735, Leonhard Euler was in Königsberg, a city in Prussia built on two islands where the Pregel River forked. The islands and riverbanks were connected by seven bridges. The locals had a recreational puzzle that had never been solved: could you take a walk through the city crossing each of the seven bridges exactly once?

People had tried and failed for years. Euler didn’t just solve the puzzle - he proved it was impossible, and in doing so invented an entirely new branch of mathematics.

His insight was to abstract away everything irrelevant. The physical layout of the city didn’t matter. All that mattered was which landmasses connected to which others, and how many bridges connected them. He replaced the city with dots (landmasses) and lines (bridges). And then he noticed something clean: a walk that crosses every edge exactly once - what we now call an Eulerian path - can only exist if at most two vertices have an odd number of connections. In Königsberg, all four landmasses had an odd number of bridges. No such walk could exist.

That abstraction - replace objects with dots, replace relationships with lines - is the definition of a graph.


Graphs: The Basics

A graph $G = (V, E)$ consists of a set of vertices (or nodes) $V$ and a set of edges $E$ connecting pairs of vertices.

Directed vs. undirected. In an undirected graph, edges have no direction: if $A$ connects to $B$, then $B$ connects to $A$. In a directed graph (digraph), edges are one-way: $(A, B)$ and $(B, A)$ are different edges. Road networks with one-way streets, web links, and dependency graphs are directed.

Weighted vs. unweighted. A weighted graph assigns a numerical value to each edge - distance, cost, capacity. An unweighted graph doesn’t.

Two representations dominate in practice:

Adjacency matrix. A $V \times V$ matrix where entry $(i, j) = 1$ (or the edge weight) if edge $(i, j)$ exists. Edge lookup is $O(1)$, but the matrix always uses $O(V^2)$ space - wasteful for sparse graphs where most entries are zero.

Adjacency list. For each vertex, store only the neighbors that actually exist. Space: $O(V + E)$. Edge lookup is $O(\text{degree}(v))$, but for sparse graphs (where $E \ll V^2$) this is much better. Most real graphs - road networks, social networks, the web - are sparse, so adjacency lists dominate.


BFS explores vertices in order of their distance from the source - first the source itself, then all its neighbors, then their neighbors, and so on. It uses a FIFO queue. Think of it as ripples spreading outward from a stone dropped in water: every node at distance $k$ is fully processed before any node at distance $k+1$.

graph LR subgraph L0["Distance 0"] S end subgraph L1["Distance 1"] A B end subgraph L2["Distance 2"] C D E end S --- A S --- B A --- C A --- D B --- D B --- E
from collections import deque

def bfs(graph, source):
    dist = {source: 0}
    parent = {source: None}
    queue = deque([source])

    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                parent[v] = u
                queue.append(v)

    return dist, parent

# Example
graph = {
    'S': ['A', 'B'],
    'A': ['S', 'C', 'D'],
    'B': ['S', 'D', 'E'],
    'C': ['A'],
    'D': ['A', 'B'],
    'E': ['B'],
}
dist, parent = bfs(graph, 'S')
# dist = {'S':0, 'A':1, 'B':1, 'C':2, 'D':2, 'E':2}

Time complexity. Each vertex is enqueued and dequeued once: $O(V)$. Each edge is examined once: $O(E)$. Total: $O(V + E)$.

What it gives you. After BFS from source $s$, dist[v] is the shortest path (fewest edges) from $s$ to $v$ in an unweighted graph. The parent array encodes the shortest-path tree.

Why a queue and not a stack? A queue enforces “first discovered, first processed.” This maintains the layer structure - you finish processing all vertices at distance $k$ before moving to distance $k+1$. A stack would give DFS instead, and DFS doesn’t give shortest paths.

Bipartiteness testing. A graph is bipartite if its vertices can be 2-colored so that every edge connects vertices of different colors. BFS tests this in $O(V + E)$: assign alternating colors to each layer. If you ever find an edge between two vertices of the same color, the graph isn’t bipartite. This is equivalent to checking that the graph has no odd-length cycles.


DFS dives as deep as possible before backtracking. It uses recursion (which implicitly uses the call stack).

graph LR A["A (d=1, f=8)"] -->|"tree"| B["B (d=2, f=7)"] B -->|"tree"| C["C (d=3, f=6)"] C -->|"tree"| D["D (d=4, f=5)"] A -.->|"forward"| D

DFS from A visits B, then C, then D. D finishes first (f=5), then C (f=6), B (f=7), A (f=8). The dashed edge A→D is a forward edge - D is already fully explored when A tries to visit it.

def dfs(graph):
    visited = set()
    in_stack = set()   # nodes on the current recursion path
    finish_order = []
    has_cycle = [False]

    def dfs_visit(u):
        visited.add(u)
        in_stack.add(u)
        for v in graph[u]:
            if v not in visited:
                dfs_visit(v)
            elif v in in_stack:
                has_cycle[0] = True  # v is an ancestor: cycle found
        in_stack.discard(u)
        finish_order.append(u)       # appended after all descendants finish

    for v in graph:
        if v not in visited:
            dfs_visit(v)

    return finish_order, has_cycle[0]

# Example: directed graph A→B→C→D, A→D
graph = {'A': ['B', 'D'], 'B': ['C'], 'C': ['D'], 'D': []}
finish_order, cycle = dfs(graph)
# finish_order = ['D', 'C', 'B', 'A']  (reversed = topological order)
# cycle = False

A node is appended to finish_order only after every descendant has been fully explored. So nodes with no outgoing edges (leaves, sinks) appear first, and the starting node appears last. Reversing finish_order gives topological order.

Time complexity. Same as BFS: $O(V + E)$.

A directed graph is acyclic (DAG) if and only if DFS finds no back edges - i.e., no neighbor is ever found to be on the current recursion stack.


Topological Sort

A topological ordering of a DAG is a linear order of vertices such that for every directed edge $(u \to v)$, $u$ comes before $v$ in the order. This is the natural order for dependency graphs: compile $A$ before anything that imports $A$, install packages before their dependents.

DFS-based topological sort: Run DFS on the DAG. Reverse finish_order - the last node appended (the one whose entire subtree was explored first) should appear last in the topological order.

Discomfort check. Why does reversing finish_order give topological order? If $(u, v)$ is a directed edge in a DAG, DFS explores $v$ completely and appends it to finish_order before it can return from $u$ and append $u$. So $v$ always appears earlier in finish_order than $u$. Reversing puts $u$ before $v$ - exactly what we want. Walk through a small example: a chain $A \to B \to C$. DFS from $A$ recurses into $B$, then into $C$. $C$ is appended first, then $B$, then $A$. finish_order = [C, B, A]. Reversed: [A, B, C]. Correct.

Time: $O(V + E)$.

Kahn’s algorithm (alternative): Maintain a queue of all vertices with in-degree 0 (nothing depends on them yet). Repeatedly remove a vertex, add it to the output, and decrement the in-degrees of its neighbors. If any neighbor’s in-degree reaches 0, add it to the queue. If the output has fewer than $V$ vertices when the queue empties, a cycle exists.

Here is a concrete dependency graph - think of it as a build system where main imports utils and config, both of which use io, which calls stdlib:

graph LR main --> utils main --> config utils --> io config --> io io --> stdlib

Topological order (one valid answer): main → config → utils → io → stdlib. Every node appears only after all its prerequisites.

from collections import deque, defaultdict

# --- Method 1: DFS-based (uses finish order from dfs() above) ---
def topo_sort_dfs(graph):
    finish_order, has_cycle = dfs(graph)
    if has_cycle:
        raise ValueError("Cycle detected - no topological order exists")
    return finish_order[::-1]  # reverse finish order

# --- Method 2: Kahn's algorithm ---
def topo_sort_kahns(graph):
    in_degree = defaultdict(int)
    for u in graph:
        for v in graph[u]:
            in_degree[v] += 1

    queue = deque(v for v in graph if in_degree[v] == 0)
    order = []

    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.append(v)

    if len(order) < len(graph):
        raise ValueError("Cycle detected - no topological order exists")
    return order

# Example
build = {
    'main':   ['utils', 'config'],
    'utils':  ['io'],
    'config': ['io'],
    'io':     ['stdlib'],
    'stdlib': [],
}
print(topo_sort_kahns(build))  # ['main', 'utils', 'config', 'io', 'stdlib']

Both methods are $O(V + E)$. Kahn’s has one practical advantage: detecting a cycle is natural (short output) and it processes nodes in a breadth-first wave rather than recursing deeply, avoiding stack overflow on large inputs.


Strongly Connected Components

In a directed graph, a strongly connected component (SCC) is a maximal set of vertices where every vertex can reach every other vertex. Two separate cycles with a one-way bridge between them are two distinct SCCs - you can get from one to the other but not back.

Kosaraju’s algorithm finds all SCCs in $O(V + E)$ using two DFS passes:

  1. Run DFS on $G$; record vertices in finish_order (deepest-explored nodes finish first).
  2. Build $G^T$ - the transpose of $G$, with every edge reversed.
  3. Process vertices from the end of finish_order backwards. For each unvisited vertex, run DFS on $G^T$ - every vertex reachable in this DFS is one SCC.

Concrete example. Two cycles connected by a one-way bridge:

graph LR subgraph SCC1["SCC 1: A, B, C"] A --> B B --> C C --> A end subgraph SCC2["SCC 2: D, E, F"] D --> E E --> F F --> D end C -->|"one-way bridge"| D

Step 1 - DFS on $G$ from A gives finish_order = [F, E, D, C, B, A] (D, E, F finish before C, B, A because they’re fully explored first via the bridge).

Step 2 - Reverse every edge. The bridge C→D becomes D→C in $G^T$:

graph LR subgraph SCC1["SCC 1: A, B, C"] B --> A C --> B A --> C end subgraph SCC2["SCC 2: D, E, F"] E --> D F --> E D --> F end D -->|"reversed bridge"| C

Step 3 - Process finish_order from the back (A first). DFS from A on $G^T$ reaches B and C but hits the reversed bridge D→C going the wrong way - it cannot cross into {D, E, F}. So the first DFS tree collects exactly {A, B, C}. Next unvisited vertex is D: DFS collects {D, E, F}.

The reversed bridge is the key insight: reversing edges keeps SCCs internally connected (a cycle reversed is still a cycle) but flips the direction of cross-SCC bridges, so DFS stays trapped within one SCC at a time.

from collections import defaultdict

def kosaraju(graph):
    # Step 1: DFS on original graph to get finish order
    visited = set()
    finish_order = []

    def dfs1(u):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                dfs1(v)
        finish_order.append(u)  # append after all descendants done

    for v in graph:
        if v not in visited:
            dfs1(v)

    # Step 2: Build transpose (reverse all edges)
    transposed = defaultdict(list)
    for u in graph:
        for v in graph[u]:
            transposed[v].append(u)

    # Step 3: DFS on transpose in reverse finish order
    visited.clear()
    sccs = []

    def dfs2(u, component):
        visited.add(u)
        component.append(u)
        for v in transposed[u]:
            if v not in visited:
                dfs2(v, component)

    for u in reversed(finish_order):
        if u not in visited:
            component = []
            dfs2(u, component)
            sccs.append(component)

    return sccs

# Example
graph = {
    'A': ['B'], 'B': ['C'], 'C': ['A', 'D'],
    'D': ['E'], 'E': ['F'], 'F': ['D'],
}
print(kosaraju(graph))  # [['A', 'C', 'B'], ['D', 'F', 'E']]

Proof of correctness. The claim is: each DFS call in step 3 outputs exactly one SCC.

To see why, think about the condensation of $G$ - the DAG formed by collapsing each SCC to a single node and keeping one edge between pairs of SCCs. The condensation is always a DAG (if it had a cycle, all the SCCs in that cycle would merge into one).

Key Lemma. If there is an edge from SCC $C$ to SCC $C'$ in the condensation (some path from $C$ to $C'$ in $G$ but not back), then the last vertex of $C$ to finish has a higher finish time than the last vertex of $C'$ to finish. More precisely, $f(C) > f(C')$ where $f(C) = \max_{v \in C} f(v)$.

Proof. Let $u$ be the first vertex discovered by DFS in either $C$ or $C'$.

  • If $u \in C$: at the moment $u$ is discovered, all vertices of $C'$ are unvisited. Since there is a path from $u$ (in $C$) to $C'$, the DFS subtree rooted at $u$ will eventually reach and fully explore $C'$ before $u$ finishes. Therefore every vertex in $C'$ finishes before $u$ finishes, so $f(C') < f(u) \leq f(C)$.
  • If $u \in C'$: there is no path from $C'$ back to $C$ (different SCCs, edge only goes $C \to C'$). So the DFS subtree rooted at $u$ explores all of $C'$ and never reaches $C$. All of $C'$ finishes before DFS ever visits $C$, giving $f(C') < f(C)$.

In both cases, $f(C) > f(C')$. $\square$

Consequence: the source is processed first. The vertex $u$ with the globally maximum finish time belongs to some SCC $C_1$. If $C_1$ had any incoming edge $C' \to C_1$ in the condensation, the Key Lemma would require $f(C') > f(C_1)$, contradicting $u$ being the global maximum. So $C_1$ has no incoming edges in the condensation of $G$ - it is a source.

Step 3 stays within one SCC. In $G^T$ (all edges reversed), the condensation is also reversed. Since $C_1$ was a source in $G$’s condensation, it becomes a sink in $G^T$’s condensation - no outgoing edges leave $C_1$ in $G^T$. DFS from $u \in C_1$ in $G^T$ can only travel along $G^T$ edges and is therefore confined to $C_1$.

Meanwhile, every vertex in $C_1$ is reachable from $u$ in $G^T$: since $u$ and $v$ are in the same SCC, there is a path from $v$ to $u$ in $G$, and reversing it gives a path from $u$ to $v$ in $G^T$. So DFS from $u$ collects exactly $C_1$.

Induction. After marking $C_1$ as visited, the remaining unvisited vertices form a subgraph. The next unvisited vertex with the highest finish time belongs to the source SCC of that subgraph’s condensation, and the same argument applies. Each DFS call in step 3 identifies exactly one SCC, peeling the condensation from source to sink one layer at a time. $\square$


Connected Components, Bridges, and Articulation Points

In an undirected graph:

Connected components. Run BFS or DFS; each new call to DFS_Visit from the outer loop starts a new connected component.

Articulation points (cut vertices). A vertex whose removal disconnects the graph. Found in $O(V + E)$ by a single DFS: vertex $u$ is an articulation point if it has a child $v$ in the DFS tree with no back edge to any proper ancestor of $u$.

Bridges. An edge whose removal disconnects the graph. Found simultaneously with articulation points using the same DFS.

These structures matter for network reliability: articulation points are single points of failure, bridges are critical links.


When to Use Which

The choice follows from what question you’re actually trying to answer.

Reach for BFS when distance matters:

  • “What is the shortest path (fewest hops) from X to Y?” - BFS gives shortest paths in unweighted graphs
  • “What is the nearest node satisfying some property?” - BFS finds it in minimum steps
  • “Is this graph bipartite / 2-colorable?” - BFS alternates colors layer by layer; a conflict means odd cycle
  • “Process nodes level by level” - BFS’s queue naturally maintains the layer boundary

Reach for DFS when structure matters:

  • “Does this graph contain a cycle?” - a GRAY neighbor during DFS is a back edge, which means a cycle
  • “Is this a DAG?” - no back edges found = DAG
  • “Find all paths from X to Y” - DFS commits to one path at a time and backtracks
  • “Topological ordering of a DAG” - reverse finish_order from DFS
  • “Find strongly connected components” - Kosaraju’s algorithm uses two DFS passes
  • Maze solving, puzzle solving - any “explore a choice, and undo if it fails” pattern maps to DFS

Reach for topological sort when ordering dependencies:

  • “In what order should I execute these tasks?” - build systems, course prerequisites, package installs
  • “Is there a valid schedule at all?” - if Kahn’s empties the queue before outputting all nodes, there is a circular dependency and no valid order exists
  • “What must happen before what?” - topological sort makes the implicit dependency chain explicit

The quick mental test: do you need to know how far something is? → BFS. Do you need to know the full structure or a valid path? → DFS. Do you need an ordering that respects dependencies? → topological sort.


Summary

Algorithm Time Space What it finds
BFS $O(V + E)$ $O(V)$ Shortest paths (unweighted), bipartiteness
DFS $O(V + E)$ $O(V)$ Cycle detection, finish order, DAG check
Topological sort (DFS) $O(V + E)$ $O(V)$ Linear order respecting all directed edges
Topological sort (Kahn’s) $O(V + E)$ $O(V)$ Same, also detects cycles
Kosaraju’s SCCs $O(V + E)$ $O(V)$ All strongly connected components
Articulation points $O(V + E)$ $O(V)$ Vertices whose removal disconnects graph

BFS and DFS are the two fundamental primitives. Almost every other graph algorithm uses one or both as a subroutine. The choice between them comes down to what you need: BFS for shortest paths and level structure, DFS for tree structure, ordering, and cycle detection.

Euler’s insight in 1735 - that the structure of connections matters more than the physical layout - is still the right way to think about computational problems involving relationships. Abstract away the details. Keep only the vertices and edges.


Read next: