Amortized Analysis - The True Cost of a Sequence // Megha Bose

Helpful context:

Data Structures - Storage Is Strategy

A dynamic array doubles its capacity when full. You’re appending elements one by one. Most appends are instant - you write one value into the next slot and you’re done. But every so often, the array is full and you have to allocate a new, larger block of memory and copy everything over. If the array has $n$ elements, that copying costs $O(n)$.

Does this mean dynamic array append is an $O(n)$ operation?

The honest answer is: yes, sometimes, for that single operation. But if you ask “what’s the total cost of $n$ appends, starting from an empty array?”, the answer is $O(n)$ - linear, not quadratic. The cost of those rare expensive operations gets spread across the many cheap ones.

This is amortized analysis: instead of asking “how expensive can a single operation be?”, ask “what is the average cost per operation across any sequence of operations?” The average is sometimes dramatically lower than the worst-case single-operation cost. And crucially, the amortized bound is exact - not an approximation.

Not the same as average-case analysis. This is a common point of confusion. Average-case analysis asks: “for a random input, what is the expected cost of one operation?” It assumes something about the distribution of inputs. Amortized analysis makes no assumptions about inputs - it is a worst-case guarantee. It asks: “over any sequence of $n$ operations, what is the cost per operation in the worst case?” The “average” in amortized is an average over time (across a sequence of operations), not an average over inputs. Similarly, amortized analysis is not best-case analysis (which describes the most favorable single input) or worst-case single-operation analysis (which describes the most expensive single call). It is a tighter, more informative statement about sequences.

The Core Idea

Amortized analysis is useful when:

Some operations in a data structure are expensive
Those expensive operations are rare
The expensive operations are always preceded by many cheap operations that “set up” the expensive one

The classic examples are dynamic arrays (rare doubling followed by many cheap appends), hash tables (rare rehashing followed by many cheap lookups), and certain tree structures that occasionally restructure themselves.

There are three standard techniques. All three give the same final bound - they differ in what intuition they expose and which is easiest to apply to a given problem.

Method 1: The Aggregate Method

The aggregate method is the most direct. Compute the total cost $T(n)$ of any sequence of $n$ operations, then divide: the amortized cost per operation is $T(n)/n$.

Dynamic array append, analyzed by aggregate.

Start with an empty array of capacity 1. Do $n$ appends. Regular appends (no resizing) cost 1 each. A resize at capacity $k$ copies all $k$ elements, costing $k$.

Resizes happen when the array is full: at sizes 1, 2, 4, 8, 16, …, up to $2^{\lfloor \log n \rfloor}$. The total copying cost is:

$$1 + 2 + 4 + 8 + \cdots + 2^{\lfloor \log n \rfloor} \lt 2n$$

(Geometric series: each term is twice the previous, so the sum is less than twice the last term, which is at most $n$.)

Total cost of $n$ appends: at most $n$ (the regular appends) plus less than $2n$ (the copying) equals less than $3n$. Amortized cost per append: $3n/n = 3 = O(1)$.

The worst-case cost of a single append is $O(n)$. But amortized over a sequence of $n$ appends, each costs $O(1)$.

Method 2: The Accounting Method

The accounting method assigns each operation a “charge” - an amount of credit it pays when it runs. Cheap operations can be charged more than their actual cost, storing the surplus as credit. Expensive operations draw on stored credit. The rule: credit can never go negative (you can’t borrow against the future).

If you can assign charges so that credit is always non-negative, and every operation’s charge is $O(f)$, then the amortized cost per operation is $O(f)$.

Dynamic array, analyzed by accounting.

Charge each append 3 credits: 1 to pay for the append itself, and 2 stored on the new element.

When a resize occurs on an array of capacity $k$ (it was half-full, so $k/2$ elements are being copied):

The last $k/2$ elements inserted (since the previous resize) each stored 2 credits.
Total stored credit: $k/2 \times 2 = k$ credits.
Cost of the resize: copying $k$ elements costs $k$.
Credit balance after resize: $k - k = 0$.

Credit never goes negative. Every append is charged 3 credits = $O(1)$. Therefore the amortized cost per append is $O(1)$.

The accounting method makes explicit the “pre-payment” structure: cheap operations save credit that future expensive operations will consume. When you can make this story precise (each cheap operation saves exactly enough for the next expensive one), the proof falls into place.

Method 3: The Potential Method

The potential method is the most general. Think of it like stored energy in a physical system. Define a potential function $\Phi$ that maps the current state of the data structure to a non-negative number. The potential captures “how much deferred work is waiting.”

The amortized cost of operation $i$ is defined as:

$$\hat c_i = c_i + \Delta\Phi_i = c_i + \Phi(D_i) - \Phi(D_{i-1})$$

where $c_i$ is the actual cost and $D_i$ is the state after operation $i$.

Summing over all $n$ operations:

$$\sum_{i=1}^{n} \hat c_i = \sum_{i=1}^{n} c_i + \Phi(D_n) - \Phi(D_0)$$

So long as $\Phi(D_n) \geq \Phi(D_0)$ (we start at low potential and end at at least as much potential), the total actual cost is at most the total amortized cost. This means: if you can show $\hat c_i = O(f)$ for every operation, the total actual cost is $O(nf)$.

The trick is choosing $\Phi$ so that expensive operations see a drop in potential that absorbs their actual cost, while cheap operations see a slight rise in potential (storing “energy” for later).

Dynamic array, analyzed by potential.

Define $\Phi(D) = 2 \cdot \text{size}(D) - \text{capacity}(D)$.

Right after a resize, size is $k+1$ (the element that triggered it plus all previous) and capacity is $2k$, so $\Phi = 2(k+1) - 2k = 2$. The potential is low, as expected - the array just expanded and has lots of room.

As elements are added, potential grows: when the array is full ($\text{size} = k$, $\text{capacity} = k$), $\Phi = 2k - k = k$. The potential is high, encoding that a resize is imminent.

Ordinary append (no resize): $c_i = 1$, size increases by 1, capacity unchanged, $\Delta\Phi = 2$.

$$\hat c_i = 1 + 2 = 3 = O(1)$$

Resize append: array is full ($\text{size} = k = \text{capacity}$), doubles to $2k$. Cost: copy $k$ elements and do 1 append, so $c_i = k + 1$. After resize: size $= k+1$, capacity $= 2k$, $\Phi$ goes from $k$ to $2(k+1) - 2k = 2$. So $\Delta\Phi = 2 - k$.

$$\hat c_i = (k+1) + (2-k) = 3 = O(1)$$

In both cases the amortized cost is $O(1)$. The potential drops sharply during the resize, absorbing the large actual cost and keeping $\hat c_i$ small.

Discomfort check. “Why not just say worst-case $O(n)$ for append and be done with it?” Because $O(n)$ is catastrophically pessimistic for sequences of appends. If you believe each append costs $O(n)$, you’d estimate $n$ appends costs $O(n^2)$. But we just proved the total is $O(n)$. Using the per-operation worst case as if it applies to every operation overcounts by a factor of $n$. Dynamic arrays would look no better than linked lists, which is wrong - they’re dramatically faster in practice. Amortized analysis gives you the accurate picture.

Application: Splay Trees

Splay trees are binary search trees with no balance information stored at all. Every access - search, insert, delete - rotates the accessed node all the way to the root (this is called “splaying”). Over time, frequently accessed nodes migrate toward the root.

The potential function is:

$$\Phi(T) = \sum_{x \in T} \log(\text{size of subtree rooted at } x)$$

This captures the “disorder” in the tree. The amortized cost of splaying node $x$ to the root is $O(\log n)$. Any sequence of $m$ operations on a splay tree with $n$ elements costs $O((m + n) \log n)$ total - so $O(\log n)$ amortized per operation.

What’s remarkable: splay trees achieve this without storing any balance information at all, and they adapt to access patterns. If you repeatedly access the same element, it stays at the root, making subsequent accesses $O(1)$.

Application: Union-Find

Union-Find supports two operations on a collection of disjoint sets:

Find(x): return the representative (root) of x’s set.
Union(x, y): merge the sets containing x and y.

With two optimizations - path compression (during Find, rewire all nodes on the path to point directly to the root) and union by rank (always attach the smaller tree under the larger one) - both operations have amortized cost $O(\alpha(n))$, where $\alpha$ is the inverse Ackermann function.

The inverse Ackermann function grows so slowly that it’s effectively constant for any practical input size. For $n$ less than the number of atoms in the observable universe, $\alpha(n) \leq 4$. This means Union-Find is essentially $O(1)$ per operation in practice.

The proof is one of the more involved amortized analyses in computer science - the potential function captures the structure of the trees and how path compression changes it. But the punchline is simple: path compression makes future operations dramatically cheaper by flattening the tree, and union by rank keeps the tree shallow enough that path compression doesn’t have to do too much work.

Summary

Method	Core idea	When to use
Aggregate	Sum total cost, divide by $n$	Simple operations where you can count total work
Accounting	Assign credits; cheap ops save for expensive ones	When there’s a natural “pre-payment” story
Potential	Define energy $\Phi$; amortized cost = actual + $\Delta\Phi$	Complex structures; most general; finds the right $\Phi$

The choice of method is a matter of what makes the proof clearest. The potential method is the most general: once you find the right $\Phi$, the algebra is mechanical. Finding the right $\Phi$ is the art - but a good heuristic is that $\Phi$ should be large just before expensive operations and small just after.

Read next:

Divide & Conquer - Small Savings That Compound