Prerequisite: How to Think Like a Programmer


Almost every discipline reaches a phase where joining it with mathematics becomes essential. It’s not surprising, then, that algorithmic thinking eventually incorporates mathematics to solve problems more effectively and to analyze them with greater sophistication. However, there’s a misconception that almost everything about an algorithm is tied to mathematics. Just like physics, chemistry, economics, or any other subject, algorithms have a huge qualitative aspect as well. At its core, an algorithm just needs to be a logical sequence of actions that performs a task.

Take, for example, frying potatoes. You buy them, clean them, peel them, cut them, and then fry them in the pan. There’s no quantitative aspect in the sense of exact measurements, but to give the recipe real meaning we have to include considerations such as how many potatoes are needed, their cost, the amount of water for cleaning, the size of the cut pieces, the amount of oil to use, the pan’s size, the frying duration, and so on. Slight changes in these values may not matter much in this case, but if someone fries 2 potatoes in a liter of oil, that is surely a path to hell. Mathematics comes to the rescue by introducing tools to make better sense of the quantitative part. This gives us flexibility in tweaking values and quantities, which, in this case, means more creative freedom: by adjusting the amount of oil or the shape and size of the cut pieces, we can prepare various dishes like shallow-fried potatoes, deep-fried potatoes, french fries, or chips.

Thinking quantitatively proves beneficial in the long run. Using precise values might seem excessive in simple cases, but as we tackle more complex real-world problems, precision and an understanding of quantities become crucial. The difference can carry significant resource implications, especially at large scale with billions of humans in the loop, and can even be a matter of life or death. A byproduct of being able to quantify things is the ability to compare. When dealing with real quantities, the superpower of comparison aids decision-making: it lets us know what to select or reject, what to focus on, and what is important; it lets us tweak values and understand the dependencies between different entities and their quantities deeply enough to build better algorithms.

So quantitative thinking gets us to the point where we can compare algorithms - but how do we define the metric by which we decide which algorithm is better?

First, we need to ensure the algorithm is correct. Then we look at how efficient it is. Consider the task of multiplying two numbers. One method is repeated addition; another is the method we all learned in school - multiply one number with each digit of the other, shift, and add. The second method is clearly better, but why? It minimizes computational operations (relating to time) and eases the cognitive load of memorizing intermediate values (relating to memory). We want algorithms that optimize both time and memory requirements. But how do we quantify these?
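To make the contrast concrete, here is a minimal Python sketch (the function names and the operation counting are illustrative, and single-digit products are treated as unit cost) comparing the two methods by the number of additions each performs:

```python
def multiply_by_repeated_addition(a, b):
    """Add a to itself b times: roughly b additions."""
    total, additions = 0, 0
    for _ in range(b):
        total += a
        additions += 1
    return total, additions

def multiply_grade_school(a, b):
    """Multiply a by each digit of b, shift, and add: one partial product per digit."""
    total, additions = 0, 0
    for position, digit in enumerate(reversed(str(b))):
        partial = a * int(digit) * (10 ** position)  # single-digit product, shifted into place
        total += partial
        additions += 1
    return total, additions

print(multiply_by_repeated_addition(37, 4581))  # (169497, 4581) - additions grow with the value of b
print(multiply_grade_school(37, 4581))          # (169497, 4)    - additions grow with the number of digits of b
```

Repeated addition does work proportional to the value of the multiplier, while shift-and-add does work proportional to its number of digits - a gap that widens dramatically as the numbers grow.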

Enter complexity theory - a mathematical framework for analyzing algorithmic efficiency.

Asymptotic Notation

We describe how an algorithm’s resource usage scales with input size $n$ using asymptotic notation:

Big-O - $O(g(n))$ (loose upper bound)

$f(n) = O(g(n))$ if there exist positive constants $c$ and $n_0$ such that $0 \leq f(n) \leq c \cdot g(n)$ for all $n \geq n_0$. Intuitively, $f$ grows no faster than $g$ in the long run.
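As a quick worked example, $f(n) = 3n + 5$ is $O(n)$: choosing $c = 4$ and $n_0 = 5$ gives $0 \leq 3n + 5 \leq 4n$ for every $n \geq 5$.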

Little-o - $o(g(n))$ (strict upper bound)

$f(n) = o(g(n))$ if for all positive constants $c$, there exists $n_0$ such that $0 \leq f(n) < c \cdot g(n)$ for all $n \geq n_0$. The growth rate of $f$ is strictly less than that of $g$. Note that $o(n)$ implies $O(n)$, but not vice-versa - $\log n$ is both $o(n)$ and $O(n)$, but $n$ is only $O(n)$.
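A handy equivalent test for the nonnegative functions used here: $f(n) = o(g(n))$ exactly when $\lim_{n \to \infty} f(n)/g(n) = 0$. For $\log n$ versus $n$, the ratio $(\log n)/n$ tends to $0$, which is why $\log n = o(n)$.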

Big-Omega - $\Omega(g(n))$ (loose lower bound)

$f(n) = \Omega(g(n))$ if there exist positive constants $c$ and $n_0$ such that $0 \leq c \cdot g(n) \leq f(n)$ for all $n \geq n_0$. Intuitively, $f$ grows at least as fast as $g$.
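Continuing the earlier example, $3n + 5 = \Omega(n)$: choosing $c = 3$ and $n_0 = 1$ gives $0 \leq 3n \leq 3n + 5$ for every $n \geq 1$.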

Little-omega - $\omega(g(n))$ (strict lower bound)

$f(n) = \omega(g(n))$ if for all positive constants $c$, there exists $n_0$ such that $0 \leq c \cdot g(n) < f(n)$ for all $n \geq n_0$. The growth rate of $f$ is strictly greater than $g$. Similarly, $\omega(n)$ implies $\Omega(n)$, but not vice-versa - $n^2$ is both $\omega(n)$ and $\Omega(n)$, but $n$ is only $\Omega(n)$.
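The matching limit test: $f(n) = \omega(g(n))$ exactly when $\lim_{n \to \infty} f(n)/g(n) = \infty$. For $n^2$ versus $n$, the ratio is $n$, which grows without bound, so $n^2 = \omega(n)$.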

Big-Theta - $\Theta(g(n))$ (tight/exact bound)

$f(n) = \Theta(g(n))$ if it is both $O(g(n))$ and $\Omega(g(n))$. This pins down the exact growth rate. Note that there is no “little-theta” - an algorithm cannot simultaneously be $o(g(n))$ and $\omega(g(n))$, since no function can be strictly above and strictly below $g(n)$ at the same time.
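Putting the running example together: since $3n + 5 = O(n)$ (with $c = 4$, $n_0 = 5$) and $3n + 5 = \Omega(n)$ (with $c = 3$, $n_0 = 1$), we get $3n + 5 = \Theta(n)$.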

Reading the Notations Intuitively

If your algorithm runs in:

  • $O(n)$ - it is upper-bounded by some linear growth, but may be faster. A loose ceiling.
  • $o(n)$ - its growth is strictly sub-linear, so it eventually beats any linear algorithm; e.g. $O(\log n)$.
  • $\Omega(n)$ - it takes at least linear time; a loose floor.
  • $\omega(n)$ - its growth is strictly super-linear, so it is eventually slower than any linear algorithm; e.g. $\Omega(n^2)$.
  • $\Theta(n)$ - it grows exactly linearly; the tight characterization.

These notations are the standard vocabulary for analyzing how an algorithm’s performance scales - and they are why a single well-chosen algorithm can mean the difference between a system that handles millions of requests per second and one that grinds to a halt.
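To see what this looks like in code, here is a small Python sketch (purely illustrative, not a benchmark) contrasting a $\Theta(n)$ worst-case linear scan with an $O(\log n)$ binary search over a sorted list:

```python
def linear_search(items, target):
    """Scan every element: Theta(n) comparisons in the worst case."""
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1

def binary_search(items, target):
    """Halve the sorted range each step: O(log n) comparisons."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

data = list(range(1_000_000))        # a million sorted numbers
print(linear_search(data, 999_999))  # 999999, found after about a million comparisons
print(binary_search(data, 999_999))  # 999999, found after about twenty halvings
```

On a million elements the gap is roughly a million comparisons versus about twenty - exactly the kind of difference the notation is designed to expose.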

How the notations relate to each other:

  • $o(g)$ (strict upper bound) implies $O(g)$ (upper bound)
  • $\Theta(g)$ (tight bound) implies both $O(g)$ (upper bound) and $\Omega(g)$ (lower bound)
  • $\omega(g)$ (strict lower bound) implies $\Omega(g)$ (lower bound)

How fast do these actually grow? At input size $n = 8$:

  • $O(1)$: 1 operation
  • $O(\log n)$: 3 operations
  • $O(n)$: 8 operations
  • $O(n \log n)$: 24 operations
  • $O(n^2)$: 64 operations
  • $O(2^n)$: 256 operations

Read Next: Data Structures · Searching Algorithms & Binary Search