Databases & Indexes
Prerequisite: Data Structures: Arrays to Queues · System Design Fundamentals
A database with no indexes is just an expensive file. Every query that filters or sorts must read every row - a full table scan - and at millions of rows that becomes the bottleneck. Indexes are the primary tool for making queries fast, and understanding how they work tells you both when to add them and when to leave them out.
Why Indexes Exist
Without an index, finding all orders for a given user requires reading every row in the orders table and checking the user_id column. That is $O(n)$ work. With a B-tree index on user_id, the database traverses a balanced tree to find the relevant leaf pages - $O(\log n)$ work, plus it reads only the matching rows.
At a table size of one thousand rows the difference is imperceptible. At one hundred million rows it is the difference between a 50ms query and one that takes minutes.
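The scan-versus-index decision is visible in any planner. As a small sketch, Python's built-in sqlite3 module shows the same behavior (the table and index names here are illustrative; PostgreSQL's EXPLAIN output looks different, but the planner decision is analogous):

```python
import sqlite3

# Illustrative schema; sqlite3 stands in for a full relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders (user_id, total) VALUES (?, ?)",
                 [(i % 100, i * 1.5) for i in range(10_000)])

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
    return " | ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))

q = "SELECT * FROM orders WHERE user_id = 42"
p_scan = plan(q)          # no index yet: full table scan, O(n)
conn.execute("CREATE INDEX idx_orders_user_id ON orders(user_id)")
p_index = plan(q)         # index search: O(log n) descent to matching rows
print(p_scan)
print(p_index)
```

Before the index the plan reads every row; after it, the plan descends the tree directly to the matching entries.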
B-Tree Indexes
B-tree (balanced tree) is the default index type in PostgreSQL, MySQL, and most relational databases. The structure keeps data sorted and balanced so any key can be found in $O(\log n)$ comparisons regardless of table size.
B-trees support:

- Equality: WHERE user_id = 42
- Range queries: WHERE created_at BETWEEN '2024-01-01' AND '2024-03-31'
- Prefix matching on strings: WHERE username LIKE 'meg%'
- ORDER BY and MIN/MAX without a sort step
Because the tree is sorted, range queries are efficient - the database finds the start of the range and scans forward through leaf pages.
Hash Indexes
Hash indexes store a hash of the indexed value pointing to the heap location. Lookup is $O(1)$ on average for equality queries. The trade-off: because hashing destroys ordering, hash indexes cannot serve range queries, prefix matches, or ORDER BY. They are useful for exact-match lookups on high-cardinality columns (UUIDs, session tokens) where you never need ranges.
PostgreSQL’s hash indexes were historically unreliable during crashes and only became WAL-safe in version 10. In practice most engineers default to B-trees even for equality-only workloads, since B-tree equality is only slightly slower and the index is more versatile.
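A toy model makes the trade-off concrete. A Python dict behaves like a hash index over a heap of rows (this is an illustration of the principle, not how any real engine stores data):

```python
# Hypothetical heap of (session_token, user) rows.
heap = [("sess_a", "alice"), ("sess_b", "bob"), ("sess_c", "carol")]

# Toy hash index: key -> list of heap positions. Python hashes the key internally.
hash_index = {}
for pos, (token, user) in enumerate(heap):
    hash_index.setdefault(token, []).append(pos)

# Equality lookup: O(1) average, no scan of the heap.
result = [heap[p] for p in hash_index["sess_b"]]
print(result)

# Range query: hashing destroyed the ordering, so the only option is a full scan.
in_range = [row for row in heap if "sess_a" <= row[0] <= "sess_b"]
print(in_range)
```

The equality lookup jumps straight to the row; the range predicate has no choice but to visit every row.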
When Indexes Hurt
Every index you add must be maintained on every write. When a row is inserted, updated, or deleted, the database updates all indexes on that table. A table with eight indexes performs roughly nine physical writes for every logical row change - one to the heap plus one per index. This matters for:
- Bulk loads: Disable indexes, load data, rebuild. Loading with indexes active is dramatically slower.
- High write throughput tables: A time-series events table that receives thousands of inserts per second with six indexes on it may spend more time maintaining indexes than storing data.
- Small tables: If a table has five hundred rows, the query planner will often choose a sequential scan over an index anyway, since reading all pages costs less than following index pointers.
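The bulk-load pattern from the first bullet can be sketched with sqlite3 (the table and index names are illustrative; in PostgreSQL you would DROP INDEX, COPY, then CREATE INDEX the same way):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, ts INTEGER, kind TEXT)")
conn.execute("CREATE INDEX idx_events_ts ON events(ts)")
conn.execute("CREATE INDEX idx_events_kind ON events(kind)")

rows = [(i, "click" if i % 2 else "view") for i in range(50_000)]

# Bulk-load pattern: drop secondary indexes, load everything, rebuild once.
# Each index is then built in one sorted pass instead of 50,000 incremental updates.
conn.execute("DROP INDEX idx_events_ts")
conn.execute("DROP INDEX idx_events_kind")
conn.executemany("INSERT INTO events (ts, kind) VALUES (?, ?)", rows)
conn.execute("CREATE INDEX idx_events_ts ON events(ts)")
conn.execute("CREATE INDEX idx_events_kind ON events(kind)")
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)
```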
Composite Indexes
A composite index covers multiple columns: CREATE INDEX idx_orders_user_status ON orders(user_id, status). Column order is critical.
This index helps:
- WHERE user_id = 42 (leftmost-prefix rule)
- WHERE user_id = 42 AND status = 'shipped'
This index does not help:
- WHERE status = 'shipped' (skips the leading column)
Think of composite indexes like a phone book sorted by last name, then first name. You can efficiently find everyone named “Smith” (leading column), or everyone named “Smith, John” (both columns). But finding everyone named “John” without knowing their last name requires scanning the whole book.
When designing a composite index, put the column used in equality conditions first, range conditions last. Equality conditions narrow the search space most aggressively.
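The leftmost-prefix rule can be checked directly in a planner. A sketch with sqlite3 (illustrative schema; PostgreSQL applies the same rule):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, "
             "status TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (user_id, status, total) VALUES (?, ?, ?)",
    [(i % 50, "shipped" if i % 3 else "pending", i * 1.0) for i in range(5_000)])
conn.execute("CREATE INDEX idx_orders_user_status ON orders(user_id, status)")

def plan(sql):
    return " | ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))

leading  = plan("SELECT * FROM orders WHERE user_id = 42")
both     = plan("SELECT * FROM orders WHERE user_id = 42 AND status = 'shipped'")
trailing = plan("SELECT * FROM orders WHERE status = 'shipped'")
print(leading)   # leading column only: index is usable
print(both)      # both columns: index is usable
print(trailing)  # leading column missing: full scan
```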
Covering Indexes and Index-Only Scans
If all columns a query needs are present in the index, the database can return results without touching the main table at all - an index-only scan. This eliminates the most expensive part of a query: the random heap reads to fetch the actual rows.
-- Index on (user_id, status, created_at)
SELECT status, created_at FROM orders WHERE user_id = 42;
-- Index-only scan: all needed columns are in the index
Adding extra columns to an index purely for coverage trades index size for query speed. Measure before adding - wide covering indexes increase write overhead.
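sqlite3 labels index-only scans explicitly, which makes the effect easy to see in a sketch (illustrative schema; PostgreSQL reports this as "Index Only Scan" in EXPLAIN instead):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, "
             "status TEXT, created_at TEXT, note TEXT)")
conn.execute("CREATE INDEX idx_cover ON orders(user_id, status, created_at)")

# Every selected column lives in the index: no heap access needed.
covered = conn.execute(
    "EXPLAIN QUERY PLAN SELECT status, created_at FROM orders WHERE user_id = 42"
).fetchone()[3]

# 'note' is not in the index, so each match costs an extra heap read.
not_covered = conn.execute(
    "EXPLAIN QUERY PLAN SELECT note FROM orders WHERE user_id = 42"
).fetchone()[3]
print(covered)
print(not_covered)
```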
Partial Indexes
A partial index covers only rows matching a condition:
CREATE INDEX idx_orders_pending ON orders(created_at)
WHERE status = 'pending';
This index is far smaller than a full index on created_at, which means faster scans and less write overhead. It is ideal when queries almost always filter on the same predicate - an is_deleted = false filter, an is_active = true guard, or a specific status value that represents a small, frequently queried subset.
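SQLite supports the same partial-index syntax, so the behavior can be sketched there (names are illustrative). Note that the planner only uses the index when the query's predicate provably implies the index's predicate:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT, status TEXT)")
conn.execute("CREATE INDEX idx_orders_pending ON orders(created_at) "
             "WHERE status = 'pending'")

# Query predicate matches the index predicate: partial index is usable.
matching = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders "
    "WHERE status = 'pending' AND created_at > '2024-01-01'").fetchone()[3]

# Different status: the index cannot apply, so the planner falls back to a scan.
non_matching = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders "
    "WHERE status = 'shipped' AND created_at > '2024-01-01'").fetchone()[3]
print(matching)
print(non_matching)
```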
Cardinality and NULL
Cardinality is the number of distinct values in a column. An index on a boolean column has cardinality two - every value is either true or false. For a low-cardinality column, an index scan may be slower than a sequential scan: if 40% of rows match your predicate, it is faster to read the table once than to follow index pointers to 40% of the rows scattered across many pages.
NULL handling varies by database. PostgreSQL and MySQL B-tree indexes do store NULL entries, and PostgreSQL can serve WHERE column IS NULL from an index. Oracle, by contrast, omits rows whose indexed columns are all NULL, so IS NULL predicates there need a workaround such as a composite index that includes a non-null column. If IS NULL queries matter on a large, mostly-non-null column, a partial index with WHERE column IS NULL keeps the structure tiny regardless of engine.
Full-Text Search Indexes (GIN)
GIN (Generalized Inverted Index) indexes are designed for composite values - arrays, JSON, and full-text. For text search, GIN stores a mapping from each lexeme (stemmed word) to the rows containing it, enabling fast queries like:
SELECT * FROM articles WHERE to_tsvector('english', body)
@@ to_tsquery('english', 'database & index');
GIN indexes are slower to update than B-trees (they batch writes internally) but dramatically faster for text search than LIKE-based queries with a B-tree.
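The inverted-index idea can be sketched with SQLite's FTS5 extension, which is analogous in spirit to GIN's token-to-rows mapping (this assumes your sqlite3 build includes FTS5, which most standard Python builds do; note FTS5's default tokenizer does not stem, unlike to_tsvector):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 builds an inverted index: each token maps to the rows containing it.
conn.execute("CREATE VIRTUAL TABLE articles USING fts5(body)")
conn.executemany("INSERT INTO articles (body) VALUES (?)", [
    ("databases keep data safe",),
    ("an index makes a database query fast",),
    ("cooking with garlic and oil",),
])

# Both tokens must appear in the row; the lookup touches only the posting lists.
hits = conn.execute(
    "SELECT body FROM articles WHERE articles MATCH 'database AND index'"
).fetchall()
print(hits)
```

Only the second row matches: without stemming, the token "databases" in the first row does not match the query token "database".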
Examples
Reading EXPLAIN ANALYZE output:
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 99;
Output excerpt:
Index Scan using idx_orders_user_id on orders
(cost=0.43..8.45 rows=3 width=72)
(actual time=0.032..0.041 rows=3 loops=1)
cost=0.43..8.45 is the planner’s estimate (startup cost..total cost in arbitrary units). actual time shows wall-clock milliseconds. When rows estimate differs wildly from actual rows, run ANALYZE orders to refresh statistics.
When the planner ignores your index:
-- Table: 500 rows. Query returns 400 of them.
-- Planner chooses Seq Scan even with an index on status.
SELECT * FROM tiny_table WHERE status != 'archived';
The planner correctly calculates that fetching 80% of rows via index pointer chasing costs more than one sequential pass. You can force an index with SET enable_seqscan = off to confirm - but if the planner’s choice is correct, leave it alone.
Composite index design for a common pattern:
An e-commerce orders table is frequently queried by user and date range:
SELECT * FROM orders
WHERE user_id = $1
AND created_at > NOW() - INTERVAL '30 days'
ORDER BY created_at DESC;
Optimal index: (user_id, created_at). User equality narrows to one user’s rows, then created_at range scan reads only recent rows in sorted order - ORDER BY requires no additional sort step.
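That sort-avoidance claim is checkable in a sketch with sqlite3 (illustrative schema; sqlite flags an extra sort step as "USE TEMP B-TREE FOR ORDER BY", while PostgreSQL would show a separate Sort node):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, "
             "created_at TEXT)")
conn.execute("CREATE INDEX idx_user_created ON orders(user_id, created_at)")

# Equality on user_id, range + DESC order on created_at.
details = [r[3] for r in conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders "
    "WHERE user_id = 1 AND created_at > '2024-01-01' "
    "ORDER BY created_at DESC")]
print(details)
```

The plan contains one index search and no temp-b-tree sort step: the index already delivers rows in the requested order (scanned backwards for DESC).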
Foreign keys are not automatically indexed in PostgreSQL. Always add an index on foreign key columns, especially on the child side of a one-to-many relationship. Without it, every delete from the parent table triggers a full scan of the child table to check for referencing rows.
Read Next: Transactions: 2PC vs Saga · Sharding & Data Replication