Divisibility & Primes - The Atoms of the Number System
Helpful context:
Every integer beyond 1 is either prime or built from primes. That sounds like an observation, but it is actually the central theorem of number theory: every positive integer factors uniquely into primes, and that uniqueness is not obvious. It requires proof. The fact that $12 = 2^2 \cdot 3$ and no other factorization exists is something we need to establish, not assume.
The right way into this subject is through divisibility. When $a$ divides $b$, it means $b$ is “made of” $a$ - $b$ is a whole-number multiple of $a$. This relation creates a partial order on the positive integers, and the primes are exactly the atoms of that order: the elements that cover only 1. Everything else is built from them. The Euclidean algorithm, Bezout’s identity, and the Fundamental Theorem of Arithmetic all follow from understanding this structure carefully.
Why does this matter beyond pure mathematics? The answer is cryptography. The hardness of factoring large composite numbers - the computational asymmetry between multiplying two large primes (easy) and recovering those primes from their product (hard) - is what makes RSA secure. That hardness lives in the number theory we develop here.
Divisibility
Definition. For integers $a$ and $b$ with $b \neq 0$, we say $b$ divides $a$, written $b \mid a$, if there exists an integer $k$ such that $a = kb$. Equivalently, $a$ is a multiple of $b$.
When $b \mid a$, we say $b$ is a divisor (or factor) of $a$, and $a$ is a multiple of $b$. If $b$ does not divide $a$, we write $b \nmid a$.
Examples. $3 \mid 12$ because $12 = 4 \cdot 3$. $5 \mid 0$ because $0 = 0 \cdot 5$. $7 \nmid 10$ because $10/7$ is not an integer.
The divisibility relation on the positive integers is a partial order: it is reflexive ($a \mid a$), antisymmetric (if $a \mid b$ and $b \mid a$ with $a, b > 0$, then $a = b$), and transitive (if $a \mid b$ and $b \mid c$, then $a \mid c$). The minimum element is 1 (which divides everything) and there is no maximum. The primes are the elements that only exceed 1 in this order.
Basic properties. For integers $a, b, c$ with $c \neq 0$:
- If $c \mid a$ and $c \mid b$, then $c \mid (a + b)$ and $c \mid (a - b)$.
- If $c \mid a$, then $c \mid ka$ for any integer $k$.
- If $c \mid a$ and $c \mid b$, then $c \mid (xa + yb)$ for any integers $x, y$.
Property 3 is the most useful: any linear combination $xa + yb$ of multiples of $c$ is also a multiple of $c$. This will appear in the proof of Bezout’s identity.
The Division Algorithm
Theorem (Division Algorithm). For any integers $a$ and $b$ with $b > 0$, there exist unique integers $q$ and $r$ satisfying
$$a = qb + r, \quad 0 \leq r < b.$$
The integer $q$ is the quotient and $r$ is the remainder.
Proof (existence). Consider the set $S = \{a - kb : k \in \mathbb{Z}, a - kb \geq 0\}$. This set is nonempty: taking $k$ sufficiently negative makes $a - kb$ large and positive. By the well-ordering principle, $S$ has a minimum element; call it $r = a - qb$ for some $q$.
We claim $r < b$. Suppose for contradiction that $r \geq b$. Then $r - b = a - (q+1)b \geq 0$, so $r - b \in S$. But $r - b < r$, contradicting minimality. Therefore $0 \leq r < b$.
Proof (uniqueness). Suppose $a = qb + r = q’b + r'$ with $0 \leq r, r' < b$. Then $(q - q')b = r' - r$. The right side satisfies $|r' - r| < b$, so $|q - q'| < 1$, forcing $q = q'$ and hence $r = r'$. $\square$
The uniqueness of $q$ and $r$ is what makes modular arithmetic well-defined: every integer has a unique representative in $\{0, 1, \ldots, b-1\}$ modulo $b$.
Greatest Common Divisor
Definition. The greatest common divisor of integers $a$ and $b$ (not both zero), written $\gcd(a, b)$, is the largest positive integer that divides both $a$ and $b$.
The key insight: $\gcd(a, b)$ is not just the largest common divisor - it is the generator of all common divisors. Every common divisor of $a$ and $b$ divides $\gcd(a, b)$.
The Euclidean Algorithm
Theorem. $\gcd(a, b) = \gcd(b, a \bmod b)$.
Proof. Write $a = qb + r$ where $r = a \bmod b$. Any common divisor $d$ of $a$ and $b$ satisfies $d \mid a$ and $d \mid b$, so $d \mid (a - qb) = r$. Hence $d$ divides both $b$ and $r$. Conversely, any common divisor of $b$ and $r$ divides $a = qb + r$. So the set of common divisors of $(a, b)$ equals the set of common divisors of $(b, r)$, and in particular their greatest elements agree. $\square$
This gives an efficient algorithm. Apply the theorem repeatedly:
$$\gcd(a, b) = \gcd(b, r_1) = \gcd(r_1, r_2) = \cdots = \gcd(r_{k-1}, 0) = r_{k-1}$$
where each $r_i = r_{i-2} \bmod r_{i-1}$ and the sequence $b > r_1 > r_2 > \cdots \geq 0$ terminates because the remainders are strictly decreasing nonneg integers.
Example. $\gcd(252, 105)$:
$$252 = 2 \cdot 105 + 42$$ $$105 = 2 \cdot 42 + 21$$ $$42 = 2 \cdot 21 + 0$$
So $\gcd(252, 105) = 21$.
Complexity. The Euclidean algorithm terminates in $O(\log(\min(a,b)))$ steps. The key fact is that after two steps, the remainder at least halves: $r_{i+2} < r_i / 2$. Since the inputs start at most $\min(a,b)$, the number of steps is $O(\log(\min(a,b)))$.
Bezout’s Identity
Theorem (Bezout). For any integers $a$ and $b$ (not both zero), there exist integers $x$ and $y$ such that
$$ax + by = \gcd(a, b).$$
In particular, $\gcd(a, b)$ is the smallest positive integer representable as a linear combination of $a$ and $b$.
Proof. Let $d$ be the smallest positive element of the set $L = \{ax + by : x, y \in \mathbb{Z}\}$. We claim $d = \gcd(a, b)$.
First, $d \mid a$: write $a = qd + r$ with $0 \leq r < d$. Then $r = a - qd = a - q(ax_0 + by_0) = a(1 - qx_0) + b(-qy_0) \in L$. Since $r < d$ and $d$ is the minimum positive element of $L$, we need $r = 0$, so $d \mid a$. By symmetry $d \mid b$.
Second, any common divisor of $a$ and $b$ divides $ax_0 + by_0 = d$. So $d$ is the greatest common divisor. $\square$
Finding $x$ and $y$: the extended Euclidean algorithm. Run the Euclidean algorithm forward to find $\gcd(a,b)$, then back-substitute to express each remainder as a linear combination of $a$ and $b$.
Example. Find $x, y$ with $252x + 105y = 21$.
From the forward pass: $42 = 252 - 2 \cdot 105$ and $21 = 105 - 2 \cdot 42$.
Back-substituting: $21 = 105 - 2 \cdot 42 = 105 - 2(252 - 2 \cdot 105) = 5 \cdot 105 - 2 \cdot 252$.
So $x = -2$, $y = 5$.
Why this matters. Bezout’s identity is the engine behind modular inverses. If $\gcd(a, n) = 1$, then there exist $x, y$ with $ax + ny = 1$, so $ax \equiv 1 \pmod{n}$: the integer $x$ is the multiplicative inverse of $a$ modulo $n$. The extended Euclidean algorithm computes this inverse directly.
Prime Numbers
Definition. An integer $p \geq 2$ is prime if its only positive divisors are $1$ and $p$. An integer $n \geq 2$ that is not prime is composite.
The first few primes: $2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, \ldots$
Euclid’s Lemma. If $p$ is prime and $p \mid ab$, then $p \mid a$ or $p \mid b$.
Proof. Suppose $p \mid ab$ and $p \nmid a$. Since $p$ is prime, its only divisors are $1$ and $p$, and since $p \nmid a$, we have $\gcd(p, a) = 1$. By Bezout’s identity, there exist integers $x, y$ with $px + ay = 1$. Multiply both sides by $b$:
$$pxb + aby = b.$$
Now $p \mid pxb$ (trivially) and $p \mid aby$ (since $p \mid ab$). Therefore $p \mid b$. $\square$
Euclid’s lemma extends by induction: if $p \mid a_1 a_2 \cdots a_k$, then $p \mid a_i$ for some $i$.
The Fundamental Theorem of Arithmetic
Theorem (FTA). Every integer $n \geq 2$ can be written as a product of prime numbers, and this representation is unique up to the order of factors.
Proof of existence (by strong induction). Base case: $n = 2$ is prime. Inductive step: assume every integer from $2$ to $n-1$ has a prime factorization. If $n$ is prime, we are done. Otherwise, $n = ab$ with $1 < a, b < n$. By the inductive hypothesis, both $a$ and $b$ have prime factorizations, and concatenating them gives a prime factorization of $n$.
Proof of uniqueness. Suppose $n = p_1 p_2 \cdots p_r = q_1 q_2 \cdots q_s$ are two factorizations into primes. Since $p_1 \mid q_1 q_2 \cdots q_s$, Euclid’s lemma implies $p_1 \mid q_j$ for some $j$. Reorder so $j = 1$. Since $q_1$ is prime, $p_1 = q_1$. Cancel to get $p_2 \cdots p_r = q_2 \cdots q_s$. Repeat. The two factorizations are identical up to order. $\square$
Euclid’s lemma is precisely what makes uniqueness work. Without the prime condition - for example, in the ring $\mathbb{Z}[\sqrt{-5}]$ where $6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$ - unique factorization can fail. Unique factorization is a gift of the integers, not a mathematical inevitability.
Infinitely Many Primes
Theorem (Euclid, c. 300 BCE). There are infinitely many prime numbers.
Proof. Suppose for contradiction that there are only finitely many primes $p_1, p_2, \ldots, p_k$. Consider
$$N = p_1 p_2 \cdots p_k + 1.$$
By the FTA, $N$ has at least one prime divisor $p$. But $p$ cannot be any of $p_1, \ldots, p_k$: if $p = p_i$, then $p \mid p_1 \cdots p_k$ and $p \mid N$, so $p \mid (N - p_1 \cdots p_k) = 1$, which is impossible since $p \geq 2$. So $p$ is a prime not in our list, contradicting the assumption that $p_1, \ldots, p_k$ are all primes. $\square$
Note: $N$ itself need not be prime. For example, $2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 + 1 = 30031 = 59 \cdot 509$. The argument only says $N$ has a prime factor outside the list.
Distribution of Primes
Primes become less frequent as numbers grow. Among the first 100 integers, 25 are prime (25%). Among integers near $10^6$, roughly 1 in 14 is prime. Near $10^9$, roughly 1 in 20. This thinning is governed by a precise theorem.
Prime Number Theorem (qualitative). The number of primes up to $x$, denoted $\pi(x)$, satisfies
$$\pi(x) \sim \frac{x}{\ln x} \quad \text{as } x \to \infty.$$
The symbol $\sim$ means the ratio $\pi(x) / (x / \ln x)$ tends to 1. Equivalently, the $n$-th prime $p_n$ satisfies $p_n \sim n \ln n$.
The consequence: a random integer near $x$ is prime with probability approximately $1/\ln x$. For $x = 2^{1024}$, this probability is approximately $1/(1024 \cdot \ln 2) \approx 1/710$. Generating a 1024-bit prime requires testing roughly 700 random odd numbers on average - very feasible.
The Prime Number Theorem is deep; its proof requires complex analysis (see the post on the prime number theorem). But the qualitative statement is what matters for applications.
Why Primes Matter for Cryptography
The security of RSA rests on a single asymmetry: it is easy to multiply two large primes $p$ and $q$ to form $n = pq$, but computationally hard to recover $p$ and $q$ from $n$ alone (for large enough $n$). This is the integer factorization problem. No polynomial-time algorithm for it is known.
Why is factoring hard? The best known classical algorithm, the General Number Field Sieve, factors an $n$-bit number in roughly
$$\exp\left(O\left(n^{1/3} (\log n)^{2/3}\right)\right)$$
operations - sub-exponential but far from polynomial. For $n = 2048$ bits, this is astronomically large.
The connection to primes via Euclid’s lemma runs through all of the following: modular inverses (Bezout), RSA correctness (Fermat/Euler), Diffie-Hellman (discrete logarithm in $\mathbb{Z}_p^*$), and elliptic curve cryptography. Number theory is not background material for cryptography - it is cryptography.
Summary
| Concept | Statement | Why it matters |
|---|---|---|
| Divisibility $b \mid a$ | $a = kb$ for some integer $k$ | Partial order on $\mathbb{Z}^+$; basis for all that follows |
| Division algorithm | $a = qb + r$, unique $q, r$ with $0 \leq r < b$ | Makes modular arithmetic well-defined |
| Euclidean algorithm | $\gcd(a,b) = \gcd(b, a \bmod b)$, $O(\log \min)$ steps | Efficient GCD; basis for extended algorithm |
| Bezout’s identity | $ax + by = \gcd(a,b)$ for some integers $x, y$ | Computes modular inverses |
| Euclid’s lemma | $p \mid ab \Rightarrow p \mid a$ or $p \mid b$ (prime $p$) | Key lemma for unique factorization |
| Fundamental Theorem | Unique prime factorization for all $n \geq 2$ | Primes are the atoms of $\mathbb{Z}^+$ |
| Infinitely many primes | $N = p_1 \cdots p_k + 1$ gives a new prime factor | The supply of primes never runs out |
| Prime Number Theorem | $\pi(x) \sim x / \ln x$ | Primes near $x$ have density $\approx 1/\ln x$ |
Read next: