Integration
Integration is the process of accumulating a quantity over a continuous range. It began as a geometric question - what is the area under a curve? - and matured into a rigorous theory that underpins probability, Fourier analysis, differential equations, and numerical computation.
The Area Problem and Riemann Sums
Let $f: [a,b] \to \mathbb{R}$ be bounded. A partition $P$ of $[a,b]$ is a finite set $a = x_0 < x_1 < \cdots < x_n = b$. On each subinterval $[x_{i-1}, x_i]$ of width $\Delta x_i = x_i - x_{i-1}$, pick a sample point $x_i^\ast$. The Riemann sum is
$$S(f, P) = \sum_{i=1}^{n} f(x_i^\ast)\,\Delta x_i.$$
Common choices: $x_i^\ast = x_{i-1}$ (left), $x_i^\ast = x_i$ (right), $x_i^\ast = (x_{i-1}+x_i)/2$ (midpoint).
[Figure: rectangles over the subintervals of $[a,b]$ with division points $a, x_1, x_2, x_3, x_4, b$. Each rectangle has height $f(x_i^\ast)$ and width $\Delta x_i$; the Riemann sum approximates the total shaded area.]
As the mesh $|P| = \max_i \Delta x_i \to 0$, these approximations should converge to the true area.
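The convergence described above can be observed numerically. The following is a minimal sketch, assuming a uniform partition and the test function $f(x) = x^2$ on $[0,1]$ (true area $1/3$); the function name `riemann_sum` and its `rule` parameter are illustrative choices, not part of the text.

```python
def riemann_sum(f, a, b, n, rule="midpoint"):
    """Riemann sum of f on [a, b] over n equal subintervals.

    rule selects the sample point in each subinterval:
    "left", "right", or "midpoint".
    """
    dx = (b - a) / n
    # Fraction of the way through each subinterval at which f is sampled.
    offset = {"left": 0.0, "right": 1.0, "midpoint": 0.5}[rule]
    return sum(f(a + (i + offset) * dx) for i in range(n)) * dx


def square(x):
    return x * x


# As n grows (mesh -> 0), all three sums approach 1/3.
for n in (10, 100, 1000):
    sums = [riemann_sum(square, 0.0, 1.0, n, r) for r in ("left", "right", "midpoint")]
    print(n, sums)
```

For an increasing function such as this one, the left sum underestimates and the right sum overestimates, which previews the lower and upper sums of the next section.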
The Riemann Integral
Definition. For a partition $P$ define the upper sum $U(f,P) = \sum_{i=1}^n M_i\,\Delta x_i$ and the lower sum $L(f,P) = \sum_{i=1}^n m_i\,\Delta x_i$, where $M_i = \sup_{[x_{i-1},x_i]} f$ and $m_i = \inf_{[x_{i-1},x_i]} f$.
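Observe that $L(f,P) \leq S(f,P) \leq U(f,P)$ for any choice of sample points, and that refining a partition can only raise (or preserve) the lower sum and lower (or preserve) the upper sum. Comparing any two partitions $P$ and $Q$ through their common refinement $P \cup Q$ gives $L(f,P) \leq L(f,P \cup Q) \leq U(f,P \cup Q) \leq U(f,Q)$. Hence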
$$\sup_P L(f,P) \leq \inf_P U(f,P).$$
Definition (Riemann Integrability). $f$ is Riemann integrable on $[a,b]$ if $\sup_P L(f,P) = \inf_P U(f,P)$. The common value is the Riemann integral $\int_a^b f(x)\,dx$.
Equivalently (Riemann’s criterion): $f$ is integrable iff for every $\varepsilon > 0$ there exists a partition $P$ with $U(f,P) - L(f,P) < \varepsilon$.
Theorem. Every continuous function on $[a,b]$ is Riemann integrable.
Proof sketch. A continuous function on a closed bounded interval is uniformly continuous. Given $\varepsilon > 0$, choose $\delta$ so that $|x-y| < \delta \Rightarrow |f(x)-f(y)| < \varepsilon/(b-a)$. Take any partition with mesh $< \delta$. Then on each subinterval $M_i - m_i < \varepsilon/(b-a)$, so $U - L < \varepsilon$. $\blacksquare$
Note: monotone functions are also integrable (the oscillation argument is even simpler). A bounded function with finitely many discontinuities is integrable. A continuous nowhere-differentiable function is still integrable. Integrability is a weaker condition than differentiability.
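The oscillation argument for monotone functions can be checked directly. The sketch below assumes an increasing $f$ and a uniform partition, so that $m_i = f(x_{i-1})$ and $M_i = f(x_i)$; the helper name `darboux_sums_increasing` and the test function $x^3$ on $[0,2]$ are illustrative assumptions. In that setting the gap telescopes to $U - L = (f(b) - f(a))\,\Delta x$, which can be made smaller than any $\varepsilon$ by increasing $n$.

```python
def darboux_sums_increasing(f, a, b, n):
    """Lower and upper sums of an INCREASING f on a uniform partition.

    For increasing f the infimum on each subinterval is attained at its
    left endpoint and the supremum at its right endpoint.
    """
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]
    lower = sum(f(xs[i]) for i in range(n)) * dx
    upper = sum(f(xs[i + 1]) for i in range(n)) * dx
    return lower, upper


def cube(x):
    return x ** 3


# The gap U - L shrinks like (f(b) - f(a)) * (b - a) / n.
for n in (10, 100, 1000):
    lower, upper = darboux_sums_increasing(cube, 0.0, 2.0, n)
    print(n, lower, upper, upper - lower)
```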
Properties of the Integral
Linearity. $\int_a^b (\alpha f + \beta g) = \alpha\int_a^b f + \beta\int_a^b g$.
Additivity. $\int_a^b f = \int_a^c f + \int_c^b f$ for any $c \in [a,b]$.
Comparison. If $f \leq g$ on $[a,b]$, then $\int_a^b f \leq \int_a^b g$. In particular $\left|\int_a^b f\right| \leq \int_a^b |f|$.
Convention. $\int_b^a f = -\int_a^b f$ and $\int_a^a f = 0$.
The Fundamental Theorem of Calculus
This is the central result connecting differentiation and integration. The two operations are, in a precise sense, inverses of each other.
Part 1: Differentiation of an Integral
Theorem (FTC Part 1). Let $f$ be continuous on $[a,b]$, and define $F(x) = \int_a^x f(t)\,dt$. Then $F$ is differentiable on $(a,b)$ and $F'(x) = f(x)$.
Proof. Fix $x \in (a,b)$ and compute the difference quotient for small $h \neq 0$:
$$\frac{F(x+h) - F(x)}{h} = \frac{1}{h}\int_x^{x+h} f(t)\,dt.$$
Since $f$ is continuous at $x$, for any $\varepsilon > 0$ there exists $\delta > 0$ with $|f(t) - f(x)| < \varepsilon$ whenever $|t - x| < \delta$. For $|h| < \delta$:
$$\left|\frac{F(x+h)-F(x)}{h} - f(x)\right| = \left|\frac{1}{h}\int_x^{x+h}(f(t)-f(x))\,dt\right| \leq \frac{1}{|h|}\cdot\varepsilon\cdot|h| = \varepsilon.$$
Taking $h \to 0$ gives $F'(x) = f(x)$. $\blacksquare$
Subtlety: Part 1 requires $f$ to be continuous. Without continuity, $F$ may still be differentiable a.e. (Lebesgue theory), but pointwise equality $F'(x) = f(x)$ can fail.
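The difference quotient in the proof of Part 1 can be evaluated numerically. This is a rough sketch, assuming $f = \cos$ and the point $x = 1.2$ (both chosen only for illustration) and using a midpoint Riemann sum as a stand-in for the exact integral; `midpoint_integral` and `difference_quotient` are hypothetical helper names.

```python
import math


def midpoint_integral(f, a, b, n=1000):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def difference_quotient(f, x, h):
    """(F(x+h) - F(x)) / h, computed as (1/h) times the integral of f over [x, x+h]."""
    return midpoint_integral(f, x, x + h) / h


x = 1.2
# The gap between the difference quotient and f(x) shrinks as h -> 0.
for h in (1e-1, 1e-2, 1e-3):
    print(h, abs(difference_quotient(math.cos, x, h) - math.cos(x)))
```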
Part 2: Evaluation via Antiderivatives
Theorem (FTC Part 2). Let $f$ be Riemann integrable on $[a,b]$, and suppose $F$ is an antiderivative of $f$ (i.e., $F' = f$) on $[a,b]$. Then
$$\int_a^b f(x)\,dx = F(b) - F(a).$$
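Proof (for continuous $f$). Let $G(x) = \int_a^x f(t)\,dt$. By Part 1, $G' = f = F'$ on $(a,b)$. Hence $(F - G)' = 0$ on $(a,b)$, and since $F - G$ is continuous on $[a,b]$ it is constant there: $F(x) - G(x) = F(a) - G(a) = F(a) - 0 = F(a)$. Evaluating at $x = b$ gives $G(b) = F(b) - F(a)$. $\blacksquare$ When $f$ is merely Riemann integrable, Part 1 is not available; instead one applies the Mean Value Theorem to $F$ on each subinterval of a partition, which writes $F(b) - F(a)$ as a Riemann sum of $f$.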
This theorem converts computing integrals from a limiting process (Riemann sums) into an algebraic operation (evaluate an antiderivative). The challenge shifts to finding $F$.
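The two routes to the same number can be compared side by side. A minimal sketch, assuming the integrand $f(x) = 1/(1+x^2)$ on $[0,1]$ with antiderivative $\arctan$ (taken from the table of standard integrals); the helper `midpoint_integral` is an illustrative name.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def f(x):
    return 1.0 / (1.0 + x * x)


F = math.atan  # an antiderivative of f

a, b = 0.0, 1.0
limit_value = midpoint_integral(f, a, b)   # limiting process: 10,000 evaluations of f
algebraic_value = F(b) - F(a)              # FTC Part 2: two evaluations of F
print(limit_value, algebraic_value)
```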
Integration Techniques
Substitution (Change of Variables)
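If $u = g(x)$ is continuously differentiable on $[a,b]$ and $f$ is continuous on the range of $g$:
$$\int_a^b f(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du.$$
This is the chain rule read backwards: if $F' = f$, then $(F \circ g)'(x) = f(g(x))\,g'(x)$, so both sides equal $F(g(b)) - F(g(a))$. In this form $g$ need not be injective. Monotonicity ($g'$ of constant sign on $[a,b]$) is needed only when the substitution is run in reverse, writing $x = g^{-1}(u)$; without it one must break the interval into pieces on which $g$ is monotone.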
Example. $\int_0^1 2x\,e^{x^2}\,dx$. Set $u = x^2$, $du = 2x\,dx$; limits $0 \to 0$, $1 \to 1$. Result: $\int_0^1 e^u\,du = e - 1$.
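Both sides of the substitution formula for this example can be approximated independently and compared with $e - 1$. The sketch below uses a midpoint Riemann sum as the integrator; `midpoint_integral`, `g`, and `g_prime` are names introduced here for illustration.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def g(x):
    return x * x        # the substitution u = x^2


def g_prime(x):
    return 2.0 * x      # du/dx


f = math.exp
a, b = 0.0, 1.0

lhs = midpoint_integral(lambda x: f(g(x)) * g_prime(x), a, b)  # integral in x
rhs = midpoint_integral(f, g(a), g(b))                         # integral in u
print(lhs, rhs, math.e - 1)
```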
Integration by Parts
For differentiable $u$ and $v$, the product rule $(uv)' = u'v + uv'$ integrates to
$$\int_a^b u(x)\,v'(x)\,dx = \bigl[u(x)v(x)\bigr]_a^b - \int_a^b u'(x)\,v(x)\,dx.$$
Written with differentials: $\int u\,dv = uv - \int v\,du$.
The choice of $u$ and $dv$ is an art. A useful heuristic (LIATE): prefer Logarithms, Inverse trig, Algebraic, Trig, Exponential as $u$ in that priority order.
Example. $\int x e^x dx$: let $u = x$, $dv = e^x dx$. Then $du = dx$, $v = e^x$. Result: $xe^x - \int e^x dx = xe^x - e^x + C$.
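As a worked check of the definite form (the interval $[0,1]$ is chosen here purely for illustration), the same choice $u = x$, $v' = e^x$ gives
$$\int_0^1 x e^x\,dx = \bigl[x e^x\bigr]_0^1 - \int_0^1 e^x\,dx = e - (e - 1) = 1.$$
Differentiating the antiderivative confirms the result: $(xe^x - e^x)' = e^x + xe^x - e^x = xe^x$.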
Partial Fractions
To integrate a rational function $P(x)/Q(x)$ where $\deg P < \deg Q$, factor $Q$ over $\mathbb{R}$ and decompose:
$$\frac{P(x)}{(x-r_1)^{m_1}\cdots(x^2+p_1x+q_1)^{k_1}\cdots} = \sum \frac{A_{ij}}{(x-r_i)^j} + \sum \frac{B_{ij}x + C_{ij}}{(x^2+p_ix+q_i)^j}.$$
Each term integrates to a combination of logarithms, arctangents, and (for repeated factors, $j \geq 2$) rational functions. The partial fraction decomposition exists and is unique; this follows from the Chinese Remainder Theorem for polynomials.
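A small worked instance (the integrand is chosen here for illustration): with $Q(x) = x^2 - 1 = (x-1)(x+1)$,
$$\frac{1}{x^2-1} = \frac{A}{x-1} + \frac{B}{x+1}, \qquad 1 = A(x+1) + B(x-1).$$
Setting $x = 1$ gives $A = \tfrac12$, and $x = -1$ gives $B = -\tfrac12$. Hence
$$\int \frac{dx}{x^2-1} = \frac12\ln|x-1| - \frac12\ln|x+1| + C = \frac12\ln\left|\frac{x-1}{x+1}\right| + C.$$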
Trigonometric Substitution
For integrands involving $\sqrt{a^2 - x^2}$, $\sqrt{a^2 + x^2}$, or $\sqrt{x^2 - a^2}$, use:
| Form | Substitution | Identity used |
|---|---|---|
| $\sqrt{a^2-x^2}$ | $x = a\sin\theta$ | $1-\sin^2\theta = \cos^2\theta$ |
| $\sqrt{a^2+x^2}$ | $x = a\tan\theta$ | $1+\tan^2\theta = \sec^2\theta$ |
| $\sqrt{x^2-a^2}$ | $x = a\sec\theta$ | $\sec^2\theta - 1 = \tan^2\theta$ |
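As an illustration of the first row (an example chosen here, with $a > 0$), the area of a quarter disk of radius $a$: put $x = a\sin\theta$, $dx = a\cos\theta\,d\theta$, with $\theta \in [0, \pi/2]$ so that $\cos\theta \geq 0$ and $\sqrt{a^2 - x^2} = a\cos\theta$. Then
$$\int_0^a \sqrt{a^2-x^2}\,dx = \int_0^{\pi/2} a\cos\theta \cdot a\cos\theta\,d\theta = a^2\int_0^{\pi/2}\frac{1+\cos 2\theta}{2}\,d\theta = \frac{\pi a^2}{4}.$$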
Standard Integrals and Average Value
Key antiderivatives: $\int x^n\,dx = x^{n+1}/(n+1)+C$ (for $n \neq -1$), $\int x^{-1}\,dx = \ln|x|+C$, $\int e^x\,dx = e^x + C$, $\int \sin x\,dx = -\cos x + C$, $\int \cos x\,dx = \sin x + C$, $\int \sec^2 x\,dx = \tan x + C$, $\int \frac{dx}{1+x^2} = \arctan x + C$.
The average value of $f$ on $[a,b]$ is
$$\bar{f} = \frac{1}{b-a}\int_a^b f(x)\,dx.$$
By the Mean Value Theorem for Integrals: if $f$ is continuous on $[a,b]$, there exists $c \in (a,b)$ with $f(c) = \bar{f}$.
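A quick numerical illustration, assuming $f = \sin$ on $[0, \pi]$ (an example chosen here), where the exact average is $2/\pi$. The sketch approximates the average with a midpoint sum and then exhibits one point $c$ at which $f$ attains it; `midpoint_integral` is an illustrative helper name.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


a, b = 0.0, math.pi
average = midpoint_integral(math.sin, a, b) / (b - a)  # approximates 2 / pi

# sin is increasing on [0, pi/2], so asin returns a point c in that range
# with sin(c) equal to the average; pi - c is a second such point.
c = math.asin(average)
print(average, 2 / math.pi, c)
```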
Geometric Applications
Area between curves. If $f(x) \geq g(x)$ on $[a,b]$:
$$A = \int_a^b (f(x) - g(x))\,dx.$$
When the curves cross, split the interval at intersection points and sum absolute values.
Volumes of revolution. Revolving $y = f(x)$ around the $x$-axis on $[a,b]$:
- Disk method: $V = \pi\int_a^b [f(x)]^2\,dx$. Each cross-section is a disk of radius $f(x)$.
- Washer method (region between $f$ and $g$, $f \geq g \geq 0$): $V = \pi\int_a^b ([f(x)]^2 - [g(x)]^2)\,dx$.
- Shell method (region under $y = f(x) \geq 0$ on $[a,b]$ with $0 \leq a$, revolved around the $y$-axis): $V = 2\pi\int_a^b x\,f(x)\,dx$.
Each formula follows from integrating the volume of infinitesimal slices or shells - i.e., from the definition of the Riemann integral applied to a volume element.
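The disk and shell formulas can be sanity-checked against known volumes. This is a minimal sketch with two examples chosen here: a sphere of radius $r$ obtained by revolving $y = \sqrt{r^2 - x^2}$ around the $x$-axis (volume $\tfrac43\pi r^3$), and the region under $y = x^2$ on $[0,1]$ revolved around the $y$-axis (volume $\pi/2$). The helper names are illustrative.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def disk_volume(f, a, b):
    """Volume from revolving y = f(x) around the x-axis (disk method)."""
    return math.pi * midpoint_integral(lambda x: f(x) ** 2, a, b)


def shell_volume(f, a, b):
    """Volume from revolving the region under y = f(x) around the y-axis (shell method)."""
    return 2.0 * math.pi * midpoint_integral(lambda x: x * f(x), a, b)


r = 1.5
sphere = disk_volume(lambda x: math.sqrt(r * r - x * x), -r, r)
print(sphere, 4.0 / 3.0 * math.pi * r ** 3)

solid = shell_volume(lambda x: x * x, 0.0, 1.0)
print(solid, math.pi / 2)
```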
Numerical Integration
When an antiderivative is unavailable in closed form, numerical methods approximate $\int_a^b f(x)\,dx$:
Trapezoidal rule. Approximate $f$ on each subinterval by a linear function:
$$T_n = \frac{b-a}{n}\left(\frac{f(a)+f(b)}{2} + \sum_{i=1}^{n-1} f(x_i)\right).$$
Error: $O(h^2)$, where $h = (b-a)/n$ and $x_i = a + ih$, assuming $f''$ is bounded.
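A direct transcription of $T_n$ into code, offered as a sketch rather than a production routine. The test integral $\int_0^\pi \sin x\,dx = 2$ is an example chosen here; doubling $n$ should cut the error by roughly a factor of four, consistent with $O(h^2)$.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule T_n on n equal subintervals of [a, b]."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))  # f(x_1) + ... + f(x_{n-1})
    return h * ((f(a) + f(b)) / 2.0 + interior)


exact = 2.0  # integral of sin over [0, pi]
for n in (10, 20, 40):
    print(n, abs(trapezoid(math.sin, 0.0, math.pi, n) - exact))
```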
Simpson’s rule. Approximate $f$ on pairs of subintervals by quadratics:
$$S_n = \frac{h}{3}\left(f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + f(x_n)\right),\quad n \text{ even}.$$
Error: $O(h^4)$, assuming $f^{(4)}$ is bounded. Since the error depends only on $f^{(4)}$, Simpson’s rule is exact for polynomials of degree $\leq 3$.
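The weight pattern $1, 4, 2, 4, \ldots, 4, 1$ translates into a short loop. The sketch below assumes equally spaced nodes $x_i = a + ih$; the function name `simpson` and the two test integrands are choices made here for illustration.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson's rule S_n on n equal subintervals of [a, b]; n must be even."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd-indexed interior nodes get weight 4, even-indexed ones weight 2.
        total += (4.0 if i % 2 == 1 else 2.0) * f(a + i * h)
    return total * h / 3.0


# Exactness on a cubic: the integral of x^3 over [0, 2] is 4, even with n = 2.
print(simpson(lambda x: x ** 3, 0.0, 2.0, 2))

# A smooth non-polynomial integrand: the integral of sin over [0, pi] is 2.
print(abs(simpson(math.sin, 0.0, math.pi, 10) - 2.0))
```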
Monte Carlo integration. For high-dimensional integrals $\int_\Omega f(\mathbf{x})\,d\mathbf{x}$, sample $N$ points $\mathbf{x}_1, \ldots, \mathbf{x}_N$ uniformly from $\Omega$ and estimate
$$\int_\Omega f \approx \text{Vol}(\Omega) \cdot \frac{1}{N}\sum_{i=1}^N f(\mathbf{x}_i).$$
Error: $O(N^{-1/2})$ regardless of dimension. For $d$-dimensional integrals, trapezoidal rule needs $N \sim h^{-d}$ points to achieve error $O(h^2)$ - exponential in $d$. Monte Carlo’s dimension-independent convergence makes it indispensable for integrals in dozens or hundreds of dimensions (as in path-integral methods and Bayesian inference).
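A minimal Monte Carlo sketch, assuming $\Omega = [0,1]^d$ (so $\text{Vol}(\Omega) = 1$) and the test integrand $f(\mathbf{x}) = \sum_i x_i^2$, whose exact integral is $d/3$. The function name, the fixed seed, and the sample count are illustrative choices; the estimate is random, so only approximate agreement should be expected.

```python
import random


def monte_carlo_unit_cube(f, d, n_samples, seed=0):
    """Estimate the integral of f over the unit cube [0, 1]^d by uniform sampling."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    total = 0.0
    for _ in range(n_samples):
        x = [rng.random() for _ in range(d)]  # one uniform point in [0, 1]^d
        total += f(x)
    # Vol([0, 1]^d) = 1, so the estimate is just the sample mean.
    return total / n_samples


def sum_of_squares(x):
    return sum(t * t for t in x)


d = 10
estimate = monte_carlo_unit_cube(sum_of_squares, d, 20_000)
print(estimate, d / 3)
```

A tensor-product trapezoidal grid with only 10 nodes per axis would already need $10^{10}$ evaluations in this dimension, against 20,000 samples here.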