Linear Programming & Simplex - Optimizing Along the Edges of a Polytope // Megha Bose

Helpful context:

Linear programming is perhaps the most practically important class of optimization problems in existence. Every day, airlines assign crews to flights, factories schedule production, financial institutions manage portfolios, and logistics companies route shipments - all by solving linear programs. The problems are large (millions of variables), the time pressure is real, and the solutions have to be optimal. What makes this possible is the simplex algorithm, a method that has no business working as well as it does in practice, yet routinely solves problems with hundreds of thousands of variables in seconds.

The mathematical theory behind LP is genuinely beautiful. At its heart is a remarkable coincidence: the feasible set of a linear program is a convex polyhedron, and the optimal solution - when it exists - always lives at a vertex. Optimizing over an infinite set reduces to searching over a finite combinatorial structure. The simplex algorithm exploits this by walking along edges of the polyhedron from vertex to vertex, improving the objective at each step. The theory tells us exactly when this walk terminates, why it terminates at the right place, and what happens when the problem has no solution at all.

This post develops LP theory from scratch: the geometry of feasible sets and extreme points, the algebra of bases and reduced costs, the mechanics of the simplex pivot, and the elegant duality theory that connects every minimization problem to a paired maximization problem. The proofs are included because the geometry is inseparable from the algebra - you cannot truly understand why the simplex algorithm works without seeing both.

Standard Form

A linear program in standard form is:

$$\min_{x \in \mathbb{R}^n} ; c^\top x \quad \text{subject to} \quad Ax = b, ; x \geq 0$$

where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, $c \in \mathbb{R}^n$. We assume throughout that $\text{rank}(A) = m$ - if $A$ has redundant rows they can be removed without changing the feasible set.

Standard form might look restrictive. In practice, any LP can be converted to it.

Inequality constraints $\leq$. Replace $a^\top x \leq b_i$ with $a^\top x + s_i = b_i$, $s_i \geq 0$. The variable $s_i$ is a slack variable that absorbs the difference.

Maximization. Replace $\max c^\top x$ with $\min (-c)^\top x$. The optimal point is identical; only the objective value flips sign.

Free variables. If $x_j$ is unrestricted in sign, write $x_j = x_j^+ - x_j^-$ with $x_j^+, x_j^- \geq 0$. Replace every occurrence of $x_j$ with this difference. Both $x_j^+$ and $x_j^-$ are now nonneg variables, so they fit standard form.

After these substitutions, every LP becomes a problem of minimizing a linear function over the intersection of an affine subspace ($Ax = b$) and the nonneg orthant ($x \geq 0$). The feasible set is:

$$P = \{x \in \mathbb{R}^n \mid Ax = b,; x \geq 0\}$$

This is a polyhedron - the intersection of finitely many halfspaces. The equality constraints $Ax = b$ are $m$ hyperplane intersections; the nonnegativity constraints $x_j \geq 0$ are $n$ halfspace constraints. Geometrically, $P$ is a convex set, and when it is also bounded it is a polytope.

Basic Feasible Solutions

The key structural insight is that a linear objective over a polyhedron, if it has a finite optimum, achieves it at a vertex of $P$. The algebraic definition of a vertex is a basic feasible solution.

Basis and basic solution. A basis $B$ is a set of $m$ indices from $\{1, \ldots, n\}$ such that the corresponding $m$ columns of $A$ are linearly independent. Partition:

$$A = [B \mid N], \quad c^\top = [c_B^\top \mid c_N^\top], \quad x^\top = [x_B^\top \mid x_N^\top]$$

where $B$ is the $m \times m$ invertible basis matrix and $N$ is the remaining $m \times (n-m)$ nonbasic matrix. The constraint $Ax = b$ becomes:

$$B x_B + N x_N = b$$

Setting $x_N = 0$ (all nonbasic variables to zero) uniquely determines:

$$x_B = B^{-1} b$$

This is the basic solution associated with basis $B$.

Basic feasible solution (BFS). A basic solution is a BFS if $x_B = B^{-1}b \geq 0$ - every basic variable is nonneg. Denote $\bar{b} = B^{-1}b$.

Degeneracy. A BFS is degenerate if some component of $\bar{b}$ is exactly zero. Geometrically, this means more than $n - m$ variables are zero, so the vertex lies at the intersection of more than the minimum number of hyperplanes. Degeneracy causes trouble for the simplex algorithm (discussed below).

Finding an initial BFS. If $A$ contains an $m \times m$ identity submatrix - e.g., from slack variables added to convert inequality constraints - that identity gives an immediate initial basis. Otherwise, the two-phase method handles the general case.

Extreme Point Characterization

The connection between BFSs and vertices is exact. Recall that a point $x$ in a convex set $P$ is an extreme point (vertex) if it cannot be written as a strict convex combination of two other points in $P$: there are no $y, z \in P$ with $y \neq z$ and $x = \lambda y + (1-\lambda)z$ for $\lambda \in (0,1)$.

Theorem. $x$ is a BFS of $P = \{x \mid Ax = b, x \geq 0\}$ if and only if $x$ is an extreme point of $P$.

Proof ($\Rightarrow$, BFS $\to$ extreme). Let $x$ be a BFS with basis $B$. Suppose $x = \lambda y + (1-\lambda)z$ with $y, z \in P$ and $\lambda \in (0,1)$. For each nonbasic index $j \notin B$: $x_j = 0$, and since $y_j \geq 0$, $z_j \geq 0$, the combination $0 = \lambda y_j + (1-\lambda) z_j$ forces $y_j = z_j = 0$. So $y$ and $z$ both have zero nonbasic components. Now $Ay = b$ becomes $B y_B = b = B x_B$, and since $B$ is invertible, $y_B = x_B$. The same argument gives $z_B = x_B$. Therefore $y = z = x$, so $x$ is extreme. $\square$

Proof ($\Leftarrow$, extreme $\to$ BFS). Let $x$ be extreme. We show $x$ is a BFS by showing the columns of $A$ indexed by the support $S = \{j : x_j > 0\}$ are linearly independent (so $|S| \leq m$ and $x$ is basic).

Suppose for contradiction that the columns $\{a_j : j \in S\}$ are linearly dependent. Then there exists $d \in \mathbb{R}^n$ with $d_j = 0$ for $j \notin S$ and $\sum_{j \in S} d_j a_j = Ad = 0$, $d \neq 0$. Define $y = x + \varepsilon d$ and $z = x - \varepsilon d$ for small $\varepsilon > 0$. Both satisfy $Ay = Az = b$. Since $d_j = 0$ outside $S$ and $x_j > 0$ on $S$, for small enough $\varepsilon$ we have $y_j \geq 0$ and $z_j \geq 0$ for all $j$. So $y, z \in P$, and $x = \frac{1}{2}y + \frac{1}{2}z$ with $y \neq z$ (since $d \neq 0$). This contradicts extremality of $x$.

Therefore the columns of $A$ indexed by $S$ are linearly independent. Since $\text{rank}(A) = m$, we have $|S| \leq m$. Extend $S$ to a full basis $B$ of size $m$ by adding any linearly independent columns. The basic solution with basis $B$ has $x_B = B^{-1}b \geq 0$ (the support variables are positive, and any added variables have $x_j = 0 \geq 0$). So $x$ is a BFS. $\square$

This theorem is the foundation of everything. It says: the geometric objects we care about (vertices of the polyhedron) coincide exactly with the algebraic objects we can compute with (basic feasible solutions). We can parameterize all vertices by their bases.

Extreme Directions

For bounded polyhedra (polytopes), the only LP optima are extreme points. But $P$ can be unbounded, in which case the objective may be $-\infty$, and this is detected through extreme directions.

Feasible direction. A vector $d \in \mathbb{R}^n$ is a feasible direction from $x \in P$ if $x + t d \in P$ for all $t \geq 0$. For $P = \{x \mid Ax = b, x \geq 0\}$, this requires $Ad = 0$ and $x_j + t d_j \geq 0$ for all $j \geq 0$ and all $t \geq 0$ - which means $d_j \geq 0$ for all $j$ with $x_j = 0$.

Extreme direction. A direction $d \geq 0$ with $Ad = 0$, $d \neq 0$, is an extreme direction if it cannot be written as a positive combination of two distinct feasible directions.

Extreme directions arise naturally from the simplex algorithm. Given a BFS with basis $B$, consider increasing the $j$-th nonbasic variable. From $Bx_B + Nx_N = b$, increasing $x_j$ by $t$ forces $x_B$ to change by $-B^{-1}a_j \cdot t$. Define $\bar{a}_{j} = B^{-1}a_j$ (the $j$-th column of $B^{-1}N$). If $\bar{a}_{j} \leq 0$ componentwise (all entries nonpositive), then increasing $x_j$ never violates $x_B \geq 0$ - we can push $t \to \infty$. The direction $d$ with $d_j = 1$, $d_B = -\bar{a}_{j}$, $d_{k} = 0$ for other nonbasic $k$, satisfies $Ad = 0$, $d \geq 0$, and is an extreme direction. If $c^\top d < 0$, moving along this ray drives the objective to $-\infty$ and the problem is unbounded.

The Representation Theorem

The extreme points and extreme directions together give a complete description of any nonempty polyhedron.

Theorem (Resolution / Representation). Every $x$ in a nonempty polyhedron $P = \{x \mid Ax = b, x \geq 0\}$ can be written as:

$$x = \sum_{k} \lambda_k v^{(k)} + \sum_{\ell} \mu_\ell d^{(\ell)}$$

where $v^{(k)}$ are the extreme points of $P$, $d^{(\ell)}$ are the extreme directions, $\lambda_k \geq 0$, $\sum_k \lambda_k = 1$ (convex combination), and $\mu_\ell \geq 0$. If $P$ is bounded (a polytope), there are no extreme directions and $P = \text{conv}(\{v^{(k)}\})$.

The proof proceeds by induction on the number of support variables. The base case is a BFS. For a general $x$ with more than $m$ nonzero variables, the columns in the support are dependent, so $x$ lies on a segment within $P$ between two points with strictly fewer nonzero variables. Iterating drives to BFSs.

Consequence for LP. Since every feasible point is a convex combination of extreme points plus extreme directions, the minimum of a linear function over $P$ is achieved at an extreme point (if the problem is bounded). There are only finitely many extreme points (at most $\binom{n}{m}$, one per basis), so we only need to check those. The simplex algorithm does this efficiently.

Optimality Conditions

Given a BFS with basis $B$, what does optimality look like algebraically?

Objective decomposition. The current BFS has $x_N = 0$ and $x_B = B^{-1}b = \bar{b}$. The current objective value is:

$$c^\top x = c_B^\top x_B = c_B^\top B^{-1}b$$

For a general feasible $x$, express $x_B$ using the constraint $Bx_B = b - Nx_N$:

$$x_B = B^{-1}b - B^{-1}N x_N = \bar{b} - \bar{N} x_N$$

where $\bar{N} = B^{-1}N$. Substituting into the objective:

$$c^\top x = c_B^\top x_B + c_N^\top x_N = c_B^\top(\bar{b} - \bar{N}x_N) + c_N^\top x_N = c_B^\top \bar{b} + (c_N^\top - c_B^\top \bar{N})x_N$$

Define the reduced cost of nonbasic variable $j$ as:

$$\bar{c}_j = c_j - c_B^\top B^{-1} a_j$$

This can also be written $\bar{c}_j = c_j - y^\top a_j$ where $y = (B^{-1})^\top c_B$ is the vector of dual variables. The objective becomes:

$$c^\top x = z_0 + \sum_{j \notin B} \bar{c}_j x_j, \quad z_0 = c_B^\top B^{-1} b$$

Optimality condition. The current BFS is optimal for the LP if and only if $\bar{c}_j \geq 0$ for all nonbasic $j$.

Proof. ($\Leftarrow$) If all reduced costs are nonneg and all $x_j \geq 0$ for $j \notin B$, then $c^\top x = z_0 + \sum_j \bar{c}_j x_j \geq z_0$ for any feasible $x$. So the BFS value $z_0$ is optimal.

($\Rightarrow$) If some $\bar{c}_j < 0$, increase $x_j$ from 0 to $\varepsilon > 0$ while adjusting $x_B$ to maintain feasibility. The objective changes by $\bar{c}_j \varepsilon < 0$ - the current BFS is not optimal. $\square$

The geometric intuition: the reduced cost $\bar{c}_j$ is the rate of change of the objective as we move away from the current vertex along the edge in the direction of variable $j$. Negative reduced cost means that edge goes downhill. Zero means the edge is level (degenerate). Positive means uphill.

The Simplex Algorithm

The simplex algorithm is a pivot method: starting from a BFS, it moves to an adjacent BFS with a lower (or equal) objective, until no improving pivot exists.

Setup: the simplex tableau. Augment the system to track the objective. The tableau is:

$$\begin{pmatrix} B^{-1}A & B^{-1}b \\ \bar{c}_N^\top - \bar{c}_B^\top B^{-1}A & -z_0 \end{pmatrix} \longrightarrow \begin{pmatrix} \bar{a}_{B_1,1} & \cdots & \bar{a}_{B_1,n} & \bar{b}_1 \\ \vdots & & \vdots & \vdots \\ \bar{a}_{B_m,1} & \cdots & \bar{a}_{B_m,n} & \bar{b}_m \\ \bar{c}_1 & \cdots & \bar{c}_n & -z_0 \end{pmatrix}$$

In the tableau, the $B$-columns form an identity (after basis row operations), the right-hand column gives $\bar{b} = B^{-1}b$, and the bottom row gives reduced costs (zero for basic variables, $\bar{c}_j$ for nonbasic). The current objective value is $z_0 = -(\text{bottom-right entry})$.

Algorithm:

While there exists a nonbasic $j$ with $\bar{c}_j < 0$:

Choose entering variable: Select any nonbasic $j$ with $\bar{c}_j < 0$. Compute $\bar{a}_j = B^{-1}a_j$ (the $j$-th column of the current tableau).
Ratio test: If $\bar{a}_{ij} \leq 0$ for all $i$, the problem is unbounded - stop. Otherwise, compute:

$$\theta^{\ast} = \min\left\{\frac{\bar{b}_i}{\bar{a}_{ij}} : \bar{a}_{ij} > 0\right\}$$

The leaving variable is the basic index $r$ achieving this minimum: $\theta^{\ast} = \bar{b}_{r} / \bar{a}_{rj}$.

Pivot: Remove $B_r$ from the basis, add $j$. Update the tableau by dividing row $r$ by $\bar{a}_{rj}$, then eliminating $x_j$ from all other rows (Gaussian elimination). The new basis matrix is $B' = B$ with column $B_r$ replaced by $a_j$.

Repeat until all $\bar{c}_j \geq 0$.

Why the ratio test? Increasing $x_j$ by $\theta$ forces $x_{B_i} \to \bar{b}_{i} - \bar{a}_{ij}\theta$. To maintain $x_{B_i} \geq 0$ for each $i$ with $\bar{a}_{ij} > 0$, we need $\theta \leq \bar{b}_{i} / \bar{a}_{ij}$. The binding constraint (smallest ratio) determines how far we can move before a basic variable hits zero and must leave the basis.

Worked Example

Consider the LP: minimize $-3x_1 - 5x_2$ subject to $x_1 \leq 4$, $x_2 \leq 6$, $3x_1 + 2x_2 \leq 18$, $x_1, x_2 \geq 0$.

Add slacks $s_1, s_2, s_3 \geq 0$ to get standard form. Initial BFS: $x_1 = x_2 = 0$, $s_1 = 4$, $s_2 = 6$, $s_3 = 18$ (basis = $\{s_1, s_2, s_3\}$).

Initial tableau (columns: $x_1, x_2, s_1, s_2, s_3$ | rhs):

$$\begin{array}{ccccc|c} x_1 & x_2 & s_1 & s_2 & s_3 & \bar{b} \\ \hline 1 & 0 & 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & 1 & 0 & 6 \\ 3 & 2 & 0 & 0 & 1 & 18 \\ \hline -3 & -5 & 0 & 0 & 0 & 0 \end{array}$$

Reduced costs: $\bar{c}_1 = -3$, $\bar{c}_2 = -5$. Both negative; pick $j = 2$ (most negative).

Ratio test for $x_2$: Column is $(0, 1, 2)^\top$. Ratios: row 1 gives $\infty$ (since $\bar{a}_{12} = 0$), row 2 gives $6/1 = 6$, row 3 gives $18/2 = 9$. Minimum is $6$, so $s_2$ leaves (row 2). Pivot on entry $(2, 2)$.

Pivot: Row 2 is already normalized (pivot element = 1). Eliminate $x_2$ from row 3 and bottom row: Row 3 $\leftarrow$ Row 3 $- 2 \cdot$ Row 2; Bottom $\leftarrow$ Bottom $+ 5 \cdot$ Row 2.

$$\begin{array}{ccccc|c} x_1 & x_2 & s_1 & s_2 & s_3 & \bar{b} \\ \hline 1 & 0 & 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & 1 & 0 & 6 \\ 3 & 0 & 0 & -2 & 1 & 6 \\ \hline -3 & 0 & 0 & 5 & 0 & 30 \end{array}$$

Current BFS: $x_2 = 6$, $s_1 = 4$, $s_3 = 6$; objective $= -30$. Still $\bar{c}_1 = -3 < 0$; enter $x_1$.

Ratio test for $x_1$: Column is $(1, 0, 3)^\top$. Ratios: row 1 gives $4/1 = 4$, row 2 gives $\infty$, row 3 gives $6/3 = 2$. Minimum is $2$, so $s_3$ leaves (row 3). Pivot on entry $(3,1)$.

Pivot: Row 3 $\leftarrow$ Row 3 $/3$; Row 1 $\leftarrow$ Row 1 $-$ Row 3; Bottom $\leftarrow$ Bottom $+ 3 \cdot$ Row 3.

$$\begin{array}{ccccc|c} x_1 & x_2 & s_1 & s_2 & s_3 & \bar{b} \\ \hline 0 & 0 & 1 & 2/3 & -1/3 & 2 \\ 0 & 1 & 0 & 1 & 0 & 6 \\ 1 & 0 & 0 & -2/3 & 1/3 & 2 \\ \hline 0 & 0 & 0 & 3 & 1 & 36 \end{array}$$

All reduced costs $\geq 0$. Optimal: $x_1 = 2$, $x_2 = 6$, objective $= -36$ (i.e., maximum of $3x_1 + 5x_2 = 36$). $\square$

Degeneracy, Cycling, and Bland’s Rule

Degeneracy and degenerate pivots. When $\bar{b}_r = 0$ for the leaving variable, the ratio test gives $\theta^* = 0$. The entering variable enters at value 0, the leaving variable exits at value 0, and the objective does not change. Multiple consecutive degenerate pivots can occur without any progress.

Cycling. In degenerate cases, the simplex algorithm can cycle: return to a previously visited basis, creating an infinite loop. Cycling is not just a theoretical curiosity - it occurs in practice on specially constructed problems, and without protection against it, the algorithm may never terminate.

Bland’s Rule. To guarantee termination, use the following tie-breaking rule:

Entering variable: Among all nonbasic $j$ with $\bar{c}_j < 0$, choose the one with the smallest index.
Leaving variable: Among all basic variables achieving the minimum ratio, choose the one with the smallest index.

Theorem (Bland, 1977). Simplex with Bland’s rule never cycles.

Proof sketch. Suppose for contradiction that cycling occurs. Consider the last basis in the cycle before it begins to repeat. Among all variables that ever enter or leave the basis during the cycle, let $t$ be the one with the largest index that enters at some point. Let $B'$ be the basis just before $t$ enters. In that step, $t$ is chosen by Bland’s rule as the entering variable - so all other nonbasic variables with smaller indices have nonneg reduced cost at $B'$. Since we are in a cycle, there must be a step later where $t$ is chosen to leave. At that step, let $s$ be the entering variable. Because $s$ was not chosen to enter at $B'$ (only $t$ was, and $t$ has a larger index than $s$ by Bland’s entering rule), we get a contradiction about how $s$ relates to the current reduced costs and the cycling structure. The full argument formalizes this via a linear algebra argument on the change in reduced costs through the cycle, showing that $t$’s reduced cost at the step where it leaves must be negative, implying $t$ should have entered in that step by Bland’s rule - a contradiction. Therefore no cycling is possible. $\square$

In practice, Bland’s rule is rarely used because it can be slow on noncycling problems. The lexicographic pivot rule and the perturbation method are alternative anti-cycling strategies that are more commonly implemented.

The Two-Phase Method

When no natural initial BFS is available - because $A$ does not contain an embedded identity matrix after slack addition - the two-phase method finds one.

Phase I: find a feasible basis.

Introduce artificial variables $a_i \geq 0$ for each constraint $i$:

$$Bx_B + Nx_N + Ia = b$$

Minimize the auxiliary objective $\sum_{i=1}^m a_i$ (the $\ell^1$ infeasibility measure) starting from the obvious BFS: $a = b \geq 0$ (assuming $b \geq 0$; rows with $b_i < 0$ are negated).

If the Phase I optimum is 0, all artificial variables are zero at optimality and we have a feasible basis $B$ for the original problem. If the Phase I optimum is $> 0$, the original LP is infeasible (there is no $x$ with $Ax = b$, $x \geq 0$).

Phase II: optimize the original objective.

Use the feasible basis from Phase I as the starting point for the standard simplex applied to the original objective $c^\top x$. The artificial variables can be dropped from the tableau (or kept with very large cost to prevent re-entry).

One subtlety: if some artificial variable remains in the basis at zero value at the end of Phase I, the basis is degenerate. The artificial variable can be pivoted out (replaced by a real variable) without changing the objective, or carried along. In the latter case, Phase II must be careful not to let the artificial variable become positive.

Example scenario. Suppose we have the LP: $\min x_1 + x_2$ s.t. $x_1 + x_2 = 3$, $x_1, x_2 \geq 0$. No slack is available (equality constraint, and $b_1 = 3 > 0$). Phase I adds $a_1 \geq 0$: minimize $a_1$ s.t. $x_1 + x_2 + a_1 = 3$, $x_1, x_2, a_1 \geq 0$. Initial BFS: $a_1 = 3$. Pivot: $a_1 = -3$ reduced cost drives $x_1$ or $x_2$ into the basis. At Phase I optimum, $a_1 = 0$ and the basis contains a real variable with value 3. Phase II then minimizes $x_1 + x_2 = 3$ over the feasible set, finding any convex combination of $(3,0)$ and $(0,3)$ optimal.

LP Duality

Every linear program has a paired dual problem. The relationship between primal and dual is one of the most elegant results in optimization.

Primal - Dual pair. The standard-form primal and its dual are:

$$\text{Primal (P):} \quad \min c^\top x ;; \text{s.t.} ;; Ax = b,; x \geq 0$$

$$\text{Dual (D):} \quad \max y^\top b ;; \text{s.t.} ;; A^\top y \leq c$$

The dual variable $y \in \mathbb{R}^m$ (one per primal equality constraint) is unrestricted in sign.

Derivation. For any $y$, multiplying the primal constraint $Ax = b$ by $y^\top$ gives $y^\top Ax = y^\top b$. If $A^\top y \leq c$, then $c^\top x \geq y^\top Ax = y^\top b$ for all primal feasible $x \geq 0$. So any dual feasible $y$ provides a lower bound on the primal objective.

Weak Duality Theorem. For any primal feasible $x$ and dual feasible $y$:

$$c^\top x \geq y^\top b$$

Proof. $c^\top x \geq (A^\top y)^\top x = y^\top Ax = y^\top b$. The first inequality uses $c \geq A^\top y$ and $x \geq 0$; the last equality uses $Ax = b$. $\square$

Weak duality immediately gives: if the primal objective equals any dual objective value, both are optimal. It also gives: if primal is unbounded below, dual is infeasible; if dual is unbounded above, primal is infeasible.

Strong Duality Theorem. If both primal and dual are feasible, then their optimal values are equal: there is no duality gap.

Proof sketch. At the optimal primal BFS with basis $B$, define $y^{\ast} = (B^{-1})^\top c_B = (B^\top)^{-1} c_B$. Then:

Dual feasibility: $(A^\top y^{\ast})_{j} = a_{j}^\top y^{\ast} = c_B^\top B^{-1} a_{j}$. For basic $j \in B$: $c_B^\top B^{-1}a_j = c_j$ (since $B^{-1}a_j = e_i$ for the corresponding basis column). For nonbasic $j \notin B$: $c_B^\top B^{-1}a_j = c_j - \bar{c}_j \leq c_j$ since $\bar{c}_j \geq 0$ by optimality of the primal BFS. So $A^\top y^{\ast} \leq c$, confirming $y^{\ast}$ is dual feasible.
Equal objectives: $(y^{\ast})^\top b = c_B^\top B^{-1} b = c_B^\top \bar{b} = c^\top x^{\ast}$.

So the dual value $(y^{\ast})^\top b$ equals the primal value $c^\top x^{\ast}$. By weak duality, both are optimal. $\square$

The dual variables $y^{\ast} = (B^\top)^{-1} c_B$ are called shadow prices: $y_{i}^{\ast}$ is the rate of change of the optimal objective with respect to $b_i$.

Complementary Slackness. Optimal primal $x^{\ast}$ and dual $y^{\ast}$ must satisfy:

$$x_{j}^{\ast} (c_{j} - (A^\top y^{\ast})_{j}) = 0 \quad \text{for all } j$$

$$y_{i}^{\ast} (A_{i} x^{\ast} - b_{i}) = 0 \quad \text{for all } i$$

In words: if $x_j^{\ast} > 0$ (a primal variable is positive), then the corresponding dual constraint must be tight ($c_j = (A^\top y^{\ast})_{j}$). If a dual constraint is strict ($c_j > (A^\top y^{\ast})_{j}$), then $x_j^{\ast} = 0$. For standard form LPs, $Ax^{\ast} = b$ always, so the second condition is automatic. The first condition says the reduced cost $\bar{c}_j = c_j - (y^{\ast})^\top a_j$ is zero for every basic variable - exactly the optimality condition for the primal simplex.

Proof. Strong duality gives $c^\top x^{\ast} = (y^{\ast})^\top b = (y^{\ast})^\top Ax^{\ast} = (A^\top y^{\ast})^\top x^{\ast}$. So $(c - A^\top y^{\ast})^\top x^{\ast} = 0$. Since $c - A^\top y^{\ast} \geq 0$ (dual feasibility) and $x^{\ast} \geq 0$, each term in the sum must be zero: $x_{j}^{\ast}(c_{j} - (A^\top y^{\ast})_{j}) = 0$ for all $j$. $\square$

Complementary slackness is practically useful: given a candidate primal solution, use complementarity to read off the required dual solution and check feasibility directly - a faster route to optimality verification than re-running the simplex.

Summary

Concept	Key Statement
Standard form	$\min c^\top x$ s.t. $Ax = b$, $x \geq 0$; any LP converts to this
BFS	Set $x_N = 0$, $x_B = B^{-1}b \geq 0$ for a basis $B$
BFS = extreme point	The two notions coincide exactly (proved above)
Reduced cost	$\bar{c}_j = c_j - c_B^\top B^{-1} a_j$; measures edge slope
Optimality condition	Current BFS is optimal iff all $\bar{c}_j \geq 0$
Simplex pivot	Enter: most negative $\bar{c}_j$; leave: minimum ratio test
Unboundedness	Entering column $\bar{a}_j \leq 0$ means objective is $-\infty$
Bland’s rule	Smallest-index entering and leaving; prevents cycling
Two-phase method	Phase I minimizes $\sum a_i$ to find a feasible basis
Weak duality	$c^\top x \geq y^\top b$ for any primal and dual feasible pair
Strong duality	Optimal values are equal when both are feasible
Complementary slackness	$x_{j}^{\ast}(c_{j} - A_{j}^\top y^{\ast}) = 0$ for all $j$ at optimality

Read next: