
The Simplex Method

The Story Behind the Math

In 1947, George Dantzig (1914-2005), a young mathematician working for the U.S. Air Force, faced a daunting challenge: how to optimally allocate resources—planes, personnel, fuel—across thousands of missions while respecting countless constraints. The problem was far too large to solve by trial and error.

Dantzig realized that these allocation problems could be expressed as linear programs: maximize (or minimize) a linear objective function subject to linear constraints. But solving them efficiently required a new algorithm. One evening, while trying to find an optimal diet for the military (minimizing cost while meeting nutritional requirements), the breakthrough came.

Dantzig noticed something remarkable: the optimal solution always lies at a "corner" (vertex) of the feasible region. Instead of searching everywhere, he could jump from vertex to vertex, always improving the objective, until reaching the optimum. This was the birth of the Simplex Method.

The controversy: Computer scientists later discovered that the Simplex Method could, in rare worst-case scenarios, visit exponentially many vertices. Yet in practice, it solves real-world problems with stunning efficiency. This "algorithmic mystery" puzzled researchers for decades. It wasn't until 1979 that Leonid Khachiyan proved linear programming could be solved in polynomial time with his ellipsoid method, and later Narendra Karmarkar (1984) developed interior-point methods that were both theoretically efficient and practically competitive with Simplex.

But the Simplex Method remains the workhorse of optimization—elegant, intuitive, and incredibly fast for most real problems.

Why It Matters

The Simplex Method powers:

  • Supply chain optimization — FedEx and Amazon route billions of packages
  • Portfolio optimization — Wall Street balances risk vs. return
  • Manufacturing — Toyota minimizes waste in production lines
  • Energy grids — real-time electricity distribution
  • Agriculture — optimal crop mixing for profit
  • Healthcare — hospital staff scheduling
  • Game theory — Nash equilibrium computation

Every time you get a Google Maps route, every time Netflix recommends a movie, optimization algorithms (often descended from Simplex) are working behind the scenes.

Prerequisites

  • Linear-Programming — Problem formulation
  • Systems of linear equations
  • Basic linear algebra (matrix notation)
  • Geometry of convex polyhedra (vertices, edges, faces)
  • Farkas-Lemma — Foundation for understanding why it works

The Core Insight

The Geometric View

Imagine the feasible region as a multi-dimensional shape (a polytope) defined by the constraints:

  • In 2D: a polygon
  • In 3D: a polyhedron
  • In n dimensions: a polytope

Key insight #1: Whenever an optimal solution exists, one always lies at a vertex (corner) of this polytope.

Key insight #2: Two vertices are connected by an edge if their sets of tight constraints differ by exactly one constraint.

Key insight #3: Moving along an edge changes the objective function linearly. If we move to a better vertex, we keep going. If no adjacent vertex is better, we're at the optimum!

The Algebraic View

A linear program in standard form:

\[ \begin{aligned} \text{maximize} \quad & c^T x \\ \text{subject to} \quad & Ax = b \\ & x \geq 0 \end{aligned} \]

Where:

  • \(x \in \mathbb{R}^n\) is the vector of decision variables
  • \(A\) is an \(m \times n\) constraint matrix (with \(m < n\))
  • \(b \in \mathbb{R}^m\) is the right-hand side
  • \(c \in \mathbb{R}^n\) is the vector of objective coefficients

Basic variables: Choose \(m\) linearly independent columns of \(A\) to form a basis matrix \(B\). Set the remaining \(n-m\) (non-basic) variables to 0.

Basic feasible solution: Solve \(Bx_B = b\) for \(x_B\). If \(x_B \geq 0\), this is a vertex!

The pivot: Swap one basic variable with one non-basic variable → move to an adjacent vertex.
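The basis-to-vertex correspondence is easy to check by hand on a tiny system. Below is a minimal pure-Python sketch (the helper name `bfs_for_basis` is illustrative, not from any library) using the two-constraint system from the worked example later on this page: pick two columns as a basis, solve \(Bx_B = b\) via Cramer's rule, and accept the result only if \(x_B \geq 0\).

```python
# Toy illustration: extract a basic feasible solution from a basis choice.
# System: x1 + x2 + s1 = 4,  x1 - x2 + s2 = 2  (variables x1, x2, s1, s2 >= 0).
A = [[1.0, 1.0, 1.0, 0.0],
     [1.0, -1.0, 0.0, 1.0]]
b = [4.0, 2.0]

def bfs_for_basis(A, b, basis):
    """Solve B x_B = b for a 2x2 basis via Cramer's rule; None if not feasible."""
    i, j = basis
    B = [[A[0][i], A[0][j]], [A[1][i], A[1][j]]]
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    if det == 0:
        return None  # columns not independent: not a basis
    xi = (b[0] * B[1][1] - B[0][1] * b[1]) / det
    xj = (B[0][0] * b[1] - b[0] * B[1][0]) / det
    if xi < 0 or xj < 0:
        return None  # basic solution is infeasible: not a vertex
    x = [0.0, 0.0, 0.0, 0.0]
    x[i], x[j] = xi, xj
    return x

print(bfs_for_basis(A, b, (2, 3)))  # slack basis -> origin vertex [0, 0, 4, 2]
print(bfs_for_basis(A, b, (0, 1)))  # basis {x1, x2} -> vertex [3, 1, 0, 0]
```

Note that the slack basis corresponds to the origin, while the basis \(\{x_1, x_2\}\) corresponds to the vertex \((3, 1)\): each feasible basis is a vertex in disguise.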

Derivation: Why It Works

The Fundamental Theorem of Linear Programming

Theorem: If an optimal solution exists, there exists an optimal solution at a vertex of the feasible region.

Proof intuition:

1. The feasible region is convex (any line segment between feasible points stays feasible)
2. The objective function is linear (its level sets are hyperplanes)
3. Moving in a direction that improves the objective eventually hits a boundary
4. Once on a boundary, you can slide to a vertex without worsening the objective
5. Therefore, at least one vertex is optimal

This is why we only need to check vertices!

The Optimality Condition

At a vertex defined by basis \(B\), we can express everything in terms of basic variables:

\[ x_B = B^{-1}b - B^{-1}N x_N \]

The objective becomes:

\[ c^T x = c_B^T x_B + c_N^T x_N = c_B^T B^{-1}b + (c_N^T - c_B^T B^{-1}N) x_N \]

The term \((c_N^T - c_B^T B^{-1}N)\) is called the reduced cost.

Optimality: If all reduced costs are \(\leq 0\) (for maximization), no non-basic variable can improve the objective when brought into the basis. We're optimal!
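As a concrete check of the reduced-cost formula, here is a hand-rolled pure-Python sketch (the helper names `inv2` and `reduced_costs` are illustrative, not library functions) that evaluates \(c_N^T - c_B^T B^{-1}N\) at two different bases of the two-constraint example used later on this page.

```python
# Reduced costs r_N = c_N - c_B^T B^{-1} N for the example
# max 3x1 + 2x2  s.t.  x1 + x2 + s1 = 4,  x1 - x2 + s2 = 2.
def inv2(B):
    """Inverse of a 2x2 matrix (assumes det != 0)."""
    a, bb, c, d = B[0][0], B[0][1], B[1][0], B[1][1]
    det = a * d - bb * c
    return [[d / det, -bb / det], [-c / det, a / det]]

def reduced_costs(cB, cN, B, N):
    Binv = inv2(B)
    # y = c_B^T B^{-1} (the simplex multipliers)
    y = [cB[0] * Binv[0][0] + cB[1] * Binv[1][0],
         cB[0] * Binv[0][1] + cB[1] * Binv[1][1]]
    # r_j = c_j - y^T N_j for each non-basic column j
    return [cN[j] - (y[0] * N[0][j] + y[1] * N[1][j]) for j in range(len(cN))]

# At the slack basis (vertex (0,0)): both reduced costs positive -> not optimal.
print(reduced_costs([0, 0], [3, 2], [[1, 0], [0, 1]], [[1, 1], [1, -1]]))
# At basis {x1, x2} (vertex (3,1)): all reduced costs <= 0 -> optimal.
print(reduced_costs([3, 2], [0, 0], [[1, 1], [1, -1]], [[1, 0], [0, 1]]))
```

The second call returns \([-2.5, -0.5]\): both reduced costs are non-positive, matching the optimality certificate in the final tableau of the worked example.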

The Pivot Selection

If some reduced cost \(r_j > 0\), bringing variable \(x_j\) into the basis improves the objective. But how much can we increase it?

Compute the minimum ratio: For each basic variable \(x_i\), find how much we can increase \(x_j\) before \(x_i\) becomes negative.

\[ \theta = \min_{i} \left\{ \frac{(B^{-1}b)_i}{(B^{-1}A_j)_i} : (B^{-1}A_j)_i > 0 \right\} \]

The basic variable that hits zero first leaves the basis.
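In code, the ratio test is a short loop. A minimal sketch (the function name `min_ratio` is illustrative): it returns the leaving row and the step length \(\theta\), or None when no entry of the entering column is positive, which signals an unbounded problem.

```python
def min_ratio(rhs, col):
    """Ratio test: how far the entering variable can increase, and which
    basic row (leaving variable) blocks first. Returns None -> unbounded."""
    best = None
    for i, (bi, ai) in enumerate(zip(rhs, col)):
        if ai > 1e-12:          # only positive entries block the increase
            theta = bi / ai
            if best is None or theta < best[1]:
                best = (i, theta)
    return best  # (leaving row index, step length theta), or None

print(min_ratio([4.0, 2.0], [1.0, 1.0]))   # row 1 leaves, theta = 2
print(min_ratio([4.0, 2.0], [-1.0, 0.0]))  # None: unbounded direction
```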

The Algorithm: Step-by-Step

Step 1: Initialization

  • Convert to standard form (slack/surplus variables)
  • Find an initial basic feasible solution (often using Phase I or Big-M method)

Step 2: Check Optimality

  • Compute reduced costs: \(r_N = c_N^T - c_B^T B^{-1}N\)
  • If \(r_N \leq 0\) (max), STOP — optimal!
  • Otherwise, select entering variable: most positive reduced cost

Step 3: Check Unboundedness

  • If the entering column \(B^{-1}A_j\) has no positive entries, the problem is unbounded
  • You can increase the variable forever while staying feasible

Step 4: Pivot

  • Compute ratios to find the leaving variable
  • Perform row operations to pivot: entering column becomes a unit vector
  • Update basis
  • Return to Step 2
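The four steps above can be condensed into a short tableau implementation. This is a minimal teaching sketch, not production code: it assumes the input is \(Ax \leq b\) with \(b \geq 0\), so the slack basis is an initial BFS (skipping Phase I), uses Dantzig's most-negative entering rule, and takes no anti-cycling precautions.

```python
# A minimal tableau-form Simplex sketch for: max c^T x, s.t. Ax <= b, x >= 0,
# with b >= 0 (so the slack basis is an initial BFS and no Phase I is needed).
def simplex(c, A, b):
    m, n = len(A), len(c)
    # Build tableau rows [A | I | b]; z-row is [-c | 0 | 0].
    T = [A[i][:] + [1.0 if j == i else 0.0 for j in range(m)] + [b[i]]
         for i in range(m)]
    T.append([-cj for cj in c] + [0.0] * m + [0.0])
    basis = list(range(n, n + m))  # slack variables start basic
    while True:
        # Step 2: entering variable = most negative z-row coefficient.
        j = min(range(n + m), key=lambda k: T[m][k])
        if T[m][j] >= -1e-9:
            break  # all reduced costs non-negative: optimal
        # Step 3: no positive entry in the entering column -> unbounded.
        rows = [i for i in range(m) if T[i][j] > 1e-9]
        if not rows:
            raise ValueError("unbounded")
        # Step 4: ratio test picks the leaving row, then pivot.
        r = min(rows, key=lambda i: T[i][-1] / T[i][j])
        piv = T[r][j]
        T[r] = [v / piv for v in T[r]]
        for i in range(m + 1):
            if i != r and abs(T[i][j]) > 1e-12:
                f = T[i][j]
                T[i] = [a - f * br for a, br in zip(T[i], T[r])]
        basis[r] = j
    x = [0.0] * n
    for i, bi in enumerate(basis):
        if bi < n:
            x[bi] = T[i][-1]
    return x, T[m][-1]

x, z = simplex([3.0, 2.0], [[1.0, 1.0], [1.0, -1.0]], [4.0, 2.0])
print(x, z)  # the worked example below: x = [3.0, 1.0], z = 11.0
```

Running it on the worked example below reproduces the two pivots of the tableau walkthrough and reports the optimum \(x_1 = 3\), \(x_2 = 1\), \(z = 11\).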

Worked Example

Problem: Maximize \(z = 3x_1 + 2x_2\) subject to:

\[ \begin{aligned} x_1 + x_2 &\leq 4 \\ x_1 - x_2 &\leq 2 \\ x_1, x_2 &\geq 0 \end{aligned} \]

Step 1: Add slack variables \(s_1, s_2\):

\[ \begin{aligned} \text{maximize} \quad & z = 3x_1 + 2x_2 \\ \text{subject to} \quad & x_1 + x_2 + s_1 = 4 \\ & x_1 - x_2 + s_2 = 2 \\ & x_1, x_2, s_1, s_2 \geq 0 \end{aligned} \]

Initial Tableau:

\[ \begin{array}{c|ccccc|c} & x_1 & x_2 & s_1 & s_2 & z & \text{RHS} \\ \hline s_1 & 1 & 1 & 1 & 0 & 0 & 4 \\ s_2 & 1 & -1 & 0 & 1 & 0 & 2 \\ \hline z & -3 & -2 & 0 & 0 & 1 & 0 \end{array} \]

Iteration 1:

  • Most negative coefficient in \(z\)-row: \(x_1\) (coefficient \(-3\))
  • Ratios: \(4/1 = 4\), \(2/1 = 2\) → minimum is 2
  • Pivot on row 2, column \(x_1\)

After Pivoting:

\[ \begin{array}{c|ccccc|c} & x_1 & x_2 & s_1 & s_2 & z & \text{RHS} \\ \hline s_1 & 0 & 2 & 1 & -1 & 0 & 2 \\ x_1 & 1 & -1 & 0 & 1 & 0 & 2 \\ \hline z & 0 & -5 & 0 & 3 & 1 & 6 \end{array} \]

Iteration 2:

  • Most negative: \(x_2\) (coefficient \(-5\))
  • Ratio test: only row 1 has a positive entry in the \(x_2\) column (\(2/2 = 1\))
  • Pivot on row 1, column \(x_2\)

Final Tableau:

\[ \begin{array}{c|ccccc|c} & x_1 & x_2 & s_1 & s_2 & z & \text{RHS} \\ \hline x_2 & 0 & 1 & 0.5 & -0.5 & 0 & 1 \\ x_1 & 1 & 0 & 0.5 & 0.5 & 0 & 3 \\ \hline z & 0 & 0 & 2.5 & 0.5 & 1 & 11 \end{array} \]

Solution: All reduced costs in the \(z\)-row are non-negative. Optimal!

  • \(x_1 = 3\), \(x_2 = 1\), \(z = 11\)

Geometric interpretation: We moved from vertex (0,0) → (2,0) → (3,1), improving the objective at each step.
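Since the example is tiny, the Fundamental Theorem can be verified by brute force: enumerate every basis, keep the feasible ones (the vertices), and compare objectives. This enumeration is exponential in general, which is exactly the search Simplex avoids, but here it confirms the path above ends at the true optimum.

```python
# Brute-force sketch: enumerate all 2-column bases of the standard-form
# example, keep the feasible basic solutions (vertices), take the best.
from itertools import combinations

A = [[1.0, 1.0, 1.0, 0.0], [1.0, -1.0, 0.0, 1.0]]  # columns: x1, x2, s1, s2
b = [4.0, 2.0]
c = [3.0, 2.0, 0.0, 0.0]

best = None
for i, j in combinations(range(4), 2):
    B = [[A[0][i], A[0][j]], [A[1][i], A[1][j]]]
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    if abs(det) < 1e-12:
        continue  # columns not independent: no basis
    xi = (b[0] * B[1][1] - B[0][1] * b[1]) / det
    xj = (B[0][0] * b[1] - b[0] * B[1][0]) / det
    if xi < -1e-9 or xj < -1e-9:
        continue  # basic but infeasible: not a vertex
    z = c[i] * xi + c[j] * xj
    if best is None or z > best[0]:
        best = (z, {i: xi, j: xj})

print(best)  # objective 11.0 with x1 = 3.0, x2 = 1.0
```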

Understanding the Structure

Why vertices?

  • The feasible region is a convex polytope
  • The linear objective has level sets that are hyperplanes
  • Moving toward a better objective value eventually hits a boundary
  • Vertices are the boundary points where enough constraints are tight at once

Why edges?

  • Adjacent vertices differ by exactly one tight constraint
  • Moving along an edge means relaxing one constraint while tightening another
  • This corresponds to swapping one basic and one non-basic variable

Why the tableau?

  • Encodes the system \(Ax = b\) in a convenient form
  • Row operations correspond to Gaussian elimination
  • Each tableau represents a different vertex
  • The \(z\)-row tracks reduced costs (optimality conditions)

Key Properties

  • Finite termination: Visits at most \(\binom{n}{m}\) vertices (though usually far fewer)
  • Degeneracy: Can cycle in theory, but anti-cycling rules (Bland's rule) prevent this
  • Exponential worst-case: Klee-Minty cube shows Simplex can visit \(2^n - 1\) vertices
  • Practical efficiency: Typically \(O(m \log n)\) iterations for real problems
  • Duality: Simplex solves primal and dual simultaneously (seen in final tableau)

Common Issues

| Issue | Cause | Solution |
| --- | --- | --- |
| Degeneracy | Multiple bases describe the same vertex | Bland's rule or perturbation |
| Cycling | Infinite loop at a degenerate vertex | Smallest-subscript rule |
| Infeasible start | No obvious initial BFS | Phase I (auxiliary problem) |
| Unbounded | No constraint stops improvement | Check the entering column |

The Dual Connection

The Simplex Method actually solves both the primal and dual problems:

  • The \(z\)-row coefficients for the slack variables give the dual solution
  • When the primal is optimal, the dual is too (strong duality)
  • This is the foundation of sensitivity analysis—how much can parameters change without affecting optimality?
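A quick numeric check of strong duality on the worked example: read the dual values off the final tableau's \(z\)-row (the slack-column coefficients) and compare \(y^T b\) with \(c^T x\).

```python
# Strong-duality sketch for the worked example.
y = [2.5, 0.5]   # dual values: slack coefficients in the final z-row
b = [4.0, 2.0]   # right-hand sides of the two constraints
x = [3.0, 1.0]   # primal optimum from the final tableau
c = [3.0, 2.0]   # objective coefficients

dual_obj = sum(yi * bi for yi, bi in zip(y, b))    # 2.5*4 + 0.5*2
primal_obj = sum(ci * xi for ci, xi in zip(c, x))  # 3*3 + 2*1
print(dual_obj, primal_obj)  # both equal 11.0
```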

Modern Variants

Revised Simplex: Don't store the entire tableau; maintain a factorization of \(B\) (or \(B^{-1}\)) and update it at each pivot. More efficient for large, sparse problems.

Dual Simplex: Start with optimal but infeasible solution; restore feasibility. Great for adding new constraints.

Primal-Dual Methods: Maintain both primal and dual feasibility; converge to optimality.

Interior-Point Methods: Walk through the interior of the feasible region (Karmarkar's algorithm). Polynomial time, competitive with Simplex for large problems.

Related Topics

  • Linear-Programming — Problem formulation
  • Farkas-Lemma — Theoretical foundation
  • Duality Theory — Primal-dual relationships
  • Sensitivity Analysis — Post-optimality analysis
  • Network Flows — Specialized Simplex for graphs
  • Integer Programming — When variables must be whole numbers (much harder!)

References

  • Dantzig, G. B. (1947). "Maximization of a Linear Function of Variables Subject to Linear Inequalities." Activity Analysis of Production and Allocation.
  • Dantzig, G. B. (1963). Linear Programming and Extensions. Princeton University Press.
  • Klee, V., & Minty, G. J. (1972). "How Good is the Simplex Algorithm?" Inequalities III.
  • Khachiyan, L. G. (1979). "A Polynomial Algorithm in Linear Programming." Soviet Mathematics Doklady.
  • Karmarkar, N. (1984). "A New Polynomial-Time Algorithm for Linear Programming." Combinatorica.

Visual Guide

The following figures illustrate the geometric interpretation of the Simplex Method:

1. Feasible Region and Optimization Path

[Figure: 2D feasible region] The feasible region (green) with constraints as blue and red lines. The red path shows how Simplex moves from vertex (0,0) → (2,0) → (3,1), improving the objective at each step. The colored dashed lines represent level sets of the objective function \(z = 3x_1 + 2x_2\).

2. 3D Geometric Intuition

[Figure: 3D intuition] In higher dimensions, the feasible region becomes a polyhedron. The Simplex Method moves along edges (blue lines) from vertex to vertex. The red path shows the trajectory from the origin to the optimal vertex, always following the edges of the polyhedron.

3. Tableau Evolution

[Figure: Tableau evolution] The algebraic representation: how the tableau changes with each pivot operation. Green columns indicate entering variables, pink rows show the leaving variable (pivot row). Each tableau corresponds to a vertex in the geometric view.

4. Geometric vs Algebraic View

[Figure: Conceptual comparison] Side-by-side comparison showing how the geometric movement along edges corresponds exactly to algebraic basis changes. Moving from one vertex to an adjacent vertex = swapping one basic variable with one non-basic variable.