
The Simplex Method

The Story Behind the Math

In 1947, George Dantzig (1914-2005), a young mathematician working for the U.S. Air Force, faced a daunting challenge: how to optimally allocate resources—planes, personnel, fuel—across thousands of missions while respecting countless constraints. The problem was far too large to solve by trial and error.

Dantzig realized that these allocation problems could be expressed as linear programs: maximize (or minimize) a linear objective function subject to linear constraints. But solving them efficiently required a new algorithm. One evening, while trying to find an optimal diet for the military (minimizing cost while meeting nutritional requirements), the breakthrough came.

Dantzig noticed something remarkable: the optimal solution always lies at a "corner" (vertex) of the feasible region. Instead of searching everywhere, he could jump from vertex to vertex, always improving the objective, until reaching the optimum. This was the birth of the Simplex Method.

The controversy: Computer scientists later discovered that the Simplex Method could, in rare worst-case scenarios, visit exponentially many vertices. Yet in practice, it solves real-world problems with stunning efficiency. This "algorithmic mystery" puzzled researchers for decades. It wasn't until 1979 that Leonid Khachiyan proved linear programming could be solved in polynomial time with his ellipsoid method, and later Narendra Karmarkar (1984) developed interior-point methods that were both theoretically efficient and practically competitive with Simplex.

But the Simplex Method remains the workhorse of optimization—elegant, intuitive, and incredibly fast for most real problems.

Why It Matters

The Simplex Method powers:

  • Supply chain optimization — FedEx and Amazon route billions of packages
  • Portfolio optimization — Wall Street balances risk vs. return
  • Manufacturing — Toyota minimizes waste in production lines
  • Energy grids — real-time electricity distribution
  • Agriculture — optimal crop mixing for profit
  • Healthcare — hospital staff scheduling
  • Game theory — Nash equilibrium computation

Every time you get a Google Maps route, every time Netflix recommends a movie, optimization algorithms (often descended from Simplex) are working behind the scenes.

Prerequisites

  • Linear-Programming — Problem formulation
  • Systems of linear equations
  • Basic linear algebra (matrix notation)
  • Geometry of convex polyhedra (vertices, edges, faces)
  • Farkas-Lemma — Foundation for understanding why it works

The Core Insight

The Geometric View

Imagine the feasible region as a multi-dimensional shape (a polytope) defined by the constraints:

  • In 2D: a polygon
  • In 3D: a polyhedron
  • In n dimensions: a polytope

Key insight #1: Whenever an optimal solution exists, one always lies at a vertex (corner) of this polytope.

Key insight #2: Two vertices are connected by an edge if their sets of tight constraints differ by exactly one constraint.

Key insight #3: Moving along an edge changes the objective function linearly. If we move to a better vertex, we keep going. If no adjacent vertex is better, we're at the optimum!

The Algebraic View

A linear program in standard form:

\[ \begin{aligned} \text{maximize} \quad & c^T x \\ \text{subject to} \quad & Ax = b \\ & x \geq 0 \end{aligned} \]

Where:

  • \(x \in \mathbb{R}^n\) is the vector of decision variables
  • \(A\) is an \(m \times n\) constraint matrix (with \(m < n\))
  • \(b \in \mathbb{R}^m\) is the right-hand side
  • \(c \in \mathbb{R}^n\) is the vector of objective coefficients

Basic variables: Choose \(m\) linearly independent columns of \(A\) to form a basis matrix \(B\). Set the remaining \(n-m\) (non-basic) variables to 0.

Basic feasible solution: Solve \(Bx_B = b\) for \(x_B\). If \(x_B \geq 0\), this is a vertex!

The pivot: Swap one basic variable with one non-basic variable → move to an adjacent vertex.
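The basis-to-vertex correspondence is easy to check by hand on a tiny system. Below is a minimal pure-Python sketch (the helper name `bfs_for_basis` is illustrative, not from any library) using the two-constraint system from the worked example later on this page: pick two columns as a basis, solve \(Bx_B = b\) via Cramer's rule, and accept the result only if \(x_B \geq 0\).

```python
# Toy illustration: extract a basic feasible solution from a basis choice.
# System: x1 + x2 + s1 = 4,  x1 - x2 + s2 = 2  (variables x1, x2, s1, s2 >= 0).
A = [[1.0, 1.0, 1.0, 0.0],
     [1.0, -1.0, 0.0, 1.0]]
b = [4.0, 2.0]

def bfs_for_basis(A, b, basis):
    """Solve B x_B = b for a 2x2 basis via Cramer's rule; None if not feasible."""
    i, j = basis
    B = [[A[0][i], A[0][j]], [A[1][i], A[1][j]]]
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    if det == 0:
        return None  # columns not independent: not a basis
    xi = (b[0] * B[1][1] - B[0][1] * b[1]) / det
    xj = (B[0][0] * b[1] - b[0] * B[1][0]) / det
    if xi < 0 or xj < 0:
        return None  # basic solution is infeasible: not a vertex
    x = [0.0, 0.0, 0.0, 0.0]
    x[i], x[j] = xi, xj
    return x

print(bfs_for_basis(A, b, (2, 3)))  # slack basis -> origin vertex [0, 0, 4, 2]
print(bfs_for_basis(A, b, (0, 1)))  # basis {x1, x2} -> vertex [3, 1, 0, 0]
```

Note that the slack basis corresponds to the origin, while the basis \(\{x_1, x_2\}\) corresponds to the vertex \((3, 1)\): each feasible basis is a vertex in disguise.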

Derivation: Why It Works

The Fundamental Theorem of Linear Programming

Theorem: If an optimal solution exists, there exists an optimal solution at a vertex of the feasible region.

Proof intuition:

1. The feasible region is convex (any line segment between feasible points stays feasible)
2. The objective function is linear (its level sets are hyperplanes)
3. Moving in a direction that improves the objective eventually hits a boundary
4. Once on a boundary, you can slide to a vertex without worsening the objective
5. Therefore, at least one vertex is optimal

This is why we only need to check vertices!

The Optimality Condition

At a vertex defined by basis \(B\), we can express everything in terms of basic variables:

\[ x_B = B^{-1}b - B^{-1}N x_N \]

The objective becomes:

\[ c^T x = c_B^T x_B + c_N^T x_N = c_B^T B^{-1}b + (c_N^T - c_B^T B^{-1}N) x_N \]

The term \((c_N^T - c_B^T B^{-1}N)\) is called the reduced cost.

Optimality: If all reduced costs are \(\leq 0\) (for maximization), no non-basic variable can improve the objective when brought into the basis. We're optimal!
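As a concrete check of the reduced-cost formula, here is a hand-rolled pure-Python sketch (the helper names `inv2` and `reduced_costs` are illustrative, not library functions) that evaluates \(c_N^T - c_B^T B^{-1}N\) at two different bases of the two-constraint example used later on this page.

```python
# Reduced costs r_N = c_N - c_B^T B^{-1} N for the example
# max 3x1 + 2x2  s.t.  x1 + x2 + s1 = 4,  x1 - x2 + s2 = 2.
def inv2(B):
    """Inverse of a 2x2 matrix (assumes det != 0)."""
    a, bb, c, d = B[0][0], B[0][1], B[1][0], B[1][1]
    det = a * d - bb * c
    return [[d / det, -bb / det], [-c / det, a / det]]

def reduced_costs(cB, cN, B, N):
    Binv = inv2(B)
    # y = c_B^T B^{-1} (the simplex multipliers)
    y = [cB[0] * Binv[0][0] + cB[1] * Binv[1][0],
         cB[0] * Binv[0][1] + cB[1] * Binv[1][1]]
    # r_j = c_j - y^T N_j for each non-basic column j
    return [cN[j] - (y[0] * N[0][j] + y[1] * N[1][j]) for j in range(len(cN))]

# At the slack basis (vertex (0,0)): both reduced costs positive -> not optimal.
print(reduced_costs([0, 0], [3, 2], [[1, 0], [0, 1]], [[1, 1], [1, -1]]))
# At basis {x1, x2} (vertex (3,1)): all reduced costs <= 0 -> optimal.
print(reduced_costs([3, 2], [0, 0], [[1, 1], [1, -1]], [[1, 0], [0, 1]]))
```

The second call returns \([-2.5, -0.5]\): both reduced costs are non-positive, matching the optimality certificate in the final tableau of the worked example.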

The Pivot Selection

If some reduced cost \(r_j > 0\), bringing variable \(x_j\) into the basis improves the objective. But how much can we increase it?

Compute the minimum ratio: For each basic variable \(x_i\), find how much we can increase \(x_j\) before \(x_i\) becomes negative.

\[ \theta = \min_{i} \left\{ \frac{(B^{-1}b)_i}{(B^{-1}A_j)_i} : (B^{-1}A_j)_i > 0 \right\} \]

The basic variable that hits zero first leaves the basis.
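In code, the ratio test is a short loop. A minimal sketch (the function name `min_ratio` is illustrative): it returns the leaving row and the step length \(\theta\), or None when no entry of the entering column is positive, which signals an unbounded problem.

```python
def min_ratio(rhs, col):
    """Ratio test: how far the entering variable can increase, and which
    basic row (leaving variable) blocks first. Returns None -> unbounded."""
    best = None
    for i, (bi, ai) in enumerate(zip(rhs, col)):
        if ai > 1e-12:          # only positive entries block the increase
            theta = bi / ai
            if best is None or theta < best[1]:
                best = (i, theta)
    return best  # (leaving row index, step length theta), or None

print(min_ratio([4.0, 2.0], [1.0, 1.0]))   # row 1 leaves, theta = 2
print(min_ratio([4.0, 2.0], [-1.0, 0.0]))  # None: unbounded direction
```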

The Algorithm: Step-by-Step

Step 1: Initialization

  • Convert to standard form (slack/surplus variables)
  • Find an initial basic feasible solution (often using Phase I or Big-M method)

Step 2: Check Optimality

  • Compute reduced costs: \(r_N = c_N^T - c_B^T B^{-1}N\)
  • If \(r_N \leq 0\) (max), STOP — optimal!
  • Otherwise, select entering variable: most positive reduced cost

Step 3: Check Unboundedness

  • If the entering column \(B^{-1}A_j\) has no positive entries, the problem is unbounded
  • You can increase the variable forever while staying feasible

Step 4: Pivot

  • Compute ratios to find the leaving variable
  • Perform row operations to pivot: entering column becomes a unit vector
  • Update basis
  • Return to Step 2
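The four steps above can be condensed into a short tableau implementation. This is a minimal teaching sketch, not production code: it assumes the input is \(Ax \leq b\) with \(b \geq 0\), so the slack basis is an initial BFS (skipping Phase I), uses Dantzig's most-negative entering rule, and takes no anti-cycling precautions.

```python
# A minimal tableau-form Simplex sketch for: max c^T x, s.t. Ax <= b, x >= 0,
# with b >= 0 (so the slack basis is an initial BFS and no Phase I is needed).
def simplex(c, A, b):
    m, n = len(A), len(c)
    # Build tableau rows [A | I | b]; z-row is [-c | 0 | 0].
    T = [A[i][:] + [1.0 if j == i else 0.0 for j in range(m)] + [b[i]]
         for i in range(m)]
    T.append([-cj for cj in c] + [0.0] * m + [0.0])
    basis = list(range(n, n + m))  # slack variables start basic
    while True:
        # Step 2: entering variable = most negative z-row coefficient.
        j = min(range(n + m), key=lambda k: T[m][k])
        if T[m][j] >= -1e-9:
            break  # all reduced costs non-negative: optimal
        # Step 3: no positive entry in the entering column -> unbounded.
        rows = [i for i in range(m) if T[i][j] > 1e-9]
        if not rows:
            raise ValueError("unbounded")
        # Step 4: ratio test picks the leaving row, then pivot.
        r = min(rows, key=lambda i: T[i][-1] / T[i][j])
        piv = T[r][j]
        T[r] = [v / piv for v in T[r]]
        for i in range(m + 1):
            if i != r and abs(T[i][j]) > 1e-12:
                f = T[i][j]
                T[i] = [a - f * br for a, br in zip(T[i], T[r])]
        basis[r] = j
    x = [0.0] * n
    for i, bi in enumerate(basis):
        if bi < n:
            x[bi] = T[i][-1]
    return x, T[m][-1]

x, z = simplex([3.0, 2.0], [[1.0, 1.0], [1.0, -1.0]], [4.0, 2.0])
print(x, z)  # the worked example below: x = [3.0, 1.0], z = 11.0
```

Running it on the worked example below reproduces the two pivots of the tableau walkthrough and reports the optimum \(x_1 = 3\), \(x_2 = 1\), \(z = 11\).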

Worked Example

Problem: Maximize \(z = 3x_1 + 2x_2\) subject to:

\[ \begin{aligned} x_1 + x_2 &\leq 4 \\ x_1 - x_2 &\leq 2 \\ x_1, x_2 &\geq 0 \end{aligned} \]

Step 1: Add slack variables \(s_1, s_2\):

\[ \begin{aligned} \text{maximize} \quad & z = 3x_1 + 2x_2 \\ \text{subject to} \quad & x_1 + x_2 + s_1 = 4 \\ & x_1 - x_2 + s_2 = 2 \\ & x_1, x_2, s_1, s_2 \geq 0 \end{aligned} \]

Initial Tableau:

\[ \begin{array}{c|ccccc|c} & x_1 & x_2 & s_1 & s_2 & z & \text{RHS} \\ \hline s_1 & 1 & 1 & 1 & 0 & 0 & 4 \\ s_2 & 1 & -1 & 0 & 1 & 0 & 2 \\ \hline z & -3 & -2 & 0 & 0 & 1 & 0 \end{array} \]

Iteration 1:

  • Most negative coefficient in \(z\)-row: \(x_1\) (coefficient \(-3\))
  • Ratios: \(4/1 = 4\), \(2/1 = 2\) → minimum is 2
  • Pivot on row 2, column \(x_1\)

After Pivoting:

\[ \begin{array}{c|ccccc|c} & x_1 & x_2 & s_1 & s_2 & z & \text{RHS} \\ \hline s_1 & 0 & 2 & 1 & -1 & 0 & 2 \\ x_1 & 1 & -1 & 0 & 1 & 0 & 2 \\ \hline z & 0 & -5 & 0 & 3 & 1 & 6 \end{array} \]

Iteration 2:

  • Most negative: \(x_2\) (coefficient \(-5\))
  • Ratio test: only row 1 has a positive entry in the \(x_2\) column (\(2/2 = 1\))
  • Pivot on row 1, column \(x_2\)

Final Tableau:

\[ \begin{array}{c|ccccc|c} & x_1 & x_2 & s_1 & s_2 & z & \text{RHS} \\ \hline x_2 & 0 & 1 & 0.5 & -0.5 & 0 & 1 \\ x_1 & 1 & 0 & 0.5 & 0.5 & 0 & 3 \\ \hline z & 0 & 0 & 2.5 & 0.5 & 1 & 11 \end{array} \]

Solution: All reduced costs in the \(z\)-row are non-negative. Optimal!

  • \(x_1 = 3\), \(x_2 = 1\), \(z = 11\)

Geometric interpretation: We moved from vertex (0,0) → (2,0) → (3,1), improving the objective at each step.
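Since the example is tiny, the Fundamental Theorem can be verified by brute force: enumerate every basis, keep the feasible ones (the vertices), and compare objectives. This enumeration is exponential in general, which is exactly the search Simplex avoids, but here it confirms the path above ends at the true optimum.

```python
# Brute-force sketch: enumerate all 2-column bases of the standard-form
# example, keep the feasible basic solutions (vertices), take the best.
from itertools import combinations

A = [[1.0, 1.0, 1.0, 0.0], [1.0, -1.0, 0.0, 1.0]]  # columns: x1, x2, s1, s2
b = [4.0, 2.0]
c = [3.0, 2.0, 0.0, 0.0]

best = None
for i, j in combinations(range(4), 2):
    B = [[A[0][i], A[0][j]], [A[1][i], A[1][j]]]
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    if abs(det) < 1e-12:
        continue  # columns not independent: no basis
    xi = (b[0] * B[1][1] - B[0][1] * b[1]) / det
    xj = (B[0][0] * b[1] - b[0] * B[1][0]) / det
    if xi < -1e-9 or xj < -1e-9:
        continue  # basic but infeasible: not a vertex
    z = c[i] * xi + c[j] * xj
    if best is None or z > best[0]:
        best = (z, {i: xi, j: xj})

print(best)  # objective 11.0 with x1 = 3.0, x2 = 1.0
```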

Understanding the Structure

Why vertices?

  • The feasible region is a convex polytope
  • The linear objective has level sets that are hyperplanes
  • Moving toward a better objective value eventually hits a boundary
  • Vertices are the boundary points where enough constraints are tight at once

Why edges?

  • Adjacent vertices differ by exactly one tight constraint
  • Moving along an edge means relaxing one constraint while tightening another
  • This corresponds to swapping one basic and one non-basic variable

Why the tableau?

  • Encodes the system \(Ax = b\) in a convenient form
  • Row operations correspond to Gaussian elimination
  • Each tableau represents a different vertex
  • The \(z\)-row tracks reduced costs (optimality conditions)

Key Properties

  • Finite termination: Visits at most \(\binom{n}{m}\) vertices (though usually far fewer)
  • Degeneracy: Can cycle in theory, but anti-cycling rules (Bland's rule) prevent this
  • Exponential worst-case: Klee-Minty cube shows Simplex can visit \(2^n - 1\) vertices
  • Practical efficiency: Typically \(O(m \log n)\) iterations for real problems
  • Duality: Simplex solves primal and dual simultaneously (seen in final tableau)

Common Issues

| Issue | Cause | Solution |
| --- | --- | --- |
| Degeneracy | Multiple bases describe the same vertex | Bland's rule or perturbation |
| Cycling | Infinite loop at a degenerate vertex | Smallest-subscript rule |
| Infeasible start | No obvious initial BFS | Phase I (auxiliary problem) |
| Unbounded | No constraint stops improvement | Check the entering column |

The Dual Connection

The Simplex Method actually solves both the primal and dual problems:

  • The \(z\)-row coefficients for the slack variables give the dual solution
  • When the primal is optimal, the dual is too (strong duality)
  • This is the foundation of sensitivity analysis—how much can parameters change without affecting optimality?
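A quick numeric check of strong duality on the worked example: read the dual values off the final tableau's \(z\)-row (the slack-column coefficients) and compare \(y^T b\) with \(c^T x\).

```python
# Strong-duality sketch for the worked example.
y = [2.5, 0.5]   # dual values: slack coefficients in the final z-row
b = [4.0, 2.0]   # right-hand sides of the two constraints
x = [3.0, 1.0]   # primal optimum from the final tableau
c = [3.0, 2.0]   # objective coefficients

dual_obj = sum(yi * bi for yi, bi in zip(y, b))    # 2.5*4 + 0.5*2
primal_obj = sum(ci * xi for ci, xi in zip(c, x))  # 3*3 + 2*1
print(dual_obj, primal_obj)  # both equal 11.0
```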

Modern Variants

Revised Simplex: Don't store the entire tableau; maintain a factorization of \(B\) (or \(B^{-1}\)) and update it at each pivot. More efficient for large, sparse problems.

Dual Simplex: Start with optimal but infeasible solution; restore feasibility. Great for adding new constraints.

Primal-Dual Methods: Maintain both primal and dual feasibility; converge to optimality.

Interior-Point Methods: Walk through the interior of the feasible region (Karmarkar's algorithm). Polynomial time, competitive with Simplex for large problems.

Related Topics

  • Linear-Programming — Problem formulation
  • Farkas-Lemma — Theoretical foundation
  • Duality Theory — Primal-dual relationships
  • Sensitivity Analysis — Post-optimality analysis
  • Network Flows — Specialized Simplex for graphs
  • Integer Programming — When variables must be whole numbers (much harder!)

References

  • Dantzig, G. B. (1947). "Maximization of a Linear Function of Variables Subject to Linear Inequalities." Activity Analysis of Production and Allocation.
  • Dantzig, G. B. (1963). Linear Programming and Extensions. Princeton University Press.
  • Klee, V., & Minty, G. J. (1972). "How Good is the Simplex Algorithm?" Inequalities III.
  • Khachiyan, L. G. (1979). "A Polynomial Algorithm in Linear Programming." Soviet Mathematics Doklady.
  • Karmarkar, N. (1984). "A New Polynomial-Time Algorithm for Linear Programming." Combinatorica.

Visual Guide

The following figures illustrate the geometric interpretation of the Simplex Method:

1. Feasible Region and Optimization Path

[Figure: 2D feasible region] The feasible region (green) with constraints as blue and red lines. The red path shows how Simplex moves from vertex (0,0) → (2,0) → (3,1), improving the objective at each step. The colored dashed lines represent level sets of the objective function \(z = 3x_1 + 2x_2\).

2. 3D Geometric Intuition

[Figure: 3D intuition] In higher dimensions, the feasible region becomes a polyhedron. The Simplex Method moves along edges (blue lines) from vertex to vertex. The red path shows the trajectory from the origin to the optimal vertex, always following the edges of the polyhedron.

3. Tableau Evolution

[Figure: Tableau evolution] The algebraic representation: how the tableau changes with each pivot operation. Green columns indicate entering variables, pink rows show the leaving variable (pivot row). Each tableau corresponds to a vertex in the geometric view.

4. Geometric vs Algebraic View

[Figure: Conceptual comparison] Side-by-side comparison showing how the geometric movement along edges corresponds exactly to algebraic basis changes. Moving from one vertex to an adjacent vertex = swapping one basic variable with one non-basic variable.