The Simplex Algorithm for Linear Programming

Author

Your Name

Published

January 30, 2025

Introduction

This lecture provides a high-level description of the Simplex Algorithm, a method used to solve linear programming problems (LPs). We will avoid delving too deeply into the technical details, focusing instead on the fundamental concepts.

Overview and Historical Context

The Simplex Algorithm, developed by George Dantzig in 1947, is a cornerstone algorithm in optimization, particularly for linear programming. It marked a significant advancement in Operations Research and provided a systematic approach to solving a wide array of problems that can be formulated linearly. Its enduring relevance is underscored by its continued use in practical applications and commercial solvers, despite the existence of theoretically faster algorithms.

Practical vs. Theoretical Performance

The Simplex Algorithm’s efficiency in practice contrasts with its theoretical worst-case exponential time complexity. Linear Programming problems are known to be solvable in polynomial time, meaning algorithms exist whose runtime is bounded by a polynomial function of the input size. However, the Simplex Algorithm, in contrived worst-case scenarios, can take exponential time. These worst-case instances are specifically designed and rarely occur in practical applications. This discrepancy explains why, for average-case scenarios and real-world problems, the Simplex Algorithm often outperforms theoretically superior polynomial-time algorithms in terms of speed and efficiency.

Geometric and Algebraic Interpretations

The Simplex Algorithm can be understood through two complementary lenses: geometric and algebraic.

Geometric Interpretation:

Geometrically, the Simplex Algorithm operates by traversing the vertices of the feasible region, which is a polyhedron. It starts at a vertex and iteratively moves to adjacent vertices that improve the objective function value. This process continues until an optimal vertex is reached, or it is determined that the problem is unbounded. The algorithm’s steps can be visualized as walking along the edges of the polyhedron.

Algebraic Interpretation:

Algebraically, the Simplex Algorithm manipulates the constraint matrix through a process called pivoting. This involves changing the basis of the system of equations that define the feasible region. The algorithm iteratively transforms the matrix to find a solution that optimizes the objective function. This manipulation involves updating columns of the matrix to reflect movements from one vertex to another in the feasible region. The algebraic operations are designed to mirror the geometric movement along the edges of the polyhedron.

We will explore both interpretations to provide a comprehensive understanding of the Simplex Algorithm. The geometric view offers intuition, while the algebraic view details the computational steps.

Linear Programming in Standard Form

Definition of Standard Form

For the Simplex Algorithm, we typically assume the linear programming problem is in standard form. A linear program in standard form is expressed as:

\[\begin{aligned} \text{Maximize } & c^T x \\ \text{subject to } & Ax = b \\ & x \geq 0 \end{aligned}\] where:

  • \(x\) is the vector of variables to be determined.

  • \(c\) is the vector of objective function coefficients.

  • \(A\) is the constraint matrix.

  • \(b\) is the vector of right-hand sides of the constraints.

  • The constraints \(Ax = b\) are equations.

  • The constraints \(x \geq 0\) are non-negativity constraints for all variables.

It is important to note that any linear programming problem, regardless of its initial form (including inequalities, free variables, etc.), can be converted into this standard form without loss of generality. This conversion typically involves introducing slack and surplus variables for inequality constraints and splitting free variables into differences of non-negative variables.
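For concreteness, here is a small, made-up example of such a conversion. Starting from the inequality-form problem

\[\begin{aligned} \text{Maximize } & 3x_1 + 2x_2 \\ \text{subject to } & x_1 + x_2 \leq 4 \\ & x_1 - x_2 \leq 2 \\ & x_1, x_2 \geq 0, \end{aligned}\]

introducing slack variables \(s_1, s_2 \geq 0\) turns the inequalities into equations and yields the standard form

\[\begin{aligned} \text{Maximize } & 3x_1 + 2x_2 \\ \text{subject to } & x_1 + x_2 + s_1 = 4 \\ & x_1 - x_2 + s_2 = 2 \\ & x_1, x_2, s_1, s_2 \geq 0, \end{aligned}\]

with \(d = 4\) variables and \(m = 2\) equality constraints.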

Assumptions on the Constraint Matrix

We make the following assumptions about the constraint matrix \(A\):

  • Let \(A\) be an \(m \times d\) matrix, where \(m\) is the number of constraints (equations) and \(d\) is the number of variables.

  • We assume that the rows of \(A\) are linearly independent. This implies that the rank of matrix \(A\) is equal to \(m\).

  • The assumption of linear independence is justified because if there are linearly dependent rows (equations), it means there is redundancy in the constraints.

    • If a row is linearly dependent on others, it is either redundant (if the right-hand side is consistent with the dependency) and can be removed, or the system is inconsistent (if the right-hand side is not consistent with the dependency), implying no solution exists.

    • If the rank of \(A\) is less than \(m\), either the system has no solution, or it contains redundant equations that can be eliminated without changing the feasible region.

  • For the rank of \(A\) to be \(m\), the number of columns \(d\) must be greater than or equal to the number of rows \(m\) (\(d \geq m\)). If \(d < m\), the rank of \(A\) can be at most \(d < m\), contradicting the assumption of full row rank \(m\).

Let \(n = d - m\). We can think of \(n\) as the number of ‘free’ variables after satisfying the \(m\) equality constraints. Thus, the number of variables \(d\) can be expressed as \(m + n\).

Variables, Constraints, and Dimensions

In summary, for a linear program in standard form \(\max \{c^T x : Ax = b, x \geq 0\}\) with an \(m \times d\) constraint matrix \(A\) of rank \(m\):

  • Number of constraints (equations): \(m\).

  • Number of variables: \(d = m + n\), where \(n = d - m\).

  • We assume rows of \(A\) are linearly independent, i.e., \(\text{rank}(A) = m\).

Geometric Properties of the Feasible Region

Feasible Region as Intersection of Polyhedra

The feasible region of a linear programming problem in standard form is defined by two sets of constraints: \(Ax = b\) and \(x \geq 0\). We can define these sets separately:

  • \(P_{eq} = \{x \in \mathbb{R}^d : Ax = b\}\): This set represents the solutions to the system of linear equations \(Ax = b\). It is an affine subspace of \(\mathbb{R}^d\). If \(b=0\), it is a linear subspace. For \(b \neq 0\), it is a translation of the null space of \(A\).

  • \(P_{\geq} = \{x \in \mathbb{R}^d : x \geq 0\}\): This set represents the non-negativity constraints. It is a polyhedral cone (specifically, the non-negative orthant in \(\mathbb{R}^d\)).

The feasible region \(P\) of the linear program is the intersection of these two sets: \[P = P_{eq} \cap P_{\geq} = \{x \in \mathbb{R}^d : Ax = b, x \geq 0\}\] Since both \(P_{eq}\) and \(P_{\geq}\) are polyhedra, their intersection \(P\) is also a polyhedron. Thus, the feasible region of a linear program is always a polyhedron.
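As a small computational aside, membership in \(P\) is easy to test numerically. The following sketch (Python with NumPy, using illustrative data rather than anything prescribed in the lecture) checks whether a given point satisfies \(Ax = b\) and \(x \geq 0\) up to a tolerance:

    import numpy as np

    def is_feasible(A, b, x, tol=1e-9):
        """Check membership in P = {x : Ax = b, x >= 0} up to a numerical tolerance."""
        satisfies_equalities = np.allclose(A @ x, b, atol=tol)
        satisfies_nonnegativity = np.all(x >= -tol)
        return satisfies_equalities and satisfies_nonnegativity

    # Illustrative data: 2 constraints, 4 variables (the standard-form example above).
    A = np.array([[1.0, 1.0, 1.0, 0.0],
                  [1.0, -1.0, 0.0, 1.0]])
    b = np.array([4.0, 2.0])
    print(is_feasible(A, b, np.array([3.0, 1.0, 0.0, 0.0])))  # True
    print(is_feasible(A, b, np.array([2.0, 2.0, 0.0, 0.0])))  # False: violates the second equation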

Vertices and Extreme Rays

The feasible region \(P\) is a polyhedron. Key geometric features of polyhedra are vertices, edges, and faces.

  • Vertices: Vertices are extreme points of the polyhedron. A polyhedron has vertices if and only if it does not contain a line (a set of the form \(\{x + \lambda r : \lambda \in \mathbb{R}\}\) for some point \(x\) and some direction \(r \neq 0\)).

  • Existence of Vertices in \(P_{\geq}\): The non-negativity constraint set \(P_{\geq} = \{x \in \mathbb{R}^d : x \geq 0\}\) has at least one vertex, namely the origin (\(x = 0\)). Furthermore, \(P_{\geq}\) does not contain any line: a line extends in both directions \(\pm r\) for some \(r \neq 0\), and moving far enough in one of these directions makes some coordinate negative.

  • Vertices in \(P = P_{eq} \cap P_{\geq}\): Since \(P_{\geq}\) contains no line, neither does the intersection \(P = P_{eq} \cap P_{\geq}\), and a non-empty polyhedron that contains no line has vertices. Therefore, the feasible region \(P\) of a linear program, if it is not empty, has vertices.

Dimension of the Solution Space

The dimension of the feasible region \(P\) can be determined.

  • Dimension of \(P_{eq}\): The set \(P_{eq} = \{x \in \mathbb{R}^d : Ax = b\}\) is the solution set to a system of \(m\) linearly independent equations in \(d\) variables. The dimension of this affine subspace is \(d - m = n\).

  • Dimension of \(P = P_{eq} \cap P_{\geq}\): Intersecting \(P_{eq}\) with the half-spaces defining \(P_{\geq}\) can cut the set down, creating facets, edges, and vertices, but it cannot increase the dimension beyond that of \(P_{eq}\). Therefore, the dimension of the feasible region \(P\) is at most \(n = d - m\); it is a polyhedron of dimension at most \(n\).

  • Full Dimension in Affine Subspace: The polyhedron \(P\) has dimension at most \(n\) as a subset of \(\mathbb{R}^d\). Within the affine subspace \(P_{eq}\) (which has dimension \(n\)), \(P\) is often full-dimensional, although the non-negativity constraints can also force it into a lower-dimensional set.

  • Analogy: Imagine a 2D square in a 3D space. The square is a 2D object (full dimension in 2D), but it is embedded in a 3D space. Similarly, our feasible region \(P\) of dimension at most \(n\) is embedded in \(\mathbb{R}^d\).

Optimality and Unboundedness

Unboundedness and Extreme Rays

When solving a linear programming problem, we can encounter three scenarios:

  1. Infeasibility: The feasible region \(P\) is empty. In this case, no solution exists that satisfies all constraints. The Simplex Algorithm can detect infeasibility.

  2. Unboundedness: The objective function can be increased indefinitely within the feasible region. In this case, there is no optimal solution, but rather solutions with arbitrarily large objective values.

  3. Optimality: There exists at least one optimal solution that maximizes (or minimizes) the objective function within the feasible region. There might be a unique optimal solution or multiple optimal solutions.

A key theorem related to unboundedness states:

A linear programming problem is unbounded if and only if there exists a vertex \(v\) of the feasible region \(P\) and an extreme ray \(r\) emanating from \(v\) such that moving along this ray increases the objective function value indefinitely. An extreme ray is essentially an edge of the polyhedron that extends to infinity.

An extreme ray \(r\) is a direction such that for any \(\lambda \geq 0\), \(v + \lambda r\) remains in the feasible region \(P\), and the objective function \(c^T (v + \lambda r) = c^T v + \lambda c^T r\) increases as \(\lambda \rightarrow \infty\) (i.e., \(c^T r > 0\) for maximization problems). In this case, \(r\) is called an improving extreme ray.
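As a small, made-up illustration of this criterion, consider
\[\text{Maximize } x_1 + x_2 \quad \text{subject to } x_1 - x_2 = 1, \; x \geq 0.\]
The vertex \(v = (1, 0)\) and the direction \(r = (1, 1)\) satisfy \(Ar = 0\), so \(v + \lambda r\) remains feasible for every \(\lambda \geq 0\), and \(c^T r = 2 > 0\), so the objective value \(c^T(v + \lambda r) = 1 + 2\lambda\) grows without bound: \(r\) is an improving extreme ray and the problem is unbounded.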

In the figure above, if \(x\) is a vertex and \(R\) represents an extreme ray, and if the objective function direction \(c\) is such that moving along \(R\) increases \(c^Tx\), then the problem is unbounded. The other edges emanating from \(x\) might lead to lower or equal objective values.

Existence of an Optimal Vertex Solution

A fundamental result in linear programming is that if an optimal solution exists, then there exists an optimal solution that is a vertex of the feasible region.

If a linear programming problem has an optimal solution, then it has an optimal solution that is a vertex of the feasible region \(P\).

Proof. Intuitive explanation. Suppose an optimal solution \(x^*\) exists, and let \(C_0 = c^T x^*\) be the optimal objective value. Consider the set of all optimal solutions \(F = \{x \in P : c^T x = C_0\}\). This set \(F\) is a face of the polyhedron \(P\) (possibly \(P\) itself, or a vertex, edge, etc.). Since \(P\) contains no line, neither does \(F\), so \(F\) (being non-empty) has vertices, and every vertex of \(F\) is also a vertex of \(P\). Since all points in \(F\) are optimal, any vertex of \(F\) is an optimal vertex of \(P\). ◻

In the left figure, the objective function \(c\) direction leads to a unique optimal vertex. In the right figure, with \(c\) pointing vertically, the entire top face \(F\) is optimal, and its vertices are also optimal vertices of \(P\).

This theorem is crucial because it reduces the search for an optimal solution from the entire feasible region (which can be continuous and infinite) to a finite set of vertices.

The Simplex Algorithm’s Search Approach

The Simplex Algorithm leverages the Optimal Vertex Theorem. It efficiently searches for an optimal solution by:

  1. Vertex Exploration: The algorithm explores the vertices of the feasible region \(P\).

  2. Iterative Movement: It starts at an initial feasible vertex and iteratively moves from one vertex to an adjacent vertex along an edge of the polyhedron.

  3. Objective Function Improvement: At each step, it chooses to move to an adjacent vertex that improves (for maximization) or at least does not worsen the objective function value.

  4. Optimality Check: At each vertex, it checks whether the vertex is optimal. A vertex is optimal if no adjacent vertex offers a better objective function value. This relies on convexity: for a linear objective over a convex feasible region, a vertex that is at least as good as all of its neighboring vertices is globally optimal.

  5. Unboundedness Detection: The algorithm can also detect if the problem is unbounded by identifying an improving extreme ray.

Geometrically, the Simplex Algorithm can be visualized as a walk on the vertices and edges of the polyhedron, always moving in a direction that improves the objective function, until an optimal vertex is found or unboundedness is detected.

Vertices and Basic Feasible Solutions

Facets and Non-negativity Constraints

Consider the feasible region \(P = \{x \in \mathbb{R}^d : Ax = b, x \geq 0\}\). The facets of this polyhedron are formed by the constraints. Specifically, the non-negativity constraints \(x_j \geq 0\) define some of the facets.

In an \(n\)-dimensional polyhedron, vertices are formed by the intersection of at least \(n\) facets. In our case, the dimension of \(P\) is at most \(n = d-m\).

Consider an example in \(\mathbb{R}^3\) (dimension 3). To define a vertex, we need at least 3 intersecting facets. For a polyhedron defined by non-negativity constraints, the facets are given by setting some variables to zero, i.e., \(x_j = 0\).

Consider a 3-dimensional polyhedron. Facets can correspond to constraints like \(x_1 \geq 0, x_2 \geq 0, x_3 \geq 0\), and possibly other linear inequality constraints. A vertex might be formed by the intersection of facets like \(x_1 = 0, x_2 = 0, x_3 = 0\), or \(x_1 = 0, x_2 = 0\), and another constraint becoming active as an equality.

In general, for our feasible region \(P\), every facet arises from one of the non-negativity constraints \(x_j \geq 0\) becoming tight (\(x_j = 0\)), since the equality constraints \(Ax = b\) hold everywhere on \(P\).

Characterizing Vertices by Zero Components

If \(v\) is a vertex of the feasible region \(P \subseteq \mathbb{R}^d\) of dimension \(N \leq d-m\), then at least \(N\) components of \(v\) must be zero. In fact, for our standard form LP, at least \(n = d-m\) components of a vertex must be zero.

For a polyhedron of dimension \(N\), a vertex is typically formed by the intersection of at least \(N\) hyperplanes defining its facets. In our case, some of these facets are given by the non-negativity constraints \(x_j \geq 0\), i.e., \(x_j = 0\) at the boundary.

Consider a 3-dimensional polyhedron (dimension \(N=3\)) in \(\mathbb{R}^9\) (\(d=9\)). For a vertex, we expect at least 3 facets to intersect. If these facets are from non-negativity constraints, it means at least 3 variables are zero at a vertex. In fact, in the context of standard form LP, it turns out that for a vertex, at least \(n = d-m = 9 - 6 = 3\) (if \(m=6\)) components must be zero.

Linear Independence of Columns at a Vertex

A more precise characterization of vertices involves the concept of linear independence of columns of the matrix \(A\).

Let \(v\) be a vertex of the feasible region \(P = \{x \in \mathbb{R}^d : Ax = b, x \geq 0\}\). Let \(J = \{j : v_j > 0\}\) be the set of indices corresponding to positive components of \(v\). Then the columns of \(A\) indexed by \(J\), i.e., \(\{A_j\}_{j \in J}\), are linearly independent.

Proof. We argue by contradiction. Suppose the columns \(\{A_j\}_{j \in J}\) are linearly dependent. Then there exist coefficients \(\{\lambda_j\}_{j \in J}\), not all zero, such that \(\sum_{j \in J} \lambda_j A_j = 0\). Define two vectors \(u, w \in \mathbb{R}^d\) by \[u_j = \begin{cases} \lambda_j & \text{if } j \in J, \\ 0 & \text{if } j \notin J, \end{cases} \qquad w = -u.\] Then \(\sum_{j=1}^d u_j A_j = \sum_{j \in J} \lambda_j A_j = 0\) and \(\sum_{j=1}^d w_j A_j = \sum_{j \in J} (-\lambda_j) A_j = 0\). Consider the vectors \(v^+ = v + \epsilon u\) and \(v^- = v + \epsilon w = v - \epsilon u\) for a small \(\epsilon > 0\). For \(j \notin J\), \(v_j = 0\) and \(u_j = w_j = 0\), so \(v^+_j = v^-_j = 0 \geq 0\). For \(j \in J\), \(v_j > 0\), so if \(\epsilon\) is small enough then \(v_j \pm \epsilon \lambda_j > 0\). Thus, for sufficiently small \(\epsilon > 0\), \(v^+ \geq 0\) and \(v^- \geq 0\). Also, \(A v^+ = A(v + \epsilon u) = Av + \epsilon Au = b + \epsilon \sum_{j=1}^d u_j A_j = b + \epsilon \cdot 0 = b\), and similarly \(A v^- = b\). Hence \(v^+ \in P\) and \(v^- \in P\). Furthermore, \(v = \frac{1}{2} v^+ + \frac{1}{2} v^-\), since \(v^+ + v^- = (v + \epsilon u) + (v - \epsilon u) = 2v\). Since \(u \neq 0\), we have \(v^+ \neq v^-\), so \(v\) is a convex combination of two distinct points of \(P\), which contradicts the assumption that \(v\) is a vertex. Therefore, the columns \(\{A_j\}_{j \in J}\) must be linearly independent. ◻
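A small numerical sketch of this criterion (Python with NumPy, illustrative data): collect the columns of \(A\) where \(v_j > 0\) and test whether they are linearly independent by computing the rank of the resulting submatrix.

    import numpy as np

    def positive_support_columns_independent(A, v, tol=1e-9):
        """Return True if the columns of A indexed by {j : v_j > 0} are linearly independent."""
        J = np.where(v > tol)[0]          # indices of strictly positive components
        if len(J) == 0:
            return True                   # an empty set of columns is trivially independent
        return np.linalg.matrix_rank(A[:, J]) == len(J)

    # Same illustrative A as in the earlier feasibility sketch.
    A = np.array([[1.0, 1.0, 1.0, 0.0],
                  [1.0, -1.0, 0.0, 1.0]])
    print(positive_support_columns_independent(A, np.array([3.0, 1.0, 0.0, 0.0])))  # True: a vertex candidate
    print(positive_support_columns_independent(A, np.array([2.0, 1.0, 1.0, 1.0])))  # False: 4 columns in R^2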

Basic Solutions: Degenerate and Non-degenerate

A solution \(x \in \mathbb{R}^d\) to \(Ax = b\) is called a basic solution if the columns of \(A\) corresponding to the non-zero components of \(x\) are linearly independent.

From the previous theorem, we know that every vertex of the feasible region is a basic solution; since a vertex also satisfies \(x \geq 0\), it is in fact a basic feasible solution.

A basic solution \(x \in \mathbb{R}^d\) is called non-degenerate if the number of non-zero components of \(x\) is exactly \(m\) (which is the rank of \(A\)).

A basic solution \(x \in \mathbb{R}^d\) is called degenerate if the number of non-zero components of \(x\) is less than \(m\).

If a basic solution is non-degenerate, the \(m\) linearly independent columns corresponding to its non-zero components form a basis for the column space of \(A\) (assuming rank\((A)=m\)).

Bijective Correspondence: Vertices and Basic Solutions

We have shown that every vertex is a basic feasible solution. The converse is also true:

Every basic feasible solution is a vertex of the feasible region \(P\).

Proof. Let \(v\) be a basic feasible solution. Suppose \(v = \alpha y + (1-\alpha) w\) for some \(y, w \in P\) and \(0 < \alpha < 1\). We want to show that \(y = w = v\). Let \(J = \{j : v_j = 0\}\). For any \(j \in J\), \(v_j = 0 = \alpha y_j + (1-\alpha) w_j\). Since \(\alpha > 0, 1-\alpha > 0\) and \(y_j \geq 0, w_j \geq 0\), it must be that \(y_j = 0\) and \(w_j = 0\) for all \(j \in J\). Thus, \(y_j = w_j = v_j = 0\) for all \(j \in J\). This means that non-zero components of \(v, y, w\) can only be in the index set \(B = \{j : v_j > 0\}\). From \(Ay = b\) and \(Aw = b\), we have \(\sum_{j=1}^d y_j A_j = b\) and \(\sum_{j=1}^d w_j A_j = b\). Subtracting these, we get \(\sum_{j=1}^d (y_j - w_j) A_j = 0\). Since \(y_j - w_j = 0\) for \(j \in J\), we have \(\sum_{j \notin J} (y_j - w_j) A_j = 0\), which is \(\sum_{j \in B} (y_j - w_j) A_j = 0\). Since \(v\) is a basic solution, the columns \(\{A_j\}_{j \in B}\) are linearly independent. Therefore, it must be that \(y_j - w_j = 0\) for all \(j \in B\). Thus, \(y_j = w_j\) for all \(j \in B\). Since we also have \(y_j = w_j = 0 = v_j\) for \(j \in J\), it follows that \(y_j = w_j = v_j\) for all \(j = 1, \dots, d\). Therefore, \(y = w = v\). This shows that \(v\) cannot be expressed as a convex combination of two distinct points in \(P\), so \(v\) must be a vertex of \(P\). ◻

Combining the two theorems, we have a bijective correspondence:

A vector \(v \in \mathbb{R}^d\) is a vertex of the feasible region \(P\) if and only if \(v\) is a basic feasible solution.

This result is fundamental because it links the geometric concept of vertices with the algebraic concept of basic feasible solutions. The Simplex Algorithm exploits this correspondence by searching for an optimal solution among the basic feasible solutions.

Algebraic Perspective of the Simplex Method

Basic Feasible Sets and their Costs

We have established that vertices of the feasible region correspond to basic feasible solutions. Now let’s consider the algebraic interpretation of moving between vertices and optimizing the objective function.

For a given basic feasible solution (vertex), we can associate it with a set of columns of the matrix \(A\) that are linearly independent and correspond to the non-zero components of the solution. We call such a set a basic set of columns.

Given a basic set of columns \(B\) (of indices), we can attempt to express the vector \(b\) as a linear combination of these columns with non-negative coefficients. If successful, this gives us a basic feasible solution.

Let \(B\) be a set of indices of columns of \(A\). Let \(A_B\) be the submatrix of \(A\) consisting of columns indexed by \(B\). If \(A_B\) is invertible (which implies \(|B| = m\) and the columns are linearly independent, forming a basis for the column space of \(A\)), then we can solve \(A_B x_B = b\) for \(x_B = A_B^{-1} b\). Setting \(x_j = 0\) for \(j \notin B\) gives a basic solution \(x\). If \(x_B \geq 0\), then \(x\) is a basic feasible solution, and thus a vertex.

The cost of a basic feasible solution \(x\) is given by \(c^T x = \sum_{j \in B} c_j x_j = c_B^T x_B = c_B^T A_B^{-1} b\), where \(c_B\) is the vector of cost coefficients corresponding to the basic columns.
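The construction above can be sketched directly in code (Python with NumPy, illustrative data): choose a basis \(B\), solve \(A_B x_B = b\), and, if \(x_B \geq 0\), report the basic feasible solution together with its cost \(c_B^T x_B\).

    import numpy as np

    def basic_solution_from_basis(A, b, c, B):
        """Given m column indices B with A_B invertible, return (x, cost),
        or (None, None) if the basic solution has a negative component (infeasible)."""
        x_B = np.linalg.solve(A[:, B], b)   # solve A_B x_B = b
        if np.any(x_B < -1e-9):
            return None, None               # basic but not feasible
        x = np.zeros(A.shape[1])
        x[B] = x_B                          # non-basic variables stay at zero
        return x, c @ x

    # Illustrative data: the standard-form example with slack variables s1, s2.
    A = np.array([[1.0, 1.0, 1.0, 0.0],
                  [1.0, -1.0, 0.0, 1.0]])
    b = np.array([4.0, 2.0])
    c = np.array([3.0, 2.0, 0.0, 0.0])
    print(basic_solution_from_basis(A, b, c, [2, 3]))  # slack basis: x = (0, 0, 4, 2), cost 0
    print(basic_solution_from_basis(A, b, c, [0, 1]))  # x = (3, 1, 0, 0), cost 11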

Moving Between Vertices: Pivoting

In the Simplex Algorithm, moving from one vertex to an adjacent vertex corresponds to changing the basic set of columns. This change is done through a process called pivoting.

Starting from a basic feasible solution associated with a basic set \(B\), we consider moving to an adjacent vertex by:

  1. Entering Variable: Choosing a non-basic variable (index \(j \notin B\)) to become basic (enter the basis). This choice is typically guided by the reduced cost, which we will discuss later.

  2. Leaving Variable: Determining a basic variable (index \(i \in B\)) to become non-basic (leave the basis). This is done to maintain feasibility and to move to an adjacent vertex.

The process of exchanging a basic variable with a non-basic variable is called a pivot operation. Algebraically, this corresponds to updating the basis \(B\) by removing one index and adding another.

Adjacency of Bases and Vertices

Two vertices are adjacent if they are connected by an edge of the polyhedron. Algebraically, adjacency of vertices is related to the bases associated with them.

For non-degenerate basic feasible solutions (vertices), two vertices are adjacent if their corresponding bases differ by exactly one column. That is, if \(B_1\) and \(B_2\) are bases corresponding to two adjacent vertices, then \(|B_1 \cap B_2| = m-1\). One column leaves the basis and another enters.

In the case of degenerate vertices, the situation can be more complex. Multiple bases can correspond to the same degenerate vertex, and moving from one basis to another might not necessarily lead to a different vertex, but it is still a step in the Simplex Algorithm.

The Simplex Algorithm: Detailed Steps

Bases, Non-bases, and Partitioning

At each iteration, the Simplex Algorithm works with a basis \(B\), which is a set of \(m\) linearly independent columns of \(A\). Let \(N\) be the set of indices of the non-basic columns, so \(B \cup N = \{1, 2, \dots, d\}\) and \(B \cap N = \emptyset\). We partition the matrix \(A\) into \([A_B | A_N]\), the variable vector \(x\) into \(\begin{pmatrix} x_B \\ x_N \end{pmatrix}\), and the cost vector \(c\) into \(\begin{pmatrix} c_B \\ c_N \end{pmatrix}\).

The system \(Ax = b\) can be written as \(A_B x_B + A_N x_N = b\). Since \(A_B\) is invertible, we can express \(x_B\) in terms of \(x_N\): \[x_B = A_B^{-1} b - A_B^{-1} A_N x_N.\] The objective function becomes: \[c^T x = c_B^T x_B + c_N^T x_N = c_B^T A_B^{-1} b + \left(c_N^T - c_B^T A_B^{-1} A_N\right) x_N.\] Let \(\bar{b} = A_B^{-1} b\) and \(R = A_B^{-1} A_N\). Then \(x_B = \bar{b} - R x_N\). Let \(\bar{c}_N^T = c_N^T - c_B^T A_B^{-1} A_N = c_N^T - c_B^T R\) be the vector of reduced costs for non-basic variables, and \(z_0 = c_B^T A_B^{-1} b = c_B^T \bar{b}\) be the current objective value. Then the objective function is \(c^T x = z_0 + \bar{c}_N^T x_N\).

Reduced Costs and Optimality Condition

The vector \(\bar{c}_N^T = c_N^T - c_B^T A_B^{-1} A_N\) is called the vector of reduced costs for the non-basic variables. The \(j\)-th component of \(\bar{c}_N\) (for \(j \in N\)) is given by \(\bar{c}_j = c_j - c_B^T A_B^{-1}A_j\).

For a maximization problem, a basic feasible solution \(x\) associated with basis \(B\) is optimal if all reduced costs for non-basic variables are non-positive, i.e., \(\bar{c}_j \leq 0\) for all \(j \in N\). (For a non-degenerate optimal solution this condition is also necessary; under degeneracy, an optimal vertex may have some basis with a positive reduced cost.)

If \(\bar{c}_j \leq 0\) for all \(j \in N\), increasing any non-basic variable \(x_j\) (from 0) will not increase the objective function value (it may decrease or stay the same). Thus, no improvement is possible, and the current basic feasible solution is optimal.
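Continuing the same illustrative setup, here is a sketch of computing \(\bar{b}\), the reduced costs, and the optimality test for a given basis (Python with NumPy):

    import numpy as np

    def reduced_costs(A, b, c, B):
        """For basis B, return (b_bar, c_bar, N): b_bar = A_B^{-1} b and the reduced costs
        of the non-basic columns N, computed as c_N^T - c_B^T A_B^{-1} A_N."""
        N = [j for j in range(A.shape[1]) if j not in B]   # non-basic indices
        A_B_inv = np.linalg.inv(A[:, B])
        b_bar = A_B_inv @ b
        y = c[B] @ A_B_inv                                  # simplex multipliers c_B^T A_B^{-1}
        c_bar = c[N] - y @ A[:, N]
        return b_bar, c_bar, N

    A = np.array([[1.0, 1.0, 1.0, 0.0],
                  [1.0, -1.0, 0.0, 1.0]])
    b = np.array([4.0, 2.0])
    c = np.array([3.0, 2.0, 0.0, 0.0])

    b_bar, c_bar, N = reduced_costs(A, b, c, [2, 3])   # slack basis
    print(c_bar)                   # [3. 2.]: positive reduced costs, so this basis is not optimal
    print(np.all(c_bar <= 1e-9))   # False: the optimality condition fails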

Entering and Leaving Variables (Pivoting)

If there exists some \(j \in N\) such that \(\bar{c}_j > 0\), then it is possible to increase the objective function by increasing the non-basic variable \(x_j\). We choose such a variable \(x_j\) to be the entering variable. A common rule is to choose the variable with the most positive reduced cost (largest \(\bar{c}_j\)).

Once we choose an entering variable \(x_j\) (index \(j = e\)), we need to determine the leaving variable. We increase \(x_e\) from 0 while keeping the other non-basic variables at 0 (\(x_k = 0\) for \(k \in N, k \neq e\)). Then \(x_B = \bar{b} - R x_N\) becomes \(x_B = \bar{b} - A_B^{-1} A_e x_e = \bar{b} - y_e x_e\), where \(y_e = A_B^{-1} A_e\) is the column of \(R = A_B^{-1} A_N\) corresponding to index \(e\). We need to ensure that \(x_B = \bar{b} - y_e x_e \geq 0\) and \(x_e \geq 0\). Since we are increasing \(x_e\) from 0, we need to find the maximum value of \(x_e\) such that \(x_B \geq 0\). For each \(i \in B\), we need \(x_i = \bar{b}_i - y_{ie} x_e \geq 0\).

  • If \(y_{ie} \leq 0\), then \(\bar{b}_i - y_{ie} x_e \geq 0\) for all \(x_e \geq 0\), since \(\bar{b}_i \geq 0\) for a feasible basis.

  • If \(y_{ie} > 0\), then we need \(x_e \leq \frac{\bar{b}_i}{y_{ie}}\).

We need to choose \(x_e\) such that \(x_e \leq \frac{\bar{b}_i}{y_{ie}}\) for all \(i \in B\) with \(y_{ie} > 0\). To maximize the increase in \(x_e\) while maintaining feasibility, we choose \[x_e^* = \min_{i \in B \,:\, y_{ie} > 0} \frac{\bar{b}_i}{y_{ie}},\] the so-called ratio test. If \(y_{ie} \leq 0\) for all \(i \in B\) and \(\bar{c}_e > 0\), then we can increase \(x_e\) indefinitely without violating feasibility, and the objective function grows without bound. In this case, the problem is unbounded.

If \(x_e^*\) is finite and positive, let \(r\) be an index in \(B\) that achieves the minimum ratio, i.e., \(x_e^* = \frac{\bar{b}_r}{y_{re}}\). Then the basic variable \(x_r\) becomes 0 when \(x_e = x_e^*\). We choose \(x_r\) (index \(r\)) as the leaving variable.
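The ratio test itself is a few lines of code. A sketch (Python with NumPy, continuing the illustrative example; the helper name is ours): given \(\bar{b}\) and the column \(y_e = A_B^{-1} A_e\) of the entering variable, it returns the largest feasible step and the position of the basic variable that leaves, or signals unboundedness.

    import numpy as np

    def ratio_test(b_bar, y_e, tol=1e-9):
        """Return (step, leaving_position), or (None, None) if x_e can grow without bound.
        leaving_position indexes into the current basis list."""
        candidates = [(b_bar[i] / y_e[i], i) for i in range(len(b_bar)) if y_e[i] > tol]
        if not candidates:
            return None, None            # every y_ie <= 0: the problem is unbounded
        step, pos = min(candidates)      # the smallest ratio determines the leaving variable
        return step, pos

    A = np.array([[1.0, 1.0, 1.0, 0.0],
                  [1.0, -1.0, 0.0, 1.0]])
    b_bar = np.array([4.0, 2.0])   # A_B^{-1} b for the slack basis B = [2, 3] (A_B = I)
    y_e = A[:, 0]                  # A_B^{-1} A_e for entering variable x_1 (column index 0)
    print(ratio_test(b_bar, y_e))  # (2.0, 1): x_e rises to 2 and the basic variable in position 1 leaves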

Basis Update and Iteration

After determining the entering variable \(x_e\) (index \(e\)) and the leaving variable \(x_r\) (index \(r\)), we update the basis \(B\) by removing \(r\) and adding \(e\). The new basis is \(B' = (B \setminus \{r\}) \cup \{e\}\). The new non-basis is \(N' = (N \setminus \{e\}) \cup \{r\}\).

We then repeat the process with the new basis \(B'\):

  1. Calculate \(A_{B'}^{-1}\).

  2. Calculate \(\bar{b}' = A_{B'}^{-1} b\).

  3. Calculate reduced costs \(\bar{c}_{N'}^T = c_{N'}^T - c_{B'}^T A_{B'}^{-1} A_{N'}\).

  4. Check optimality: If all \(\bar{c}_j \leq 0\) for \(j \in N'\), then the current solution is optimal. Stop.

  5. If there exists \(\bar{c}_e > 0\) for some \(e \in N'\), choose an entering variable \(x_e\).

  6. Determine the leaving variable \(x_r\) by ratio test.

  7. Update the basis and repeat.

This iterative process continues until the optimality condition is met or unboundedness is detected.
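Putting the pieces together, here is a compact, illustrative sketch of one possible implementation of this loop (Python with NumPy). It is a teaching sketch, not a production solver: it assumes a starting feasible basis is already known (finding one is the Phase I question raised in the closing section), uses Bland's rule to pick the entering and leaving variables, and recomputes \(A_B^{-1}\) from scratch each iteration for clarity rather than efficiency.

    import numpy as np

    def simplex(A, b, c, basis, tol=1e-9, max_iters=1000):
        """Maximize c^T x subject to Ax = b, x >= 0, starting from a feasible basis
        (a list of m column indices). Returns ('optimal', x) or ('unbounded', None)."""
        m, d = A.shape
        basis = list(basis)
        for _ in range(max_iters):
            A_B_inv = np.linalg.inv(A[:, basis])
            b_bar = A_B_inv @ b                                # current values of the basic variables
            y = c[basis] @ A_B_inv                             # simplex multipliers c_B^T A_B^{-1}
            nonbasic = [j for j in range(d) if j not in basis]
            c_bar = {j: c[j] - y @ A[:, j] for j in nonbasic}  # reduced costs

            improving = [j for j in nonbasic if c_bar[j] > tol]
            if not improving:                                  # optimality condition met
                x = np.zeros(d)
                x[basis] = b_bar
                return 'optimal', x

            e = min(improving)                                 # Bland's rule: smallest eligible index enters
            y_e = A_B_inv @ A[:, e]

            # Ratio test: how far can x_e increase before a basic variable hits zero?
            candidates = [(b_bar[i] / y_e[i], basis[i], i) for i in range(m) if y_e[i] > tol]
            if not candidates:
                return 'unbounded', None                       # improving extreme ray found
            _, _, pos = min(candidates)                        # ties broken by smallest leaving index
            basis[pos] = e                                     # pivot: entering column replaces leaving one
        raise RuntimeError('iteration limit reached')

    # Illustrative run on the earlier example, starting from the slack basis.
    A = np.array([[1.0, 1.0, 1.0, 0.0],
                  [1.0, -1.0, 0.0, 1.0]])
    b = np.array([4.0, 2.0])
    c = np.array([3.0, 2.0, 0.0, 0.0])
    status, x = simplex(A, b, c, basis=[2, 3])
    print(status, x, c @ x)   # optimal, x = (3, 1, 0, 0), objective value 11

Each pass through the loop mirrors steps 1–7 above: it forms \(A_B^{-1}\) and \(\bar{b}\), computes the reduced costs, checks optimality, and otherwise performs the ratio test and pivots.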

Degeneracy and Cycling Issues

Degeneracy occurs when a basic feasible solution has more than \(d-m = n\) components equal to zero, or equivalently, fewer than \(m\) positive components. In terms of basic solutions, degeneracy happens when one or more basic variables are zero in a basic feasible solution.
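A tiny made-up example of a degenerate basic feasible solution: take \[A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 1 \end{pmatrix},\] so \(m = 2\) and \(d = 3\). The basis \(\{1, 2\}\) gives the basic feasible solution \(x = (0, 1, 0)\), which has only one positive component (fewer than \(m = 2\)) and is therefore degenerate; the basis \(\{2, 3\}\) yields the very same point, illustrating how several bases can correspond to a single degenerate vertex.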

In degenerate cases, it is possible that in a pivot step, the minimum ratio in the ratio test is zero, i.e., \(x_e^* = 0\). In this case, the objective function value does not improve in this iteration, and we might cycle back to a basis we have visited before, leading to an infinite loop.

To prevent cycling, various anti-cycling rules have been developed, such as Bland’s rule (choose the entering variable with the smallest index among those with positive reduced cost, and similarly for the leaving variable). These rules ensure that the Simplex Algorithm terminates in a finite number of steps, even in the presence of degeneracy.

Conclusion

In this lecture, we have provided an overview of the Simplex Algorithm, a fundamental method for solving linear programming problems. We explored both the geometric and algebraic interpretations of the algorithm.

Key Takeaways:

  • The Simplex Algorithm searches for an optimal solution by moving between vertices of the feasible region.

  • Vertices correspond to basic feasible solutions, which are characterized by linearly independent columns of the constraint matrix.

  • The algorithm iteratively improves the objective function by pivoting, i.e., changing the basis by exchanging entering and leaving variables.

  • Optimality is checked using reduced costs. If all reduced costs for non-basic variables are non-positive, the current solution is optimal.

  • Unboundedness is detected when it is possible to increase an entering variable indefinitely without violating feasibility.

  • Degeneracy can lead to cycling, but anti-cycling rules can prevent this.

The Simplex Algorithm, despite its theoretical exponential worst-case complexity, is highly effective in practice for solving linear programming problems. It remains a cornerstone algorithm in optimization and is widely used in various applications.

Further Questions and Topics:

  • How to find an initial basic feasible solution to start the Simplex Algorithm (Phase I)?

  • Detailed discussion of anti-cycling rules (e.g., Bland’s rule, lexicographic rule).

  • Computational efficiency and complexity of the Simplex Algorithm.

  • Implementation details and practical considerations for the Simplex Algorithm.

  • Alternative algorithms for linear programming, such as interior-point methods.

This lecture aimed to provide a comprehensive, yet accessible, introduction to the Simplex Algorithm. Further study and practice are recommended to gain a deeper understanding and proficiency in applying this powerful optimization technique.