Lecture Notes on Linear Programming

Author

Your Name

Published

January 30, 2025

Introduction

In this lecture, we will examine the canonical and standard forms of linear programming problems. We will discuss the transformations between these forms and introduce the Simplex Algorithm, a fundamental method for solving linear programs. Key concepts include problem formulation, basic and non-basic variables, system reformulation using matrix algebra, and the connection to echelon forms in linear systems. This lecture lays the groundwork for understanding the algorithmic steps of the Simplex method in subsequent discussions.

Linear Programming Formats

Canonical Form

A linear programming problem in canonical form is expressed as: \[\begin{aligned} \text{maximize } & \mathbf{c}^T \mathbf{x}\\ \text{subject to } & \mathbf{A}\mathbf{x}\leq \mathbf{b}\\ & \mathbf{x}\geq \mathbf{0} \end{aligned}\] where \(\mathbf{A}\) is an \(m \times n\) matrix, \(\mathbf{x}\in \mathbb{R}^n\) is the variable vector, \(\mathbf{b}\in \mathbb{R}^m\) is the constraint vector, and \(\mathbf{c}\in \mathbb{R}^n\) is the cost vector. All constraints are inequalities of the form "less than or equal to," and all variables are non-negative.


Standard Form

A linear programming problem in standard form is expressed as: \[\begin{aligned} \text{maximize } & \mathbf{c}^T \mathbf{x}\\ \text{subject to } & \mathbf{A}\mathbf{x}= \mathbf{b}\\ & \mathbf{x}\geq \mathbf{0} \end{aligned}\] where \(\mathbf{A}\) is an \(m \times n\) matrix, \(\mathbf{x}\in \mathbb{R}^n\) is the variable vector, \(\mathbf{b}\in \mathbb{R}^m\) is the constraint vector, and \(\mathbf{c}\in \mathbb{R}^n\) is the cost vector. All constraints are equalities, and all variables are non-negative.


Conversion Between Forms

Standard to Canonical

To convert a linear program from standard form to canonical form, we replace each equality constraint with two inequality constraints. Specifically, an equation \(a_i^T \mathbf{x}= b_i\) is equivalent to the pair of inequalities \(a_i^T \mathbf{x}\leq b_i\) and \(a_i^T \mathbf{x}\geq b_i\). The latter inequality can be rewritten as \(-a_i^T \mathbf{x}\leq -b_i\). Thus, each equality constraint in the standard form is replaced by these two inequalities to achieve canonical form.
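As a concrete illustration, here is a minimal NumPy sketch of this stacking (the matrix and vector are made-up toy data): each equality row appears once as-is and once negated.

```python
import numpy as np

# Standard-form data: A x = b, x >= 0 (toy example, 2 constraints, 3 variables)
A = np.array([[1.0, 2.0, 1.0],
              [3.0, 0.0, 1.0]])
b = np.array([4.0, 5.0])

# Each equality a_i^T x = b_i becomes a_i^T x <= b_i and -a_i^T x <= -b_i.
A_canon = np.vstack([A, -A])
b_canon = np.concatenate([b, -b])

print(A_canon)  # 4 x 3: the original rows followed by their negations
print(b_canon)  # [ 4.  5. -4. -5.]
```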

Canonical to Standard

To convert a linear program from canonical form to standard form, we must transform inequality constraints into equality constraints. For each inequality constraint of the form \(a_i^T \mathbf{x}\leq b_i\), we introduce a non-negative slack variable \(s_i \geq 0\) such that \(a_i^T \mathbf{x}+ s_i = b_i\). This slack variable represents the "slack" or difference between the left-hand side and the right-hand side of the inequality. By adding a slack variable to each inequality, we convert each inequality constraint into an equality constraint, thus transforming the problem into standard form. The slack variables are included in the variable vector and have a coefficient of zero in the objective function, as they do not contribute to the maximization or minimization goal.

A slack variable is a non-negative variable, denoted as \(s_i \geq 0\), introduced to transform an inequality constraint of the form \(a_i^T \mathbf{x}\leq b_i\) into an equality constraint \(a_i^T \mathbf{x}+ s_i = b_i\). This variable represents the difference between the left-hand side and the right-hand side of the inequality.
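The same idea in code: a minimal sketch, assuming a toy canonical-form instance, that appends one slack column per constraint and pads the cost vector with zeros.

```python
import numpy as np

# Canonical-form data: maximize c^T x s.t. A x <= b, x >= 0 (toy example)
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])
b = np.array([4.0, 5.0])
c = np.array([1.0, 1.0])

m, n = A.shape
# Append one slack column per constraint: a_i^T x + s_i = b_i, s_i >= 0.
A_std = np.hstack([A, np.eye(m)])
# Slack variables cost nothing in the objective.
c_std = np.concatenate([c, np.zeros(m)])

print(A_std)  # [[1. 2. 1. 0.], [3. 1. 0. 1.]]
print(c_std)  # [1. 1. 0. 0.]
```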

Solving Linear Programming Problems

The Simplex Algorithm

Historical Context and Practical Efficiency

The Simplex Algorithm, developed by George Dantzig in 1947, is a cornerstone method for solving linear programming problems. Although it is decades old and has been surpassed in worst-case complexity by newer algorithms (such as polynomial-time interior-point methods), it remains among the most efficient methods in practice for a wide range of linear programs.

Empirical Superiority vs. Theoretical Limitations

While the Simplex Algorithm is not theoretically optimal in terms of worst-case complexity, its practical performance is exceptional: it is simple and fast on typical real-world problems. Although instances exist on which it takes exponential time (the Klee–Minty cube is the classic construction), such cases are rarely encountered in practical applications.

Average-Case Complexity and Practical Observations

The average-case complexity of the Simplex Algorithm is observed to be polynomial, typically of low degree in the number of constraints (\(m\)) and variables (\(n\)). A rigorous theoretical average-case analysis is difficult because it requires probabilistic assumptions on the input distribution, but experiments and practical experience suggest a running time around \(O(m^p n^q)\), where \(p\) and \(q\) are small integers, often at most 3. This typical polynomial behavior explains the algorithm's continued dominance in practical linear programming.

The Simplex Algorithm, while not theoretically optimal in all cases, exhibits polynomial average-case complexity, often around \(O(m^p n^q)\) where \(p\) and \(q\) are small integers. This explains its continued dominance in practical linear programming despite its exponential worst-case complexity.

Problem Setup in Standard Form

We consider a linear programming problem in standard form: \[\begin{aligned} \text{maximize } & \mathbf{c}^T \mathbf{x}\\ \text{subject to } & \mathbf{A}\mathbf{x}= \mathbf{b}\\ & \mathbf{x}\geq \mathbf{0} \end{aligned}\] where \(\mathbf{x}\in \mathbb{R}^D\) is the vector of variables, \(\mathbf{A}\) is an \(M \times D\) constraint matrix, \(\mathbf{b}\in \mathbb{R}^M\) is the vector of constants, and \(\mathbf{c}\in \mathbb{R}^D\) is the cost vector. We write \(D = M + N\) with \(N \geq 0\); as we will see, \(N\) is the number of non-basic variables.

Assumption of Full Row Rank

Without loss of generality, we assume that the constraint matrix \(\mathbf{A}\) has full row rank, i.e., rank(\(\mathbf{A}\)) = \(M\). This implies that the \(M\) rows of \(\mathbf{A}\), corresponding to the \(M\) equality constraints, are linearly independent.

Justification for Full Row Rank Assumption

Assuming full row rank is not restrictive because any linear dependencies in the constraints either lead to redundant equations, which can be eliminated without altering the feasible region, or to inconsistencies, indicating no solution exists. Techniques like Gaussian elimination can be used to reduce the system \(\mathbf{A}\mathbf{x}= \mathbf{b}\) to a form where the constraint matrix has full row rank, provided the system is consistent. For \(\mathbf{A}\) to have full row rank \(M\), it is necessary that the number of columns \(D\) is at least as large as the number of rows \(M\) (\(D \geq M\)).

The assumption that the constraint matrix \(\mathbf{A}\) has full row rank is not restrictive. Any linear dependencies in the constraints either lead to redundant equations (which can be eliminated) or to inconsistencies (indicating no solution exists).
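A quick numerical check of this reasoning (toy data; `numpy.linalg.matrix_rank` is used as the rank test): a dependent row is redundant exactly when the augmented matrix \([\mathbf{A}\mid\mathbf{b}]\) has the same rank as \(\mathbf{A}\), and inconsistent otherwise.

```python
import numpy as np

# Toy system in which the third row is the sum of the first two.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])
b = np.array([2.0, 3.0, 5.0])

print(np.linalg.matrix_rank(A))  # 2 < 3 rows, so A lacks full row rank

# The dependent row is redundant iff the system stays consistent, i.e. the
# augmented matrix [A | b] has the same rank as A.
aug_rank = np.linalg.matrix_rank(np.column_stack([A, b]))
print(aug_rank)  # 2: consistent, so the third equation can be dropped
                 # (with b = [2, 3, 6] the rank would jump to 3: inconsistent)
```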

Basis and Partitioning

Definition of a Basis

A basis \(\mathcal{B}\) is a set of \(M\) indices of columns of \(\mathbf{A}\) such that the columns of \(\mathbf{A}\) indexed by \(\mathcal{B}\) are linearly independent. Let \(\mathcal{B} = \{B_1, B_2, \dots, B_M\}\) be the set of basis indices. The set of vectors \(\{\mathbf{A}_{B_1}, \mathbf{A}_{B_2}, \dots, \mathbf{A}_{B_M}\}\) forms a basis for the column space of \(\mathbf{A}\) (and for \(\mathbb{R}^M\), since \(\mathbf{A}\) has full row rank). The ordering of the indices within \(\mathcal{B}\) matters for implementation, but not for the conceptual development at this stage.

A basis \(\mathcal{B}\) is a set of \(M\) indices of columns of \(\mathbf{A}\) such that the columns of \(\mathbf{A}\) indexed by \(\mathcal{B}\) are linearly independent.

Basic and Non-Basic Variables

Given a basis \(\mathcal{B}\), we classify the variables \(\mathbf{x}\) into basic variables \(\mathbf{x}_B\) and non-basic variables \(\mathbf{x}_N\). The basic variables \(\mathbf{x}_B\) correspond to the indices in \(\mathcal{B}\), i.e., \(\mathbf{x}_B= (x_j)_{j \in \mathcal{B}}\). The non-basic variables \(\mathbf{x}_N\) correspond to the remaining indices \(\mathcal{N} = \{1, 2, \dots, D\} \setminus \mathcal{B}\), i.e., \(\mathbf{x}_N= (x_j)_{j \in \mathcal{N}}\).

Given a basis \(\mathcal{B}\), basic variables \(\mathbf{x}_B\) correspond to the indices in \(\mathcal{B}\), and non-basic variables \(\mathbf{x}_N\) correspond to the remaining indices.

Partitioning \(\mathbf{A}\), \(\mathbf{x}\), and \(\mathbf{c}\)

Corresponding to the partition of variables, we partition the matrix \(\mathbf{A}\) and vectors \(\mathbf{x}\) and \(\mathbf{c}\):

  • \(\mathbf{A}_B\): \(M \times M\) matrix formed by the columns of \(\mathbf{A}\) with indices in \(\mathcal{B}\).

  • \(\mathbf{A}_N\): \(M \times N\) matrix formed by the columns of \(\mathbf{A}\) with indices in \(\mathcal{N}\).

  • \(\mathbf{x}_B\): Vector of basic variables.

  • \(\mathbf{x}_N\): Vector of non-basic variables.

  • \(\mathbf{c}_B\): Vector of cost coefficients corresponding to basic variables.

  • \(\mathbf{c}_N\): Vector of cost coefficients corresponding to non-basic variables.

The system of equations \(\mathbf{A}\mathbf{x}= \mathbf{b}\) can then be written as: \[\mathbf{A}_B\mathbf{x}_B+ \mathbf{A}_N\mathbf{x}_N= \mathbf{b}\]
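A minimal indexing sketch of this partition (all data here is a made-up toy instance; indices are 0-based, and the basis is chosen by hand purely for illustration):

```python
import numpy as np

# Toy standard-form data with M = 2 constraints and D = 4 variables.
A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 1.0, 0.0, 1.0]])
c = np.array([3.0, 2.0, 0.0, 0.0])

B = [2, 3]                                    # basis indices (0-based)
N = [j for j in range(A.shape[1]) if j not in B]

A_B, A_N = A[:, B], A[:, N]                   # M x M and M x N blocks
c_B, c_N = c[B], c[N]
print(A_B)  # the identity here, since columns 2 and 3 are slack columns
```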

Invertibility of the Basis Matrix \(\mathbf{A}_B\)

Since the columns of \(\mathbf{A}_B\) are linearly independent and \(\mathbf{A}_B\) is an \(M \times M\) matrix, \(\mathbf{A}_B\) is invertible. This invertibility is crucial for expressing basic variables in terms of non-basic variables.

The basis matrix \(\mathbf{A}_B\) is invertible because its columns are linearly independent and it is an \(M \times M\) matrix.

Derivation of the Reduced System

Premultiplying the equation \(\mathbf{A}_B\mathbf{x}_B+ \mathbf{A}_N\mathbf{x}_N= \mathbf{b}\) by \(\mathbf{A}_B^{-1}\), we obtain: \[\mathbf{A}_B^{-1}(\mathbf{A}_B\mathbf{x}_B+ \mathbf{A}_N\mathbf{x}_N) = \mathbf{A}_B^{-1}\mathbf{b}\] \[\mathbf{A}_B^{-1}\mathbf{A}_B\mathbf{x}_B+ \mathbf{A}_B^{-1}\mathbf{A}_N\mathbf{x}_N= \mathbf{A}_B^{-1}\mathbf{b}\] \[\mathbf{I} \mathbf{x}_B+ \mathbf{A}_B^{-1}\mathbf{A}_N\mathbf{x}_N= \mathbf{A}_B^{-1}\mathbf{b}\] \[\mathbf{x}_B+ \mathbf{A}_B^{-1}\mathbf{A}_N\mathbf{x}_N= \mathbf{A}_B^{-1}\mathbf{b}\]

System Reformulation and Reduced Costs

Introducing \(\bar{\mathbf{A}}\) and \(\bar{\mathbf{b}}\)

Let \(\bar{\mathbf{A}}= \mathbf{A}_B^{-1}\mathbf{A}_N\) and \(\bar{\mathbf{b}}= \mathbf{A}_B^{-1}\mathbf{b}\). The system equation simplifies to: \[\mathbf{x}_B+ \bar{\mathbf{A}}\mathbf{x}_N= \bar{\mathbf{b}}\] which allows us to express basic variables in terms of non-basic variables: \[\mathbf{x}_B= \bar{\mathbf{b}}- \bar{\mathbf{A}}\mathbf{x}_N\]
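In code one would typically obtain \(\bar{\mathbf{A}}\) and \(\bar{\mathbf{b}}\) from linear solves rather than by forming \(\mathbf{A}_B^{-1}\) explicitly, which is cheaper and numerically safer. A minimal sketch on made-up data:

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 5.0])
B, N = [0, 1], [2, 3]                 # basis chosen by hand for illustration

A_B, A_N = A[:, B], A[:, N]
Abar = np.linalg.solve(A_B, A_N)      # A_B^{-1} A_N without an explicit inverse
bbar = np.linalg.solve(A_B, b)        # A_B^{-1} b

x_N = np.zeros(len(N))                # e.g. set the non-basic variables to zero
x_B = bbar - Abar @ x_N               # then x_B = bbar = [1.2, 1.4]
print(x_B)
```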

Objective Function in Terms of Non-Basic Variables

Substitute \(\mathbf{x}_B= \bar{\mathbf{b}}- \bar{\mathbf{A}}\mathbf{x}_N\) into the objective function \(Z = \mathbf{c}^T \mathbf{x}= \mathbf{c}_B^T \mathbf{x}_B+ \mathbf{c}_N^T \mathbf{x}_N\): \[\begin{aligned} Z &= \mathbf{c}_B^T (\bar{\mathbf{b}}- \bar{\mathbf{A}}\mathbf{x}_N) + \mathbf{c}_N^T \mathbf{x}_N\\ &= \mathbf{c}_B^T \bar{\mathbf{b}}- \mathbf{c}_B^T \bar{\mathbf{A}}\mathbf{x}_N+ \mathbf{c}_N^T \mathbf{x}_N\\ &= \mathbf{c}_B^T \bar{\mathbf{b}}+ (\mathbf{c}_N^T - \mathbf{c}_B^T \bar{\mathbf{A}}) \mathbf{x}_N \end{aligned}\]

Defining Reduced Cost Vector \(\bar{\mathbf{c}}_N\)

Define the reduced cost vector \(\bar{\mathbf{c}}_N^T\) for non-basic variables as: \[\bar{\mathbf{c}}_N^T = \mathbf{c}_N^T - \mathbf{c}_B^T \bar{\mathbf{A}}= \mathbf{c}_N^T - \mathbf{c}_B^T \mathbf{A}_B^{-1}\mathbf{A}_N\] The objective function now becomes: \[Z = \mathbf{c}_B^T \bar{\mathbf{b}}+ \bar{\mathbf{c}}_N^T \mathbf{x}_N\] The term \(\mathbf{c}_B^T \bar{\mathbf{b}}\) is a constant with respect to \(\mathbf{x}_N\).

The reduced cost vector \(\bar{\mathbf{c}}_N^T\) for non-basic variables is defined as: \[\bar{\mathbf{c}}_N^T = \mathbf{c}_N^T - \mathbf{c}_B^T \bar{\mathbf{A}}= \mathbf{c}_N^T - \mathbf{c}_B^T \mathbf{A}_B^{-1}\mathbf{A}_N\]
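A sketch of the reduced-cost computation on the same toy data; the intermediate vector `y` solves \(\mathbf{A}_B^T \mathbf{y} = \mathbf{c}_B\), so that \(\mathbf{c}_B^T \mathbf{A}_B^{-1}\mathbf{A}_N= \mathbf{y}^T \mathbf{A}_N\) (this two-solve arrangement is an implementation choice, not part of the definition):

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 1.0, 0.0, 1.0]])
c = np.array([3.0, 2.0, 0.0, 0.0])
B, N = [0, 1], [2, 3]

A_B, A_N = A[:, B], A[:, N]
c_B, c_N = c[B], c[N]

y = np.linalg.solve(A_B.T, c_B)   # y = A_B^{-T} c_B
cbar_N = c_N - A_N.T @ y          # cbar_N^T = c_N^T - c_B^T A_B^{-1} A_N
print(cbar_N)                     # [-0.6 -0.8]: no non-basic variable improves Z
```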

Canonical Form Subproblem

The original maximization problem is now reformulated as: \[\begin{aligned} \text{maximize } & Z = \mathbf{c}_B^T \bar{\mathbf{b}}+ \bar{\mathbf{c}}_N^T \mathbf{x}_N\\ \text{subject to } & \mathbf{x}_B= \bar{\mathbf{b}}- \bar{\mathbf{A}}\mathbf{x}_N\geq \mathbf{0} \\ & \mathbf{x}_N\geq \mathbf{0} \end{aligned}\] This is equivalent to the canonical form subproblem in terms of \(\mathbf{x}_N\): \[\begin{aligned} \text{maximize } & \bar{\mathbf{c}}_N^T \mathbf{x}_N\\ \text{subject to } & \bar{\mathbf{A}}\mathbf{x}_N\leq \bar{\mathbf{b}}\\ & \mathbf{x}_N\geq \mathbf{0} \end{aligned}\] where the constant term \(\mathbf{c}_B^T \bar{\mathbf{b}}\) is omitted from the objective function for optimization purposes, as it does not affect the optimal \(\mathbf{x}_N\).

The original maximization problem can be reformulated as a canonical form subproblem in terms of non-basic variables \(\mathbf{x}_N\): \[\begin{aligned} \text{maximize } & \bar{\mathbf{c}}_N^T \mathbf{x}_N\\ \text{subject to } & \bar{\mathbf{A}}\mathbf{x}_N\leq \bar{\mathbf{b}}\\ & \mathbf{x}_N\geq \mathbf{0} \end{aligned}\] where \(\bar{\mathbf{A}}= \mathbf{A}_B^{-1}\mathbf{A}_N\) and \(\bar{\mathbf{b}}= \mathbf{A}_B^{-1}\mathbf{b}\).

Basis Dependence and Solution Recovery

The subproblem is defined with respect to the chosen basis \(\mathcal{B}\). For each basis, we get a different subproblem. If \(\mathbf{x}_N^*\) is an optimal solution to this subproblem, the optimal solution to the original problem is recovered by setting \(\mathbf{x}_N= \mathbf{x}_N^*\) and calculating \(\mathbf{x}_B^* = \bar{\mathbf{b}}- \bar{\mathbf{A}}\mathbf{x}_N^*\). The optimal objective value for the original problem is \(Z^* = \mathbf{c}_B^T \bar{\mathbf{b}}+ \bar{\mathbf{c}}_N^T \mathbf{x}_N^*\).
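Putting the pieces together, a minimal recovery sketch on the same toy data (the subproblem optimum \(\mathbf{x}_N^* = \mathbf{0}\) is simply assumed here rather than computed; it happens to be correct for this instance, since all reduced costs are negative and \(\bar{\mathbf{b}}\geq \mathbf{0}\)):

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0, 0.0],
              [3.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 5.0])
c = np.array([3.0, 2.0, 0.0, 0.0])
B, N = [0, 1], [2, 3]

Abar = np.linalg.solve(A[:, B], A[:, N])
bbar = np.linalg.solve(A[:, B], b)

x_N_star = np.zeros(len(N))           # assumed subproblem optimum x_N* = 0
x_B_star = bbar - Abar @ x_N_star     # recover the basic variables

x_star = np.zeros(A.shape[1])         # reassemble the full solution vector
x_star[B], x_star[N] = x_B_star, x_N_star
print(x_star, c @ x_star)             # [1.2 1.4 0.  0. ]  Z* = 6.4
```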

Connection to Echelon Form

Echelon Form in Linear Systems

The transformation from \(\mathbf{A}\mathbf{x}= \mathbf{b}\) to \(\mathbf{x}_B+ \bar{\mathbf{A}}\mathbf{x}_N= \bar{\mathbf{b}}\) is analogous to transforming a system of linear equations into reduced row-echelon form using Gaussian elimination. In reduced row-echelon form, leading variables are isolated, and their coefficients become identity matrices.

Gaussian Elimination and Reduced Echelon Form

Gaussian elimination involves elementary row operations to transform a matrix into row-echelon form. Further reduction to reduced row-echelon form involves making the leading entry in each non-zero row (pivot) equal to 1 and ensuring that all other entries in the column containing a pivot are zero.
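For comparison, SymPy's `Matrix.rref` performs exactly this reduction and reports the pivot columns; a small sketch on the toy augmented matrix \([\mathbf{A}\mid\mathbf{b}]\) used above:

```python
from sympy import Matrix

# Augmented matrix [A | b] for a small 2 x 4 system.
M = Matrix([[1, 2, 1, 0, 4],
            [3, 1, 0, 1, 5]])
R, pivots = M.rref()
print(pivots)  # (0, 1): the pivot columns play the role of the basic variables
print(R)       # identity block in columns 0 and 1, matching x_B + Abar x_N = bbar
```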

Echelon Form and Basic Variables

In our context, the transformation to \(\mathbf{x}_B+ \bar{\mathbf{A}}\mathbf{x}_N= \bar{\mathbf{b}}\) effectively uses the basis \(\mathbf{A}_B\) to perform a change of basis, analogous to achieving a form where the coefficients of the basic variables form an identity matrix. This is a form of echelon form adapted for linear programming, where we explicitly solve for basic variables in terms of non-basic variables, facilitating the Simplex algorithm’s iterative process.

The transformation to \(\mathbf{x}_B+ \bar{\mathbf{A}}\mathbf{x}_N= \bar{\mathbf{b}}\) is analogous to transforming a system of linear equations into reduced row-echelon form using Gaussian elimination.

Conclusion

In this lecture, we have successfully transformed a linear programming problem from standard form into a canonical form subproblem, a transformation predicated on the selection of a basis. We defined and utilized the concepts of basic and non-basic variables, illustrating how system reformulation via basis inversion allows us to express basic variables in terms of non-basic variables. The critical insight is that by choosing a basis, we can re-express the original problem to emphasize the role of non-basic variables in influencing both the objective function and the constraints. This reformulation is a preparatory step for the iterative Simplex Algorithm.

In future lectures, we will address the subsequent crucial questions:

  • How is an initial basis selected?

  • What is the iterative process for basis improvement to achieve an optimal solution?

  • What criteria determine optimality and algorithm termination within the Simplex Algorithm?

These questions will guide our exploration into the detailed algorithmic steps of the Simplex method.