Models of Classic Combinatorial Optimization Problems
Introduction
This document presents models for several classic combinatorial optimization problems, including the Knapsack problem, the Set Covering problem, the Independent Set problem, and the Satisfiability problem (SAT). These problems are fundamental in combinatorial optimization and appear in various contexts. Understanding their models is crucial for solving them in different scenarios. The main objectives are to introduce these problems, formulate them using linear integer programming, and discuss their key characteristics.
Knapsack Problem: Involves selecting items with weights and values to maximize the total value without exceeding a given capacity.
Set Covering Problem: Aims to find a minimum-cost collection of sets that covers all elements of a universe.
Independent Set Problem: Focuses on selecting a set of non-adjacent vertices in a graph, often to maximize the total weight or cardinality of the set.
Satisfiability Problem (SAT): Concerns finding a truth assignment to boolean variables that satisfies a set of logical clauses, often maximizing the number of satisfied clauses.
The Knapsack Problem
Definition 1 (Knapsack Problem). The Knapsack problem involves a knapsack with a capacity \(C\) and \(N\) objects. Each object \(i\) has a profit \(P_i\) and a weight \(W_i\). The goal is to select a subset of objects to maximize the total profit without exceeding the knapsack’s capacity.
There are two main versions of the Knapsack problem:
Binary Knapsack (0-1 Knapsack): Each object can either be selected or not.
Integer Knapsack: Multiple copies of each object can be selected.
We will focus on the Binary Knapsack problem.
Binary Knapsack Model
Definition 2 (Binary Knapsack Model). Let \(X_i\) be a binary variable for each object \(i\), where \(X_i = 1\) if object \(i\) is selected, and \(X_i = 0\) otherwise. The objective is to maximize the total profit, subject to the constraint that the total weight of selected objects does not exceed the knapsack’s capacity.
The model can be formulated as follows: \[\begin{aligned} \text{Maximize} \quad & \sum_{i=1}^N P_i X_i \\ \text{Subject to} \quad & \sum_{i=1}^N W_i X_i \leq C \\ & X_i \in \{0, 1\}, \quad \forall i \in \{1, \dots, N\} \end{aligned}\]
Objective Function: Maximize the total profit of selected objects.
Capacity Constraint: The total weight of selected objects must not exceed the knapsack’s capacity \(C\). This can be represented as: \[\sum_{i=1}^N W_i X_i \leq C\]
Binary Variables: Each \(X_i\) is a binary variable, indicating whether object \(i\) is selected (\(X_i = 1\)) or not (\(X_i = 0\)).
Example 1 (Binary Knapsack). Suppose we have a knapsack with capacity \(C = 10\) and 4 objects with the following profits and weights:
Object 1: \(P_1 = 6\), \(W_1 = 5\)
Object 2: \(P_2 = 5\), \(W_2 = 4\)
Object 3: \(P_3 = 4\), \(W_3 = 3\)
Object 4: \(P_4 = 3\), \(W_4 = 2\)
We can select objects 1 and 4 for a total profit of 9 and a total weight of 7, which does not exceed the capacity.
The Set Covering Problem
Definition 3 (Set Covering Problem). Given a universe \(U = \{1, 2, \dots, M\}\) and a family of subsets \(S_1, S_2, \dots, S_N\) where each \(S_i \subseteq U\), each with a cost \(C_i\). The goal is to find a subset of these subsets (a "cover") such that their union is \(U\), and the total cost of the selected subsets is minimized.
Set Covering Model
Definition 4 (Set Covering Model). Let \(X_i\) be a binary variable for each subset \(S_i\), where \(X_i = 1\) if \(S_i\) is selected, and \(X_i = 0\) otherwise. The objective is to minimize the total cost of selected subsets, subject to the constraint that each element in the universe \(U\) is covered by at least one selected subset.
The model can be formulated as follows: \[\begin{aligned} \text{Minimize} \quad & \sum_{i=1}^N C_i X_i \\ \text{Subject to} \quad & \sum_{j: i \in S_j} X_j \geq 1, \quad \forall i \in \{1, \dots, M\} \\ & X_i \in \{0, 1\}, \quad \forall i \in \{1, \dots, N\} \end{aligned}\]
Objective Function: Minimize the total cost of selected subsets.
Covering Constraints: Each element \(i\) in the universe \(U\) must be covered by at least one selected subset. This is represented by the constraint: \[\sum_{j: i \in S_j} X_j \geq 1, \quad \forall i \in \{1, \dots, M\}\]
Binary Variables: Each \(X_i\) is a binary variable, indicating whether subset \(S_i\) is selected (\(X_i = 1\)) or not (\(X_i = 0\)).
Example Instance
Consider a universe \(U = \{1, 2, 3, 4, 5\}\) and five subsets:
\(S_1 = \{1, 3\}\), Cost \(C_1 = 5\)
\(S_2 = \{1, 2, 3\}\), Cost \(C_2 = 3\)
\(S_3 = \{1, 4\}\), Cost \(C_3 = 3\)
\(S_4 = \{3, 5\}\), Cost \(C_4 = 2\)
\(S_5 = \{2, 4, 5\}\), Cost \(C_5 = 4\)
Example 2 (Set Covering Example Model). The model for this instance is: \[\begin{aligned} \text{Minimize} \quad & 5X_1 + 3X_2 + 3X_3 + 2X_4 + 4X_5 \\ \text{Subject to} \quad & X_1 + X_2 + X_3 \geq 1 \quad \text{(covering element 1)} \\ & X_2 + X_5 \geq 1 \quad \text{(covering element 2)} \\ & X_1 + X_2 + X_4 \geq 1 \quad \text{(covering element 3)} \\ & X_3 + X_5 \geq 1 \quad \text{(covering element 4)} \\ & X_4 + X_5 \geq 1 \quad \text{(covering element 5)} \\ & X_i \in \{0, 1\}, \quad \forall i \in \{1, \dots, 5\} \end{aligned}\] A possible solution is to select \(S_2\), \(S_3\) and \(S_4\) for a total cost of \(3 + 3 + 2 = 8\). This covers all elements in \(U\).
The Independent Set Problem
Definition 5 (Independent Set Problem). Given a graph \(G = (V, E)\) with vertices \(V = \{1, 2, \dots, N\}\) and edges \(E\), and each vertex \(i\) having a weight \(W_i\). The goal is to find a subset of vertices such that no two vertices in the subset are adjacent, and the total weight of the selected vertices is maximized. This is also known as the Maximum Weighted Independent Set Problem.
Independent Set Model
Definition 6 (Independent Set Model). Let \(X_i\) be a binary variable for each vertex \(i\), where \(X_i = 1\) if vertex \(i\) is selected, and \(X_i = 0\) otherwise. The objective is to maximize the total weight of selected vertices, subject to the constraint that no two adjacent vertices are selected.
The model can be formulated as follows: \[\begin{aligned} \text{Maximize} \quad & \sum_{i=1}^N W_i X_i \\ \text{Subject to} \quad & X_i + X_j \leq 1, \quad \forall (i, j) \in E \\ & X_i \in \{0, 1\}, \quad \forall i \in \{1, \dots, N\} \end{aligned}\]
Objective Function: Maximize the total weight of selected vertices.
Independence Constraints: For each edge \((i, j) \in E\), at most one of the vertices \(i\) and \(j\) can be selected. This is represented by the constraint: \[X_i + X_j \leq 1, \quad \forall (i, j) \in E\]
Binary Variables: Each \(X_i\) is a binary variable, indicating whether vertex \(i\) is selected (\(X_i = 1\)) or not (\(X_i = 0\)).
Relationship with Maximum Clique Problem
Definition 7 (Complement Graph). The complement graph \(G' = (V, E')\) of a graph \(G = (V, E)\) has the same vertex set \(V\), and an edge \((i, j) \in E'\) if and only if \((i, j) \notin E\).
Theorem 1 (Maximum Clique and Independent Set). Finding a maximum clique in a graph \(G\) is equivalent to finding a maximum independent set in its complement graph \(G'\).
Example Instance
Consider a graph with 5 vertices and the following weights and edges:
Weights: \(W_1 = 2, W_2 = 3, W_3 = 4, W_4 = 1, W_5 = 5\)
Edges: \(E = \{(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)\}\)
Example 3 (Independent Set). In this example, we can select vertices 1, 4, and 5 for a total weight of \(2 + 1 + 5 = 8\). This is a maximum independent set.
The Satisfiability Problem (SAT)
Definition 8 (Satisfiability Problem). Given a set of \(M\) clauses \(C_1, C_2, \dots, C_M\) over \(N\) Boolean variables \(X_1, X_2, \dots, X_N\), where each clause is a disjunction (OR) of literals (a variable or its negation). The goal is to find an assignment of truth values to the variables that satisfies as many clauses as possible. This is often referred to as the Maximum Satisfiability Problem (MAX-SAT).
SAT Model
Definition 9 (SAT Model). Let \(x_i\) be a binary variable for each Boolean variable \(X_i\), where \(x_i = 1\) if \(X_i\) is true, and \(x_i = 0\) if \(X_i\) is false. Let \(Y_j\) be a binary variable for each clause \(C_j\), where \(Y_j = 1\) if clause \(C_j\) is satisfied, and \(Y_j = 0\) otherwise.
The model can be formulated as follows: \[\begin{aligned} \text{Maximize} \quad & \sum_{j=1}^M Y_j \\ \text{Subject to} \quad & \sum_{i \in C_j^+} x_i + \sum_{i \in C_j^-} (1 - x_i) \geq Y_j, \quad \forall j \in \{1, \dots, M\} \\ & x_i \in \{0, 1\}, \quad \forall i \in \{1, \dots, N\} \\ & Y_j \in \{0, 1\}, \quad \forall j \in \{1, \dots, M\} \end{aligned}\] where \(C_j^+\) is the set of variables appearing positively in clause \(C_j\), and \(C_j^-\) is the set of variables appearing negated in clause \(C_j\).
Objective Function: Maximize the number of satisfied clauses.
Clause Satisfaction Constraints: For each clause \(C_j\), \(Y_j\) can be 1 only if at least one literal in the clause is true. The constraint \[\sum_{i \in C_j^+} x_i + \sum_{i \in C_j^-} (1 - x_i) \geq Y_j\] ensures that \(Y_j\) is 0 if the clause is not satisfied. Since the objective function maximizes the sum of \(Y_j\), the model will try to set \(Y_j\) to 1 whenever possible.
Binary Variables:
\(x_i\): Binary variable indicating the truth value of Boolean variable \(X_i\).
\(Y_j\): Binary variable indicating whether clause \(C_j\) is satisfied (\(Y_j = 1\)) or not (\(Y_j = 0\)).
Example Instance
Consider a SAT instance with 3 Boolean variables \(X_1, X_2, X_3\) and 4 clauses:
\(C_1 = X_1 \lor \neg X_2\)
\(C_2 = \neg X_1 \lor X_3\)
\(C_3 = X_2 \lor X_3\)
\(C_4 = \neg X_1 \lor \neg X_3\)
Example 4 (SAT Example Model). The model for this instance is: \[\begin{aligned} \text{Maximize} \quad & Y_1 + Y_2 + Y_3 + Y_4 \\ \text{Subject to} \quad & x_1 + (1 - x_2) \geq Y_1 \\ & (1 - x_1) + x_3 \geq Y_2 \\ & x_2 + x_3 \geq Y_3 \\ & (1 - x_1) + (1 - x_3) \geq Y_4 \\ & x_1, x_2, x_3 \in \{0, 1\} \\ & Y_1, Y_2, Y_3, Y_4 \in \{0, 1\} \end{aligned}\] A possible solution is \(x_1 = 1, x_2 = 1, x_3 = 0\), which satisfies clauses \(C_1, C_2\), and \(C_3\). In this case, \(Y_1 = 1, Y_2 = 1, Y_3 = 1, Y_4 = 0\), and the objective function value is 3.
Conclusion
This document has presented models for several classic combinatorial optimization problems: the Knapsack problem, the Set Covering problem, the Independent Set problem, and the Satisfiability problem. These models are formulated using linear integer programming, providing a foundation for solving these problems in various applications. Key aspects include defining appropriate binary variables, formulating objective functions to maximize or minimize specific quantities, and establishing constraints to ensure feasibility and correctness of solutions.
The Knapsack problem involves selecting items to maximize profit without exceeding capacity.
The Set Covering problem aims to cover all elements in a universe with the minimum cost of selected subsets.
The Independent Set problem seeks to select non-adjacent vertices in a graph to maximize total weight.
The Satisfiability problem involves finding a truth assignment to Boolean variables to satisfy as many clauses as possible.
How can the Integer Knapsack problem be modeled?
What are other applications of the Set Covering problem?
How is the Maximum Clique problem related to the Independent Set problem?
Can you provide an example of a SAT instance and its corresponding model?
Further Exploration:
Integer Knapsack Model: In the Integer Knapsack problem, multiple copies of each item can be selected. The model can be formulated by allowing the variables \(X_i\) to take non-negative integer values instead of being binary.
Applications of Set Covering: The Set Covering problem has applications in various fields such as facility location, crew scheduling, and test case selection in software engineering.
Maximum Clique and Independent Set: The Maximum Clique problem is equivalent to the Independent Set problem in the complement graph. This relationship allows algorithms for one problem to be used for the other.
SAT Instance Example: An example of a SAT instance and its corresponding model is provided in Section 5.