Randomized Min-Cut Algorithm and Conditional Probability
Introduction
This lecture introduces the fundamental concept of conditional probability and demonstrates its application in the analysis of a randomized algorithm designed to solve the minimum cut (Min-Cut) problem in a graph. We begin by revisiting the definition and properties of conditional probability, including the notion of independent events. Subsequently, we explore a simple yet effective randomized algorithm for the Min-Cut problem, based on edge contraction. A significant portion of the lecture is dedicated to analyzing the probability of success of this algorithm, providing insights into the interplay between randomness and algorithm design.
Conditional Probability and Independent Events
Conditional Probability
Definition 1 (Conditional Probability). In probability theory, the conditional probability of an event \(A\) occurring given that another event \(B\) has already occurred is denoted as \(P(A|B)\), and is defined as: \[P(A|B) = \frac{P(A \cap B)}{P(B)}\label{eq:conditional_probability_definition}\] provided that \(P(B) > 0\).
This definition formalizes how the likelihood of event \(A\) is updated when we gain information that event \(B\) has occurred. Intuitively, conditional probability allows us to focus on a reduced sample space, specifically the event \(B\), and calculate the probability of \(A\) within this new context. It answers the question: "Given that \(B\) has happened, what is the chance of \(A\) happening as well?".
Example 1. Consider rolling a fair six-sided die. Let \(A\) be the event of rolling an even number, and \(B\) be the event of rolling a number less than or equal to 3. We want to find \(P(A|B)\), the probability of rolling an even number given that the number rolled is less than or equal to 3.
The event \(A = \{2, 4, 6\}\) and \(B = \{1, 2, 3\}\). The intersection \(A \cap B = \{2\}\). Assuming each outcome is equally likely, we have the probabilities: \(P(A) = \frac{3}{6} = \frac{1}{2}\), \(P(B) = \frac{3}{6} = \frac{1}{2}\), and \(P(A \cap B) = \frac{1}{6}\). Using the formula for conditional probability [eq:conditional_probability_definition]: \[P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{1/6}{1/2} = \frac{1}{3}\] Thus, given that we rolled a number less than or equal to 3, the probability that it is even is \(\frac{1}{3}\). This is different from the unconditional probability of rolling an even number, which is \(P(A) = \frac{1}{2}\). The condition \(B\) has changed the probability of \(A\).
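The computation can be checked by direct enumeration of the sample space. The following Python sketch (the names `omega`, `prob`, and so on are our own, purely illustrative) applies the definition [eq:conditional_probability_definition] with exact rational arithmetic:

```python
from fractions import Fraction

# Sample space of a fair six-sided die; every outcome is equally likely.
omega = {1, 2, 3, 4, 5, 6}
A = {x for x in omega if x % 2 == 0}   # event: even number
B = {x for x in omega if x <= 3}       # event: at most 3

def prob(event):
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(len(event), len(omega))

# P(A|B) = P(A ∩ B) / P(B), the definition of conditional probability.
print(prob(A & B) / prob(B))   # 1/3
print(prob(A))                 # 1/2, the unconditional probability of A
```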
Independent Events
Definition 2 (Independent Events). Two events, \(E_1\) and \(E_2\), are said to be independent if the occurrence of one does not affect the probability of the occurrence of the other. Mathematically, events \(E_1\) and \(E_2\) are independent if and only if: \[P(E_1 \cap E_2) = P(E_1) \cdot P(E_2)\label{eq:independent_events_definition}\]
Independence is a crucial concept in probability, simplifying calculations and providing insights into probabilistic models. If two events are independent, knowing that one has occurred gives no information about whether the other will occur.
Example 2. Consider flipping a fair coin twice. Let \(E_1\) be the event that the first flip is heads, and \(E_2\) be the event that the second flip is heads. The outcome of the first flip does not influence the outcome of the second flip. The probability of getting heads on a single flip is \(P(H) = \frac{1}{2}\). The probability of both flips being heads is \(P(E_1 \cap E_2) = P(\text{First flip is H and Second flip is H}) = \frac{1}{4}\). Using the definition of independent events [eq:independent_events_definition], we check if \(P(E_1 \cap E_2) = P(E_1) \cdot P(E_2)\). \(P(E_1) = \frac{1}{2}\) (probability of first flip being heads) \(P(E_2) = \frac{1}{2}\) (probability of second flip being heads) \(P(E_1) \cdot P(E_2) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4} = P(E_1 \cap E_2)\). Thus, \(E_1\) and \(E_2\) are independent events.
In contrast, consider drawing two cards without replacement from a deck. Let \(F_1\) be the event that the first card is an Ace, and \(F_2\) be the event that the second card is an Ace. These events are not independent. The probability of \(F_2\) depends on whether \(F_1\) occurred. If the first card drawn was an Ace, then there are fewer Aces left in the deck, affecting the probability of drawing an Ace as the second card.
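A short simulation makes this dependence concrete. In the sketch below (our own illustrative code, with the deck encoded simply as 4 Aces among 52 cards), the estimate of \(P(F_2 \mid F_1)\) comes out near \(3/51 \approx 0.059\), visibly below \(P(F_2) = 4/52 \approx 0.077\):

```python
import random

TRIALS = 200_000
deck = [1] * 4 + [0] * 48            # 1 marks an Ace, 0 any other card

first_ace = second_ace = both = 0
for _ in range(TRIALS):
    c1, c2 = random.sample(deck, 2)  # draw two cards without replacement
    first_ace += c1                  # occurrences of F1
    second_ace += c2                 # occurrences of F2
    both += c1 * c2                  # occurrences of F1 ∩ F2

print(second_ace / TRIALS)   # ≈ 0.077, estimates P(F2)
print(both / first_ace)      # ≈ 0.059, estimates P(F2 | F1)
```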
For independent events, the conditional probability simplifies significantly, as shown in the following proposition.
Proposition 1. If \(E_1\) and \(E_2\) are independent events, then: \[P(E_1 | E_2) = P(E_1) \quad \text{and} \quad P(E_2 | E_1) = P(E_2)\label{eq:conditional_probability_independent}\]
Proof. If \(E_1\) and \(E_2\) are independent, then by definition \(P(E_1 \cap E_2) = P(E_1) \cdot P(E_2)\). Assuming \(P(E_2) > 0\), we can compute the conditional probability \(P(E_1 | E_2)\) using the definition of conditional probability [eq:conditional_probability_definition]: \[P(E_1 | E_2) = \frac{P(E_1 \cap E_2)}{P(E_2)} = \frac{P(E_1) \cdot P(E_2)}{P(E_2)} = P(E_1)\] Similarly, assuming \(P(E_1) > 0\), we have \(P(E_2 | E_1) = P(E_2)\). This confirms that for independent events, conditioning on one event does not change the probability of the other. ◻
Probability of Intersection of Events
For any two events \(E_1\) and \(E_2\), the probability of their intersection can be expressed using conditional probability as: \[P(E_1 \cap E_2) = P(E_1 | E_2) \cdot P(E_2) = P(E_2 | E_1) \cdot P(E_1)\label{eq:intersection_two_events}\] This is a direct rearrangement of the definition of conditional probability [eq:conditional_probability_definition] and is particularly useful when it is easier to calculate the conditional probability and the probability of one of the events than to directly calculate the probability of their intersection.
This concept generalizes to the intersection of multiple events \(E_1, E_2, \ldots, E_k\).
Theorem 2 (Chain Rule for Conditional Probabilities). For events \(E_1, E_2, \ldots, E_k\), the probability of their intersection is given by: \[P\left(\bigcap_{i=1}^{k} E_i\right) = P(E_1) \cdot P(E_2 | E_1) \cdot P(E_3 | E_1 \cap E_2) \cdots P\left(E_k | \bigcap_{i=1}^{k-1} E_i\right)\label{eq:chain_rule}\] Description: This theorem, known as the Chain Rule or Product Rule for conditional probabilities, provides a method to calculate the probability of the intersection of multiple events by breaking it down into a product of conditional probabilities. Each term in the product is the probability of an event occurring given that all preceding events have already occurred.
The chain rule [eq:chain_rule] is a fundamental tool for calculating the probability of the joint occurrence of multiple events, especially when these events are not independent. It decomposes the joint probability into a product of conditional probabilities, allowing complex probabilities to be computed step by step, and is especially natural in scenarios where events unfold sequentially and the outcome of each event depends on the outcomes of the previous ones.
Example 3. Consider drawing three cards without replacement from a deck. Let \(F_1\) be the event that the first card is an Ace, \(F_2\) be the event that the second card is an Ace, and \(F_3\) be the event that the third card is an Ace. We want to find the probability of all three events occurring, i.e., \(P(F_1 \cap F_2 \cap F_3)\). Using the chain rule [eq:chain_rule] for \(k=3\): \(P(F_1 \cap F_2 \cap F_3) = P(F_1) \cdot P(F_2 | F_1) \cdot P(F_3 | F_1 \cap F_2)\).
\(P(F_1) = \frac{4}{52}\) (probability that the first card is an Ace, as there are 4 Aces in a deck of 52 cards).
\(P(F_2 | F_1) = \frac{3}{51}\) (given that the first card was an Ace, there are now 3 Aces left in the remaining 51 cards).
\(P(F_3 | F_1 \cap F_2) = \frac{2}{50}\) (given that the first two cards were Aces, there are now 2 Aces left in the remaining 50 cards).
Therefore, \(P(F_1 \cap F_2 \cap F_3) = \frac{4}{52} \cdot \frac{3}{51} \cdot \frac{2}{50} = \frac{24}{132600} = \frac{1}{5525}\).
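The same chain-rule product can be verified with exact rational arithmetic; a minimal sketch:

```python
from fractions import Fraction

# Chain rule: P(F1 ∩ F2 ∩ F3) = P(F1) · P(F2 | F1) · P(F3 | F1 ∩ F2)
aces, cards = 4, 52
p = Fraction(1)
for i in range(3):
    p *= Fraction(aces - i, cards - i)  # one fewer Ace and one fewer card per draw
print(p)  # 1/5525
```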
The Min-Cut Problem and a Randomized Algorithm
The Min-Cut Problem
Definition 3 (Cut). Given a connected, undirected graph \(G = (V, E)\), a cut is a partition of the vertices \(V\) into two disjoint sets, say \(S\) and \(V \setminus S\). The pair \((S, V \setminus S)\) represents a cut.
Definition 4 (Cut-set and Size). The cut-set of a cut \((S, V \setminus S)\) is the set of edges that have one endpoint in \(S\) and the other endpoint in \(V \setminus S\). The size of the cut is the number of edges in the cut-set.
Definition 5 (Min-Cut Problem). The Min-Cut problem is to find a cut in a given connected, undirected graph \(G\) with the minimum size.
Finding a minimum cut is crucial in network analysis, with applications ranging from assessing network reliability to image segmentation and community detection in graphs. For instance, in network reliability, the min-cut can represent the weakest link in a network, determining the minimum number of connections that need to fail to disconnect the network. In image segmentation, partitioning an image into foreground and background can be formulated as a min-cut problem on a graph representing the image pixels. Furthermore, in community detection, finding sparse cuts can help identify clusters of densely connected nodes in social or biological networks.
Randomized Min-Cut Algorithm: Edge Contraction
We now introduce a randomized algorithm to find a minimum cut, based on the principle of edge contraction. This algorithm, while simple, provides a powerful illustration of how randomness can be used to solve combinatorial problems.
Edge Contraction
Definition 6 (Edge Contraction). Contracting an edge \(e = (u, v)\) in a graph \(G\) involves the following operations:
Merge vertices \(u\) and \(v\) into a single new vertex, say \(uv\).
Replace every edge incident to either \(u\) or \(v\) with an edge incident to the new vertex \(uv\). If there are multiple edges between \(u\) or \(v\) and another vertex \(w\), all these edges become edges between \(uv\) and \(w\).
Remove any self-loops that arise (edges with both endpoints at \(uv\)).
Multiple edges between pairs of vertices are preserved during contraction.
Figure 1 illustrates the process of edge contraction. When edge \(e=(1,2)\) is contracted, vertices 1 and 2 merge into a new vertex labeled ‘1,2’. Edges originally incident to either vertex 1 or vertex 2 are now incident to ‘1,2’. Importantly, if there were multiple edges between vertex 2 and vertex 4, they all become edges between the new vertex ‘1,2’ and vertex 4. The contracted edge itself, along with any parallel edges between vertices 1 and 2, becomes a self-loop at ‘1,2’ and is removed.
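The contraction step is straightforward to express on a multigraph stored as a list of edges. The following sketch (the edge-list representation and the function name are our own choice, not prescribed by the lecture) merges \(v\) into \(u\) and discards the self-loops that arise, while preserving parallel edges:

```python
def contract(edges, u, v):
    """Contract edge (u, v): merge v into u, keep parallel edges, drop self-loops."""
    merged = []
    for a, b in edges:
        a = u if a == v else a   # redirect endpoints of v to u
        b = u if b == v else b
        if a != b:               # drop self-loops created by the merge
            merged.append((a, b))
    return merged

# Contracting (1, 2) in a small multigraph (edges chosen for illustration,
# including two parallel edges between vertices 2 and 4 as in Figure 1):
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 4), (3, 4)]
print(contract(edges, 1, 2))  # [(1, 3), (1, 3), (1, 4), (1, 4), (3, 4)]
```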
Algorithm Outline
The randomized Min-Cut algorithm iteratively contracts randomly chosen edges until only two vertices remain. The cut is then defined by the edges connecting these two vertices.
While more than two vertices remain: select an edge \(e \in E\) uniformly at random and contract \(e\) in \(G\). Once exactly two vertices remain, return the number of edges between them.
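A single run of the algorithm, using the edge-list representation and contraction step sketched above, might look as follows (a sketch under those representation choices, not an optimized implementation):

```python
import random

def karger_single_run(n, edges):
    """One run of the randomized Min-Cut algorithm on a connected multigraph.

    n is the number of vertices (labeled 0..n-1); edges is a list of pairs.
    """
    edges = list(edges)
    vertices = n
    while vertices > 2:
        u, v = random.choice(edges)  # uniform over the remaining edges
        # Contract (u, v): redirect v's endpoints to u, then drop self-loops.
        edges = [(u if a == v else a, u if b == v else b) for a, b in edges]
        edges = [(a, b) for a, b in edges if a != b]
        vertices -= 1
    return len(edges)  # number of edges between the two remaining super-vertices

# Example: a 4-cycle. On a cycle every run returns the true minimum cut size, 2.
print(karger_single_run(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))
```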
Complexity Analysis: In each iteration of the while loop, the number of vertices decreases by one, so a single run performs exactly \(n-2\) contractions. With an adjacency-matrix (or weighted-multigraph) representation, a contraction merges one vertex's row and column into another's in \(O(n)\) time, so a single run can be implemented in \(O(n^2)\) time. A naive adjacency-list implementation may spend up to \(O(m)\) time per contraction, giving \(O(nm)\) overall. For the analysis that follows, it suffices that a single run takes roughly \(O(n^2)\) time.
To enhance the probability of finding the actual minimum cut, Algorithm [alg:randomized_min_cut] is typically executed multiple times. The minimum cut size found across all repetitions is then taken as the result.
Probability Analysis
Let \(C\) be a minimum cut in the original graph \(G\), and let \(k = |C|\) be its size. We want to determine the probability that a single run of the randomized algorithm successfully finds this minimum cut \(C\).
Let \(E_i\) be the event that in the \(i\)-th step of Algorithm [alg:randomized_min_cut], we do not contract an edge belonging to the minimum cut \(C\). If, throughout the algorithm, we never contract an edge from \(C\), then every contraction merges two vertices lying on the same side of the cut; the two vertices remaining at the end therefore correspond exactly to the two sides of \(C\), and the algorithm outputs the cut \(C\) itself.
The probability of finding a min-cut is thus the probability of the intersection of events \(E_1, E_2, \ldots, E_{n-2}\), where \(n\) is the initial number of vertices in \(G\). Using the chain rule for conditional probabilities: \[P\left(\bigcap_{i=1}^{n-2} E_i\right) = P(E_1) \cdot P(E_2 | E_1) \cdot P(E_3 | E_1 \cap E_2) \cdots P\left(E_{n-2} | \bigcap_{i=1}^{n-3} E_i\right)\]
To evaluate this probability, we need to estimate \(P(E_i | \bigcap_{j=1}^{i-1} E_j)\). Consider the \(i\)-th contraction step. Assume that in all preceding \(i-1\) steps, no edge from the minimum cut \(C\) has been contracted. Let \(G_{i-1}\) be the graph after \(i-1\) contractions. The minimum cut in \(G_{i-1}\) is still of size \(k\), because contracting non-cut edges does not reduce the size of the min-cut.
In any graph whose minimum cut has size \(k\), every vertex must have degree at least \(k\): the cut separating a single vertex \(v\) from the rest of the graph has size \(\deg(v)\), so \(\deg(v) \ge k\). Therefore, if \(G_{i-1} = (V_{i-1}, E_{i-1})\) has \(n-(i-1)\) vertices, the sum of degrees is at least \((n-i+1)k\). Since the sum of degrees equals \(2|E_{i-1}|\), the number of edges satisfies \(|E_{i-1}| \ge \frac{(n-i+1)k}{2}\).
The probability of selecting an edge from the minimum cut \(C\) in the \(i\)-th step, given that no edge from \(C\) has been contracted so far, is at most the ratio of the number of cut edges to the total number of edges in \(G_{i-1}\). Since the number of edges in the min-cut \(C\) remains \(k\), and the total number of edges in \(G_{i-1}\) is at least \(\frac{(n-i+1)k}{2}\), this probability is bounded by: \(\frac{k}{|E_{i-1}|} \le \frac{k}{(n-i+1)k/2} = \frac{2}{n-i+1}\). Thus, the probability of not selecting an edge from the minimum cut in the \(i\)-th step is \(P(E_i | \bigcap_{j=1}^{i-1} E_j) \ge 1 - \frac{2}{n-i+1}\).
Therefore, the probability of finding a min-cut in a single run is: \[\begin{aligned}P\left(\bigcap_{i=1}^{n-2} E_i\right) &= \prod_{i=1}^{n-2} P(E_i | \bigcap_{j=1}^{i-1} E_j) \\&\ge \prod_{i=1}^{n-2} \left(1 - \frac{2}{n-i+1}\right) \\&= \left(1 - \frac{2}{n}\right) \cdot \left(1 - \frac{2}{n-1}\right) \cdots \left(1 - \frac{2}{3}\right) \\&= \left(\frac{n-2}{n}\right) \cdot \left(\frac{n-3}{n-1}\right) \cdots \left(\frac{1}{3}\right) \\&= \frac{(n-2) \cdot (n-3) \cdots 1}{n \cdot (n-1) \cdots 3} = \frac{(n-2)!}{n! / (1 \cdot 2)} = \frac{2(n-2)!}{n!} = \frac{2}{n(n-1)}\end{aligned}\] Hence, the probability that a single run of Algorithm [alg:randomized_min_cut] finds a minimum cut is at least \(\frac{2}{n(n-1)} = \Omega\left(\frac{1}{n^2}\right)\).
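The telescoping product can also be checked numerically; a quick sketch:

```python
import math

# Compare the product of (1 - 2/(n-i+1)) for i = 1..n-2 with 2/(n(n-1)).
for n in (5, 10, 50):
    product = math.prod(1 - 2 / (n - i + 1) for i in range(1, n - 1))
    print(n, product, 2 / (n * (n - 1)))  # the last two values agree
```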
Increasing Success Probability
To amplify the success probability, we can repeat the randomized Min-Cut algorithm multiple times and choose the smallest cut found. Let \(p\) be the probability of success in a single run; we showed above that \(p \ge \frac{2}{n(n-1)}\). If we perform \(h\) independent runs, the probability of failing to find a minimum cut in every one of these runs is \((1-p)^h\). Consequently, the probability of finding a minimum cut in at least one run is \(1 - (1-p)^h\).
We want to determine how many repetitions \(h\) are needed to achieve a high probability of success. Suppose we aim for a success probability of at least \(1 - \delta\), where \(\delta\) is a small failure probability (e.g., \(\delta = 1/n^2\) or a constant like \(1/e\)). We need to find \(h\) such that \(1 - (1-p)^h \ge 1 - \delta\), i.e., \((1-p)^h \le \delta\). Using the inequality \(1 - p \le e^{-p}\), we have \((1-p)^h \le e^{-hp}\), so it suffices that \(e^{-hp} \le \delta\), which holds when \(h \ge \frac{\ln(1/\delta)}{p}\).
Using the lower bound \(p \ge \frac{2}{n(n-1)}\), we can set \(h = \frac{n(n-1)}{2} \ln\left(\frac{1}{\delta}\right)\). For a constant success probability, say \(1 - 1/e\), we set \(\delta = 1/e\), so \(\ln(1/\delta) = 1\). Then, \(h \approx \frac{n(n-1)}{2} = O(n^2)\) runs are sufficient. For a very high success probability, e.g., \(1 - 1/n^c\) for some constant \(c\), we set \(\delta = 1/n^c\), so \(\ln(1/\delta) = c \ln(n)\). In this case, \(h = O(n^2 \log n)\) repetitions are sufficient.
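These repetition counts are easy to compute from the bound \(h \ge \ln(1/\delta)/p\) with \(p = \frac{2}{n(n-1)}\); a small sketch (the function name is illustrative):

```python
import math

def repetitions(n, delta):
    """Runs sufficient to push the failure probability below delta."""
    p = 2 / (n * (n - 1))  # lower bound on single-run success probability
    return math.ceil(math.log(1 / delta) / p)

print(repetitions(100, 1 / math.e))   # n(n-1)/2 = 4950 runs for success 1 - 1/e
print(repetitions(100, 1 / 100**2))   # ≈ 45600 runs for success 1 - 1/n^2
```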
Trade-off: Complexity vs. Probability of Error
The randomized Min-Cut algorithm presents a trade-off between computational complexity and the probability of error.
Computational Complexity: As discussed above, a single run of Algorithm [alg:randomized_min_cut] takes roughly \(O(n^2)\) time. Repeating the algorithm \(h\) times multiplies the total running time by \(h\): for \(h = O(n^2)\) the total complexity is \(O(n^4)\), and for \(h = O(n^2 \log n)\) it is \(O(n^4 \log n)\).
Probability of Error: By increasing the number of repetitions \(h\), we can reduce the probability of not finding a minimum cut to an arbitrarily small value. This allows us to trade computational time for increased confidence in the result.
Conclusion
In this lecture, we have explored the concept of conditional probability and its powerful application in analyzing a randomized algorithm for the Min-Cut problem. We began by defining conditional probability and independent events, establishing the theoretical groundwork for our analysis. We then introduced a simple randomized algorithm for finding a minimum cut in a graph based on the principle of edge contraction. This algorithm stands out for its simplicity and ease of implementation, offering an alternative to more complex deterministic approaches.
A significant portion of the lecture was dedicated to the probabilistic analysis of the randomized Min-Cut algorithm. By applying the chain rule for conditional probabilities, we derived a lower bound on the probability that a single run of the algorithm finds a minimum cut. We found that this probability, while seemingly small at \(\Omega\left(\frac{1}{n^2}\right)\), is non-zero and can be significantly amplified by repeating the algorithm multiple times.
We discussed the trade-off between computational complexity and the probability of error inherent in this randomized approach. While a single run of the algorithm is computationally efficient, the probability of success is modest. To achieve a higher probability of finding a minimum cut, we can increase the number of repetitions, thereby increasing the total runtime. We showed that repeating the algorithm \(O(n^2 \log n)\) times allows us to achieve a very high probability of success, making the randomized Min-Cut algorithm a practical and effective solution in many scenarios.
Remark 1. The randomized Min-Cut algorithm exemplifies a common paradigm in algorithm design: trading determinism for simplicity and efficiency. Unlike deterministic algorithms that guarantee finding the minimum cut but may be more complex and computationally intensive, our randomized algorithm offers a probabilistic guarantee. It is remarkably simple to implement and efficient for each run, and by accepting a small probability of error, we can achieve a solution in significantly less time than some deterministic counterparts. This trade-off is particularly valuable in applications where near-optimal solutions with high probability are acceptable, or when dealing with very large graphs where efficiency is paramount.
In summary, the randomized Min-Cut algorithm provides a compelling example of how randomness can be leveraged to design efficient algorithms for computationally challenging problems. The analysis using conditional probability not only allows us to understand the algorithm’s behavior but also to quantify its probability of success and guide the choice of parameters, such as the number of repetitions, to meet desired levels of confidence in the solution. This approach underscores the power of randomized algorithms as a valuable tool in computer science and algorithm design, especially when seeking practical solutions with strong probabilistic guarantees.
Exercises
Cycle Graph Min-Cut: Consider a cycle graph with \(n\) vertices (\(n \ge 3\)).
What is the minimum cut size? Solution: The minimum cut size is 2: removing any single edge leaves the cycle connected, while removing any two edges disconnects it.
What is the probability that the randomized Min-Cut algorithm finds a fixed minimum cut in one run for a cycle graph? Solution: Exactly \(\frac{2}{n(n-1)}\). On a cycle, every run ends with two super-vertices joined by exactly 2 edges, and by symmetry each of the \(\binom{n}{2}\) pairs of surviving edges is equally likely, so any fixed minimum cut is output with probability \(1/\binom{n}{2} = \frac{2}{n(n-1)}\), matching the lower bound from the analysis.
Necessity of Self-loop Removal: Explain why removing self-loops after edge contraction is necessary for the correctness of the Min-Cut algorithm. Solution: Self-loops must be removed after edge contraction to ensure that the count of edges between the remaining two vertices at the end of the algorithm correctly represents a cut in the original graph. Self-loops are artifacts of the contraction process and do not correspond to edges crossing a cut in the initial graph.
Finding All Min-Cuts: How would you modify the randomized Min-Cut algorithm to find all minimum cuts in a graph? (This is a more challenging problem). Discussion: The same analysis shows that any fixed minimum cut is found with probability at least \(\frac{2}{n(n-1)}\) in a single run; since the success events for distinct cuts are disjoint within a run, a graph has at most \(\binom{n}{2}\) distinct minimum cuts. Hence running the algorithm \(O(n^2 \log n)\) times and recording every distinct minimum cut encountered recovers all of them with high probability.
Implementation and Experimental Verification: Implement the randomized Min-Cut algorithm and test it on small example graphs. Experimentally verify the probability of success. Guidance: Implement the algorithm, test it on cycle graphs and other small graphs, and compare the experimental success rate with the theoretical probability, especially for cycle graphs where the probability of success is exactly \(\frac{2}{n(n-1)}\).
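As a starting point for this exercise, the following sketch (our own illustrative code) tracks which original vertices each super-vertex has absorbed, so the partition produced by a run can be compared against one fixed minimum cut of the cycle; the empirical frequency should approach \(\frac{2}{n(n-1)}\):

```python
import random

def karger_partition(n, edges):
    """One run of the contraction algorithm; returns the final vertex partition."""
    groups = {v: {v} for v in range(n)}  # super-vertex -> absorbed original vertices
    edges = list(edges)
    while len(groups) > 2:
        u, v = random.choice(edges)
        groups[u] |= groups.pop(v)       # merge v's originals into u
        edges = [(u if a == v else a, u if b == v else b) for a, b in edges]
        edges = [(a, b) for a, b in edges if a != b]
    return {frozenset(g) for g in groups.values()}

n, trials = 6, 50_000
cycle = [(i, (i + 1) % n) for i in range(n)]
target = {frozenset({0, 1}), frozenset(range(2, n))}  # one fixed minimum cut

hits = sum(karger_partition(n, cycle) == target for _ in range(trials))
print(hits / trials, 2 / (n * (n - 1)))  # both ≈ 0.067 for n = 6
```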