Non-deterministic Space Complexity and the Immerman-Szelepcsényi Theorem
Introduction
This lecture focuses on the relatively low importance of non-determinism in the context of space complexity, as opposed to time complexity. We will explore the concept of the complement of complexity classes and demonstrate a significant result known as the Immerman-Szelepcsényi theorem. The central problem we will investigate is counting the number of nodes reachable from a given node in a graph using a non-deterministic machine that is efficient in terms of space. This problem is crucial for understanding the relationship between deterministic and non-deterministic space complexity classes.
Space Complexity: Analyzing the amount of memory (space) an algorithm requires to solve a computational problem as a function of the input size.
Non-determinism: A computational model where an algorithm can explore multiple possible execution paths simultaneously, accepting if at least one path leads to acceptance.
Complement of a Complexity Class: For a given complexity class, its complement contains the complements of all languages in that class. For example, if a language \(L\) is in complexity class \(C\), then its complement, \(\overline{L}\), is in co-\(C\).
Immerman-Szelepcsényi Theorem: A fundamental theorem in complexity theory stating that non-deterministic space complexity classes are closed under complementation, i.e., for proper complexity functions \(f(n) \geq \log n\), \(\text{NSPACE}(f(n)) = \text{co-NSPACE}(f(n))\).
Non-deterministic Computation of a Function
When dealing with non-deterministic machines, we usually focus on decision problems where the output is either "yes" or "no." However, we can extend this concept to compute functions. In this case, the non-deterministic machine will have an input tape and an output tape. The input tape is read-only and contains the input, while the output tape is write-only and will contain the function’s output at the end of a computation.
A non-deterministic Turing machine \(M\) computes a function \(f : \Sigma^* \rightarrow \mathbb{N}\) if, for every input \(x \in \Sigma^*\), the following conditions hold:
Termination: Each computation branch of \(M\) on input \(x\) terminates in a final state.
Correct Output: If a computation branch terminates in the "yes" state with output \(m\) on the output tape, then \(m = f(x)\).
Irrelevant "no" Outputs: If a computation branch terminates in the "no" state, the content of the output tape is disregarded.
Existence of Correct Computation: There exists at least one computation branch that terminates in the "yes" state (and thus outputs \(f(x)\)).
Essentially, the machine must produce the correct output on at least one of its "yes" branches, while "no" branches do not provide meaningful output. The output is written on the output tape in binary. Note that this is different from the usual definition of non-deterministic machines, which only decide languages and do not compute functions.
Non-deterministic machines can compute functions by outputting a value on a dedicated output tape.
Only "yes" branches provide meaningful output, which must be the correct function value.
"No" branches are irrelevant in terms of output value.
At least one "yes" branch must exist for each input, and it must produce the correct output.
The Immerman-Szelepcsényi Theorem
The main result of this lecture is the Immerman-Szelepcsényi theorem, which provides an algorithm for counting the number of reachable nodes in a graph using logarithmic space.
Theorem 1 (Immerman-Szelepcsényi). The number of nodes reachable from a node \(U\) in a graph \(G\) can be computed non-deterministically in space \(O(\log n)\), where \(n\) is the number of nodes in \(G\).
This theorem has profound implications for our understanding of non-deterministic space complexity. Before delving into the algorithm, let’s first examine why a naive approach based on the standard non-deterministic reachability algorithm fails.
Challenges in Applying Non-deterministic Reachability
A naive approach to counting reachable nodes might involve adapting the non-deterministic reachability algorithm. However, this approach faces significant challenges.
Example 2. Consider a graph \(G\) with nodes \(\{1, 2, 3, 4, 5, 6, 7, 8\}\) and let \(U = 3\). Suppose we want to count the nodes reachable from \(U\).
The correct number of reachable nodes from node 3 is 6 (nodes 3, 5, 4, 6, 7, 8). If we try to reach each node non-deterministically and increment a counter, we might miss some nodes due to incorrect non-deterministic choices. For instance, we might reach node 5 and increment the counter, but another computation might fail to reach node 5, leaving the counter unchanged. This can lead to different computations terminating with different counts, making it difficult to determine the correct answer. In this specific example, one non-deterministic computation might terminate with the counter at a value less than 6, while the correct computation would terminate with the counter at 6. The problem is that we cannot distinguish between these computations without using more than logarithmic space.
The Immerman-Szelepcsényi Algorithm
The Immerman-Szelepcsényi algorithm overcomes these challenges by iteratively computing the number of nodes reachable in at most \(i\) steps, denoted by \(|S_i|\).
\(S_0 = \{U\}\), so \(|S_0| = 1\).
We compute \(|S_i|\) from \(|S_{i-1}|\) iteratively.
Let \(n = |V|\) be the number of nodes in the graph. The algorithm proceeds as follows:
\(S_0 \gets \{U\}\) \(|S_0| \gets 1\) \(L \gets 0\) \(M \gets 0\) \(reply \gets \text{false}\) Non-deterministically guess a path of length at most \(i-1\) from \(U\) to \(V\) \(M \gets M + 1\) \(reply \gets \text{true}\) Terminate with "no" \(L \gets L + 1\) \(|S_i| \gets L\) \(|S_{n-1}|\)
Explanation:
The algorithm iteratively computes \(|S_i|\), the number of nodes reachable from \(U\) in at most \(i\) steps. In each iteration, it checks every node \(X\) to see if it’s reachable in at most \(i\) steps. To do this, it iterates through all nodes \(V\) and non-deterministically checks if \(V\) is in \(S_{i-1}\). If a path of length at most \(i-1\) is found from \(U\) to \(V\), the counter \(M\) is incremented. Crucially, if at the end of the inner loop \(M\) is less than \(|S_{i-1}|\), it means that not all nodes in \(S_{i-1}\) were found, indicating an incorrect non-deterministic guess. In this case, the computation branch is terminated with "no".
If a node \(V\) is found to be in \(S_{i-1}\), the algorithm checks if \(X\) can be reached from \(V\) in one step (either \(V=X\) or there’s an edge from \(V\) to \(X\)). If so, \(reply\) is set to true, and \(L\) is incremented. At the end of the outer loop, \(L\) holds the correct value of \(|S_i|\).
Space Complexity Analysis
The algorithm uses the following variables:
\(i, L, X, V, M\): These are counters, each requiring at most \(O(\log n)\) space.
\(reply\): A boolean variable, requiring \(O(1)\) space.
\(|S_{i-1}|\): Storing the count of nodes in the previous iteration, requiring \(O(\log n)\) space.
Path from \(U\) to \(V\): Storing a path of length at most \(i-1\), which requires at most \(O(\log n)\) space since we only need to store the sequence of nodes.
Therefore, the total space used by the algorithm is \(O(\log n)\).
The algorithm cleverly uses non-determinism to verify the count of nodes reachable in the previous iteration (\(|S_{i-1}|\)).
By terminating incorrect branches (where \(M < |S_{i-1}|\)), the algorithm ensures that only the correct count of reachable nodes is propagated.
The space efficiency comes from only storing a constant number of variables, each requiring logarithmic space.
The algorithm demonstrates that non-determinism can be used effectively to solve problems in logarithmic space, even when a direct adaptation of a non-deterministic algorithm fails.
Corollary: Complement of Non-deterministic Space
A significant consequence of the Immerman-Szelepcsényi theorem is that non-deterministic space complexity classes are closed under complementation. This means that if a language is decidable by a non-deterministic machine within a certain space bound, then its complement is also decidable within the same space bound.
Corollary 3. If \(f\) is a proper complexity function with \(f(n) \geq \log n\), then co-NSPACE\((f(n)) =\) NSPACE\((f(n))\).
This corollary states that the class of problems that can be solved by a non-deterministic Turing machine using space \(f(n)\) is the same as the class of problems whose complements can be solved by a non-deterministic Turing machine using the same space bound.
Proof. Proof. To prove this, we need to show two inclusions:
co-NSPACE\((f(n)) \subseteq\) NSPACE\((f(n))\)
NSPACE\((f(n)) \subseteq\) co-NSPACE\((f(n))\)
The second inclusion follows from the first by symmetry, so we only need to prove the first.
Let \(L \in\) co-NSPACE\((f(n))\). Then \(\overline{L} \in\) NSPACE\((f(n))\). This means there exists a non-deterministic Turing machine \(N\) that decides \(\overline{L}\) using space \(f(n)\). We want to show that \(L\) is also in NSPACE\((f(n))\).
We know that \(X \in \overline{L}\) if and only if the initial configuration of \(N\) on input \(X\) can reach a "yes" configuration in the configuration graph \(G_{N,X}\). The size of \(G_{N,X}\) is bounded by \(O(c^{\log n + f(n)})\) for some constant \(c\), as there are at most \(c^{\log n + f(n)}\) possible configurations of \(N\) on input \(x\) of length \(n\).
Now, \(X \in L\) if and only if \(X \notin \overline{L}\), which means the initial configuration of \(N\) on input \(X\) cannot reach a "yes" configuration in \(G_{N,X}\).
We will construct a non-deterministic machine \(M\) that decides \(L\) in space \(f(n)\). \(M\) will use the Immerman-Szelepcsényi algorithm to count the number of configurations reachable from the initial configuration in \(G_{N,X}\).
\(M\) simulates the Immerman-Szelepcsényi algorithm on \(G_{N,X}\) with the initial configuration as the starting node.
During the simulation, whenever a "yes" configuration is reached, \(M\) rejects (because this means \(X \in \overline{L}\)).
If the algorithm finishes and the computed number of reachable configurations equals the total number of reachable configurations without reaching a "yes" configuration, \(M\) accepts.
The space used by \(M\) is the space used by the Immerman-Szelepcsényi algorithm, which is logarithmic in the size of the configuration graph. Thus, the space used is \(O(\log(c^{\log n + f(n)})) = O(\log n + f(n))\). Since \(f(n) \geq \log n\), this simplifies to \(O(f(n))\).
Therefore, \(M\) decides \(L\) in space \(O(f(n))\), so \(L \in\) NSPACE\((f(n))\). This implies co-NSPACE\((f(n)) \subseteq\) NSPACE\((f(n))\).
Since the other inclusion follows by symmetry, we conclude that co-NSPACE\((f(n)) =\) NSPACE\((f(n))\). ◻
The Immerman-Szelepcsényi theorem implies that non-deterministic space classes are closed under complementation.
This result highlights a significant difference between space and time complexity, as non-deterministic time classes are not known to be closed under complementation (it is widely believed that \(\text{NP} \neq \text{co-NP}\)).
The proof relies on the ability to count reachable configurations using the Immerman-Szelepcsényi algorithm and the fact that the configuration graph can be generated on-the-fly without explicitly storing it.
This result further reinforces the idea that non-determinism is less powerful in the context of space complexity compared to time complexity.
Conclusion
The Immerman-Szelepcsényi theorem is a powerful result in the realm of non-deterministic space complexity. By showing that the number of reachable nodes in a graph can be computed in logarithmic space non-deterministically, it establishes that the complement of a non-deterministic space complexity class is equal to the class itself. This highlights the relatively low importance of non-determinism in space complexity compared to time complexity. The algorithm’s ingenuity lies in its ability to leverage non-determinism to verify the completeness of its search for reachable nodes, ensuring that only the correct count leads to a "yes" output.
The Immerman-Szelepcsényi theorem provides a non-deterministic \(O(\log n)\) space algorithm for counting reachable nodes in a graph.
Non-deterministic space complexity classes are closed under complementation, i.e., for proper \(f(n) \geq \log n\), co-NSPACE\((f(n)) =\) NSPACE\((f(n))\).
Non-determinism has a relatively lower impact on space complexity compared to time complexity.
The configuration graph used in the proof of the corollary can be generated on-the-fly, requiring only logarithmic space.
Follow-up Questions:
Relation to Savitch’s Theorem: How does the Immerman-Szelepcsényi theorem relate to Savitch’s theorem?
- Detail missing in the lecture: Savitch’s theorem states that \(\text{NSPACE}(f(n)) \subseteq \text{DSPACE}(f(n)^2)\) for any proper complexity function \(f(n) \geq \log n\). While both theorems deal with non-deterministic space, Savitch’s theorem provides a deterministic simulation of non-deterministic space, whereas Immerman-Szelepcsényi shows that non-deterministic space is closed under complement. They provide different insights into the power of non-determinism in space-bounded computation.
Adaptability to Other Graph Problems: Can the Immerman-Szelepcsényi algorithm be adapted to solve other graph problems efficiently in terms of space?
- Detail missing in the lecture: The core idea of iteratively computing reachable sets and verifying the count non-deterministically might be applicable to other problems. However, the specific adaptation would depend on the problem’s structure and whether a similar inductive approach can be formulated. Further research is needed to explore these possibilities.
Implications for Deterministic vs. Non-deterministic Space: What are the implications of this theorem for the relationship between deterministic and non-deterministic space complexity classes?
- Detail missing in the lecture: The theorem implies that \(\text{NL} = \text{co-NL}\), which is a significant result. It suggests that non-determinism is less powerful in space complexity than in time complexity, where it is widely believed that \(\text{P} \neq \text{NP}\) and \(\text{NP} \neq \text{co-NP}\). However, it does not prove that \(\text{L} = \text{NL}\), which remains an open problem.
The Immerman-Szelepcsényi algorithm is a significant contribution to theoretical computer science, earning its authors the prestigious Gödel Prize in 1995. This recognition underscores the importance of their work, alongside other notable achievements like Shor’s algorithm for factorization (Gödel Prize in 2000) and the polynomial-time algorithm for primality testing (Gödel Prize in 2006). The Immerman-Szelepcsényi theorem not only provides a practical algorithm but also deepens our understanding of the fundamental properties of space complexity.